Fall 2025
Data Science Seminar: Time Series Imputation
Lecturer: Mourad Khayati
Teaching assistant: Quentin Nater
Level: Masters
Academic year: Fall 2025
Overview
The data science seminar involves presentations covering recent topics in data science. The area of this year’s seminar is time series imputation. As part of the seminar, we will study research papers that propose algorithms for imputing missing values. These papers present methods for reconstructing incomplete sensor data by applying various replacement strategies to estimate missing segments.
Imputation offers benefits on two levels. At the data processing level, the completed time series can be used in a wide range of Machine Learning (ML) tasks, such as classification and forecasting. At the data management level, properly imputed time series can be stored and maintained more effectively, which is one reason why many Time Series Database Systems (TSDBs) have begun to incorporate native support for missing value imputation.
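As a minimal illustration of what "replacing missing segments" means, the sketch below compares two simple baseline strategies on a toy series: filling with the mean of the observed values, and interpolating linearly between the observed neighbors of each gap. The function names and the toy series are assumptions made for this example; they are not taken from any of the seminar papers or from a specific library.

```python
import math


def mean_impute(series):
    """Replace each missing value (NaN) with the mean of the observed values."""
    observed = [x for x in series if not math.isnan(x)]
    if not observed:
        raise ValueError("series has no observed values")
    mean = sum(observed) / len(observed)
    return [mean if math.isnan(x) else x for x in series]


def linear_impute(series):
    """Fill each run of NaNs by linear interpolation between its observed neighbors.

    Gaps at the very start or end have only one neighbor, so they are
    filled with the nearest observed value.
    """
    result = list(series)
    observed_idx = [i for i, x in enumerate(result) if not math.isnan(x)]
    if not observed_idx:
        raise ValueError("series has no observed values")

    first, last = observed_idx[0], observed_idx[-1]
    for i in range(first):                    # leading gap
        result[i] = result[first]
    for i in range(last + 1, len(result)):    # trailing gap
        result[i] = result[last]

    # Interior gaps: walk over consecutive pairs of observed positions.
    for left, right in zip(observed_idx, observed_idx[1:]):
        step = (result[right] - result[left]) / (right - left)
        for i in range(left + 1, right):
            result[i] = result[left] + step * (i - left)
    return result


series = [1.0, 2.0, math.nan, math.nan, 5.0, 6.0]
print(mean_impute(series))    # [1.0, 2.0, 3.5, 3.5, 5.0, 6.0]
print(linear_impute(series))  # [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
```

On this toy series the linear strategy recovers the underlying trend, whereas the mean strategy ignores the temporal order. The algorithms studied in the seminar aim to handle the cases where such simple baselines break down.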
Structure
The goal for the students is to learn how to critically read and study a research paper, describe it in a report, and present it in front of an audience. Under supervision, students will select one paper to study and compare it with related work. This seminar aims to help students gather in-depth knowledge of an advanced topic and develop the skills required to describe a complex problem from the time series field in the form of both a presentation and a written report.
Students will also learn how to integrate an algorithm into a library, benchmark it against other algorithms, and perform the required unit tests. For this purpose, we will use ImputeGAP, a comprehensive library designed for time series imputation analysis.
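The kind of unit tests expected for an integrated algorithm can be sketched as follows. This is only an illustration under stated assumptions: `locf_impute` (last observation carried forward) is a stand-in for the algorithm of the assigned paper, and the list-in, list-out function signature is hypothetical. It is not the ImputeGAP API, whose actual interfaces should be taken from the library's documentation.

```python
import math


def locf_impute(series):
    """Stand-in algorithm: last observation carried forward (LOCF).

    Leading NaNs have no previous observation, so they are filled with
    the first observed value.
    """
    observed = [x for x in series if not math.isnan(x)]
    if not observed:
        raise ValueError("series has no observed values")
    last = observed[0]
    result = []
    for x in series:
        if not math.isnan(x):
            last = x
        result.append(last)
    return result


SERIES = [math.nan, 1.0, math.nan, math.nan, 4.0, math.nan]


def test_length_is_preserved():
    assert len(locf_impute(SERIES)) == len(SERIES)


def test_no_missing_values_remain():
    assert not any(math.isnan(x) for x in locf_impute(SERIES))


def test_observed_values_are_unchanged():
    imputed = locf_impute(SERIES)
    for original, filled in zip(SERIES, imputed):
        if not math.isnan(original):
            assert filled == original


def test_known_output():
    assert locf_impute(SERIES) == [1.0, 1.0, 1.0, 1.0, 4.0, 4.0]


def test_all_missing_is_rejected():
    try:
        locf_impute([math.nan, math.nan])
    except ValueError:
        return
    raise AssertionError("expected ValueError for an all-missing series")


# Run the checks directly; a test runner such as pytest would also
# collect the functions above by their `test_` prefix.
for check in (
    test_length_is_preserved,
    test_no_missing_values_remain,
    test_observed_values_are_unchanged,
    test_known_output,
    test_all_missing_is_rejected,
):
    check()
print("all checks passed")
```

The first three tests state properties that any imputation algorithm should satisfy, so they can be reused unchanged when the stand-in is swapped for the studied algorithm. Only the known-output test is specific to LOCF.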
IMPORTANT NOTE: The papers will be distributed on a first-come, first-served basis.
Evaluation and Expectations
The final grade depends on the quality of the report, the presentation, and the integration, as well as on active participation during the seminar. Each participant prepares a self-contained report of at least five pages and gives a 30-minute presentation. The report should describe the proposed algorithm in detail. It may contain a small running example and counterexample(s), and it should explore the extreme cases in which the studied algorithm would perform best and worst. The reproducibility task consists of reproducing the set of experiments introduced in the paper using a different setup (dataset, metric, parameters, etc.).
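One way to organize a reproducibility experiment with a different setup is a small grid over missing rates and error metrics, sketched below. Everything here is an assumption made for illustration: the synthetic sine series stands in for a real dataset, the two baselines stand in for the studied algorithm and its competitors, and the helper names (`contaminate`, `run_grid`) are hypothetical and not part of ImputeGAP.

```python
import math
import random


def mean_impute(series):
    """Baseline: replace NaNs with the mean of the observed values."""
    observed = [x for x in series if not math.isnan(x)]
    mean = sum(observed) / len(observed)
    return [mean if math.isnan(x) else x for x in series]


def locf_impute(series):
    """Baseline: last observation carried forward."""
    observed = [x for x in series if not math.isnan(x)]
    last = observed[0]
    result = []
    for x in series:
        if not math.isnan(x):
            last = x
        result.append(last)
    return result


def contaminate(series, rate, seed):
    """Hide a fraction `rate` of the values at random positions.

    Returns the contaminated series and the sorted hidden indices.
    """
    rng = random.Random(seed)
    n_hidden = max(1, round(rate * len(series)))
    hidden = set(rng.sample(range(len(series)), n_hidden))
    contaminated = [math.nan if i in hidden else x for i, x in enumerate(series)]
    return contaminated, sorted(hidden)


def rmse(truth, imputed, idx):
    """Root mean squared error, computed on the hidden positions only."""
    return math.sqrt(sum((truth[i] - imputed[i]) ** 2 for i in idx) / len(idx))


def mae(truth, imputed, idx):
    """Mean absolute error, computed on the hidden positions only."""
    return sum(abs(truth[i] - imputed[i]) for i in idx) / len(idx)


def run_grid(series, algorithms, rates, metrics, seed=0):
    """Evaluate every algorithm at every missing rate with every metric."""
    results = {}
    for rate in rates:
        contaminated, hidden = contaminate(series, rate, seed)
        for algo_name, algo in algorithms.items():
            imputed = algo(contaminated)
            for metric_name, metric in metrics.items():
                results[(algo_name, rate, metric_name)] = metric(series, imputed, hidden)
    return results


# Synthetic ground truth: a sine wave with a period of 50 time steps.
truth = [math.sin(2 * math.pi * t / 50) for t in range(200)]
algorithms = {"mean": mean_impute, "locf": locf_impute}
metrics = {"rmse": rmse, "mae": mae}

results = run_grid(truth, algorithms, rates=[0.1, 0.2, 0.4], metrics=metrics)
for (algo_name, rate, metric_name), value in sorted(results.items()):
    print(f"{algo_name:5s} rate={rate:.1f} {metric_name}={value:.4f}")
```

Errors are measured only on the hidden positions, since the observed values are left unchanged by both baselines. Fixing the seed makes each contamination pattern repeatable, so every algorithm is evaluated on exactly the same missing positions.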
Advice on how to:
IMPORTANT NOTE: Attendance is mandatory for the two in-class seminar sessions. The total number of participants will be limited to 16.
Schedule
Kickoff Meeting. Date: Tue, 23.09.2025, 14:15-16:00, room: E230
Organization of the seminar and paper assignment
----------------------------------------------------------------------
Date: Tue, 11.11.2025
Report deadline of Batch 1
Date: Tue, 18.11.2025, all day, room: TBD
Office meeting with students from Batch 1
First Presentation Session. Date: Tue, 25.11.2025, 14:15-18:00, room: TBD
Presentations of Batch 1
----------------------------------------------------------------------
Date: Tue, 26.11.2025
Report deadline of Batch 2
Date: Tue, 02.12.2025, all day, room: TBD
Office meeting with students from Batch 2
Second Presentation Session. Date: Tue, 09.12.2025, 14:15-18:00, room: TBD
Presentations of Batch 2
----------------------------------------------------------------------
Date: Tue, 13.01.2026
Deadline for Final Report
Paper Assignment
The papers will be distributed on a first-come, first-served basis. To select a paper from the list below, please use the following link.
Paper | Presentation Date | Presenter
SAITS: Self-attention-based imputation for time series, ESWA'23 | |
CSDI: Conditional Score-based Diffusion Models for Probabilistic Time Series Imputation, NeurIPS'21 | |
ReCTSi: Resource-efficient Correlated Time Series Imputation via Decoupled Pattern Learning and Completeness-aware Attentions, KDD'24 | |
SSD-TS: Exploring the potential of linear state space models for diffusion models in time series imputation, KDD'25 | |
DIM-SUM: Dynamic IMputation for Smart Utility Management, VLDB'25 | |
GP-VAE: Deep probabilistic time series imputation, AISTATS'20 | |
Networked Time Series Imputation via Position-aware Graph Enhanced Variational Autoencoders, KDD'23 | |
TimesNet: Temporal 2D-Variation Modeling for General Time Series Analysis, ICLR'23 | |
Mining of Switching Sparse Networks for Missing Value Imputation in Multivariate Time Series, KDD'24 | |
Missing Value Imputation for Multi-attribute Sensor Data Streams via Message Propagation, VLDB'24 | |
Missing Value Imputation on Multidimensional Time Series, VLDB'21 | |
PriSTI: A Conditional Diffusion Framework for Spatiotemporal Imputation, ICDE'23 | |
Filling the G_ap_s: Multivariate Time Series Imputation by Graph Neural Networks, ICLR'22 | |
BayOTIDE: Bayesian Online Multivariate Time Series Imputation with Functional Decomposition, ICML'24 | |
Biased Temporal Convolution Graph Network for Time Series Forecasting with Missing Values, ICLR'24 | |
NuwaTS: a Foundation Model Mending Every Incomplete Time Series, arXiv'24 | |
Papers marked in blue require additional compatibility adaptation and are exempt from the reproducibility task.