Fall 2025
Data Science Seminar: Time Series Imputation
Lecturer: Mourad Khayati
Teaching language: English
Level: MSc students
Academic year: Fall 2025
Overview
The data science seminar involves presentations covering recent topics in data science. The area of this year’s seminar is time series imputation. As part of the seminar, we will study research papers that propose algorithms for imputing missing values. These papers present methods for reconstructing incomplete sensor data by applying various replacement strategies to estimate missing segments.
Imputation offers benefits on two levels. At the data processing level, a completed time series can be used in a wide range of Machine Learning (ML) tasks, such as classification and forecasting. At the data management level, properly imputed time series can be stored and maintained more effectively, which is one reason why many Time Series Database Systems (TSDBs) have begun to incorporate native support for missing value imputation.
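To make the notion of imputation concrete, the following minimal sketch fills the gaps of a short series by linear interpolation, one of the simplest replacement strategies. It is an illustration only and is not taken from any of the seminar papers; the function name impute_linear and the sample readings are assumptions made for this example.

```python
import numpy as np

def impute_linear(series):
    """Fill NaN gaps by linear interpolation between observed neighbours.

    Hypothetical helper for illustration. Gaps at the start or end of the
    series are filled with the nearest observed value (np.interp behaviour).
    """
    values = np.asarray(series, dtype=float)
    missing = np.isnan(values)
    if missing.all():
        raise ValueError("cannot impute a series with no observed values")
    idx = np.arange(len(values))
    filled = values.copy()
    # Interpolate only at the missing positions; observed values stay untouched.
    filled[missing] = np.interp(idx[missing], idx[~missing], values[~missing])
    return filled

# Made-up sensor readings with two gaps (NaN marks a missing value).
readings = [1.0, np.nan, np.nan, 4.0, 5.0, np.nan]
completed = impute_linear(readings)
print(completed)
```

The algorithms studied in the seminar are considerably more elaborate, but they follow the same contract: an incomplete series goes in and a completed series of the same length comes out.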
Structure
The goal for the students is to learn how to critically read and study research papers, describe a paper in a report, and present it in a seminar. Under supervision, students will select one paper to study and compare it with related work. This seminar aims to help students gather in-depth knowledge of an advanced topic and develop the skills required to describe a complex problem from the time series field in the form of both a presentation and a written report.
Students will also learn how to integrate an algorithm into a library, benchmark it against other algorithms, and perform the required unit tests. For this purpose, we will use ImputeGAP, a comprehensive library designed for time series imputation analysis.
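The benchmarking and testing workflow can be sketched independently of any particular library. The example below is a minimal sketch that does not use the ImputeGAP API: it hides a block of known values, runs two simple baseline imputers, scores them with RMSE on the hidden block, and ends with one unit-test-style check. All function names, the synthetic sine series, and the position of the missing block are assumptions made for illustration.

```python
import numpy as np

def impute_mean(values):
    """Baseline: replace every NaN with the mean of the observed values."""
    filled = values.copy()
    filled[np.isnan(filled)] = np.nanmean(values)
    return filled

def impute_locf(values):
    """Baseline: last observation carried forward (a leading NaN stays NaN)."""
    filled = values.copy()
    for i in range(1, len(filled)):
        if np.isnan(filled[i]):
            filled[i] = filled[i - 1]
    return filled

def rmse_on_mask(truth, imputed, mask):
    """Root mean squared error, computed only on the artificially hidden values."""
    return float(np.sqrt(np.mean((truth[mask] - imputed[mask]) ** 2)))

def benchmark(truth, mask, algorithms):
    """Hide truth[mask], run each algorithm, and return its RMSE by name."""
    contaminated = truth.copy()
    contaminated[mask] = np.nan
    return {name: rmse_on_mask(truth, algo(contaminated), mask)
            for name, algo in algorithms.items()}

# Synthetic ground truth: one period of a sine wave with one missing block.
truth = np.sin(np.linspace(0.0, 2.0 * np.pi, 50))
mask = np.zeros(50, dtype=bool)
mask[20:25] = True

scores = benchmark(truth, mask, {"mean": impute_mean, "locf": impute_locf})
for name, score in scores.items():
    print(f"{name}: RMSE = {score:.3f}")

# Unit-test-style check: an imputer must not alter the observed values.
contaminated = truth.copy()
contaminated[mask] = np.nan
assert np.array_equal(impute_locf(contaminated)[~mask], truth[~mask])
```

Scoring only the hidden positions keeps the comparison fair, because every algorithm is judged on exactly the values it had to reconstruct.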
IMPORTANT NOTE: The papers will be distributed on a first-come, first-served basis.
Evaluation and Expectations
The final grade depends on the quality of the report, the presentation, and the integration, as well as on active participation during the seminar. Each participant prepares a self-contained report of at least five pages and gives a 30-minute presentation. The report should describe the proposed algorithm in detail. It may include a small running example and counterexample(s), and it should explore the extreme cases in which the studied algorithm performs best and worst. The reproducibility part consists of rerunning the set of experiments introduced in the paper under a different setup (dataset, metric, parameters, etc.).
Advice on how to:
IMPORTANT NOTE: Attendance at the two seminar sessions is mandatory. The total number of participants will be limited to 10.
Schedule
Kickoff Meeting. Date: Tue, 23.09.2025, 14:15-16:00, room: TBD
Organization of the seminar and paper assignment
----------------------------------------------------------------------
Date: Tue, 11.11.2025
Report deadline of Batch 1
Date: Tue, 18.11.2025, all day, room: TBD
Office meeting with students from Batch 1
First Presentation Session. Date: Tue, 25.11.2025, 14:15-18:00, room: TBD
Presentations of Batch 1
----------------------------------------------------------------------
Date: Tue, 26.11.2025
Report deadline of Batch 2
Date: Tue, 02.12.2025, all day, room: TBD
Office meeting with students from Batch 2
Second Presentation Session. Date: Tue, 09.12.2025, 14:15-18:00, room: TBD
Presentations of Batch 2
----------------------------------------------------------------------
Date: Tue, 13.01.2026
Deadline for Final Report
Paper Assignment
The papers will be distributed on a first-come, first-served basis. To select a paper from the list, please use the following link.
Paper & code | Presentation Date | Presenter |