Fall 2024
Data Science Seminar: Model Selection
Lecturers: Mourad Khayati
Teaching language: English
Level: MSc students
Academic year: Fall 2024
Overview
The data science seminar involves presentations covering recent topics in data science. The area of this year’s seminar is model selection. In the scope of this seminar, we investigate papers that describe model selection algorithms and systems. The papers explore techniques to configure, compare, and select the best-performing model among a set of seed models. Those techniques are applied to solve various Machine Learning (ML) tasks, such as classification, forecasting, and anomaly detection.
Structure
The goal for the students is to learn how to critically read and study research papers, describe a paper in a report, and present it in a seminar. Under supervision, students will select one paper to study and compare it with related work. This seminar aims to help students gather in-depth knowledge of an advanced topic and develop the skills required to describe a complex problem from the time series field in the form of a presentation, a written report, and an empirical evaluation.
IMPORTANT NOTE: The papers will be distributed on a first-come, first-served basis.
Evaluation and Expectations
The final grade depends on the quality of the report, presentation, reproducibility experiments, and active participation during the seminar. Each participant prepares a self-contained report of at least 6 pages and gives a presentation of 30 minutes. The report should describe the proposed technique in detail. It may include a small running example and counterexample(s), and it should explore the extreme cases where the evaluated systems and algorithms would perform best and worst. The reproducibility part consists of rerunning the set of experiments introduced in the paper using a different setup (dataset, metric, parameters, etc.).
Advice on how to:
IMPORTANT NOTE: Attendance is mandatory for the two seminar sessions. The total number of participants will be limited to 10.
Schedule
Kickoff Meeting. Date: Tue, 24.09.2024, 14:15-16:00, room: D230
Setup and organization of the seminar and paper assignment
----------------------------------------------------------------------
Date: Tue, 12.11.2024
Report deadline of Batch1
Date: Tue, 19.11.2024, all day, room: C433
Office meeting with students from Batch1
First Seminar Session. Date: Tue, 26.11.2024, 14:15-18:00, room: A303
Presentations of Batch1
----------------------------------------------------------------------
Date: Tue, 27.11.2024
Report deadline of Batch2
Date: Tue, 03.12.2024, all day, room: C433
Office meeting with students from Batch2
Second Seminar Session. Date: Tue, 10.12.2024, 14:15-18:00, room: A303
Presentations of Batch2
----------------------------------------------------------------------
Date: Tue, 14.01.2025
Final report deadline of Batch1 and Batch2
Paper Assignment
The papers will be distributed on a first-come, first-served basis. To select one paper from the list of papers, please use the following link.
Paper & code | Presentation Date | Presenter |
TSC-AutoML: Meta-learning for Automatic Time Series Classification Algorithm Selection, ICDE 2023 | 26.11.2024 | Artthik Sellathurai |
Choose Wisely: An Extensive Evaluation of Model Selection for Anomaly Detection in Time Series, PVLDB 2023 | 26.11.2024 | Thomas Sutter |
AutoForecast: Automatic Time-Series Forecasting Model Selection, CIKM 2022 | 26.11.2024 | Natallia Patashkevich |
Raha: A Configuration-Free Error Detection System, SIGMOD 2019 | 26.11.2024 | Prosper Ukoma Chima |
SimpleTS: An Efficient and Universal Model Selection Framework for Time Series Forecasting, PVLDB 2023 | 10.12.2024 | Matej Kutirov |
Active Model Selection for Positive Unlabeled Time Series Classification, ICDE 2020 | 10.12.2024 | Christine Groux |
AutoAI-TS: AutoAI for Time Series Forecasting, SIGMOD 2021 | 10.12.2024 | Mattias Dürrmeier |
ShrinkHPO: Towards Explainable Parallel Hyperparameter Optimization, ICDE 2024 | 10.12.2024 | Nicholas Kaegi |
AutoML in heavily constrained applications, VLDB Journal 2024 | - | - |
Database Native Model Selection: Harnessing Deep Neural Networks in Database Systems, PVLDB 2024 | - | - |