Fall 2020

Data Science Seminar

Lecturers: Mourad Khayati, Rana Hussein

Teaching language: English

Level: MSc students

Academic year: Fall 2020

Overview

Structure

Evaluation and Expectations

Schedule

List of Papers


Overview

The data science seminar consists of presentations covering recent topics in data science. The area of this year’s seminar is time series. In this seminar, we investigate papers that describe algorithms and techniques for performing analytics on time series data.


Structure

The goal is for students to learn how to critically read and study research papers, how to describe a paper in a report, and how to present it in a seminar. Under supervision, each student selects one paper to study and compare with related work. The seminar aims to help students gain in-depth knowledge of an advanced topic and develop the skills required to describe a complex problem from the time series field through a presentation, a written report, and an empirical evaluation.

IMPORTANT NOTE: The papers will be distributed on a first-come, first-served basis.


Evaluation and Expectations

The final grade depends on the quality of the report, the presentation, and the reproducibility experiments, and on active participation during the seminar. Each participant prepares a self-contained report of at least 6 pages and gives a 30-minute presentation. The report should describe the proposed technique(s) in detail. It may contain a small running example and counterexample(s), if any, and should explore the extreme cases in which the proposed approach performs best and worst. The reproducibility experiments consist of repeating the experiments introduced in the paper on different datasets.

IMPORTANT NOTE: Attendance is mandatory for the two-class seminar sessions. The total number of participants will be limited to 10.


Schedule

Kickoff Meeting (Onsite). Date: Tue, 22.09.2020, 10:15-12:00, room: F207

Setup and organization of the seminar, and paper assignment

----------------------------------------------------------------------

Date: Tue, 17.11.2020

Report deadline for Batch1

Date: Tue, 24.11.2020, all day, room: TBD

Office meeting with students from Batch1

First Seminar Session. Date: Tue, 01.12.2020, 10:15-12:15, room: TBD

Presentations of Batch1

----------------------------------------------------------------------

Date: Tue, 12.01.2021
Final report deadline for Batch1 and Batch2


Paper Assignment

The papers will be distributed on a first-come, first-served basis. Please use the following link to select one paper from the list of papers.

k-Shape: Efficient and Accurate Clustering of Time Series. SIGMOD 2015. Code: https://github.com/johnpaparrizos/kshape
Presentation date: 01.12.2020. Presenter: Guillaume Chacun. First report deadline: 24.11.2020.

Neighbor Profile: Bagging Nearest Neighbors for Unsupervised Time Series Mining. ICDE 2020. Code: https://sites.google.com/view/neighbor-profile
Presentation date: 01.12.2020. Presenter: Raphaël Margueron. First report deadline: 24.11.2020.

Learning Individual Models for Imputation. ICDE 2019. Code: https://github.com/zaqthss/icde19-iim
Presentation date: 01.12.2020. Presenter: Zakhar Tymchenko. First report deadline: 24.11.2020.

Data Series Progressive Similarity Search with Probabilistic Quality Guarantees. SIGMOD 2020. Code: http://helios.mi.parisdescartes.fr/~themisp/progrss/

Massively-Parallel Change Detection for Satellite Time Series Data with Missing Values. ICDE 2020. Code: https://github.com/gieseke/bfast

Efficient Learning Interpretable Shapelets for Accurate Time Series Classification. ICDE 2018. Code: https://github.com/House1993/ELIS

Scalable, Variable-Length Similarity Search in Data Series: The ULISSE Approach. VLDB 2018. Code: http://helios.mi.parisdescartes.fr/~mlinardi/ULISSE.html

The Lernaean Hydra of Data Series Similarity Search. VLDB 2019. Code: http://helios.mi.parisdescartes.fr/~themisp/dsseval/