RecovDB

RecovDB is a relational database system extended with matrix decomposition techniques for recovering missing blocks in time series.

Overview

RecovDB shows how to extend a relational database to support the recovery of missing blocks in large time series data. Our approach represents the input time series as a relation and maps it into loading and relevance vectors that best account for the correlation among the series. The loading vectors (L) expose the rank of the matrix, which is then used to accurately recover the missing blocks. The mapping is computed efficiently using our memory-efficient Centroid Decomposition (CD) technique. The recovery algorithm is tightly integrated into the open-source analytical RDBMS MonetDB as native User Defined Functions (UDFs). RecovDB offers the following salient features:

  • Parameter-free and correlation-aware recovery
  • Recovery of large missing blocks in multiple time series
  • Full-fledged DBMS support
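
The decomposition-based recovery described in the overview can be illustrated with a minimal Python sketch. It is not the RecovDB implementation: the function names (sign_vector, centroid_decomposition, recover) are hypothetical, the sign vector is found with a simple local search rather than the memory-efficient procedure RecovDB uses, the missing values are initialized by linear interpolation (an assumption), and the truncation rank k is passed explicitly for simplicity even though RecovDB itself is described as parameter-free. Each column of X is one time series, missing values are NaN, and every series is assumed to have at least one observed value.

```python
import numpy as np

def sign_vector(X, max_passes=100):
    """Local search for z in {-1, +1}^n that locally maximizes ||X^T z||."""
    z = np.ones(X.shape[0])
    s = X.T @ z                      # s = sum_i z_i * x_i (x_i = row i of X)
    for _ in range(max_passes):
        improved = False
        for i in range(X.shape[0]):
            # Flipping z_i changes ||s||^2 by 4 * (||x_i||^2 - z_i * <x_i, s>).
            gain = 4.0 * (X[i] @ X[i] - z[i] * (X[i] @ s))
            if gain > 1e-12:
                s -= 2.0 * z[i] * X[i]
                z[i] = -z[i]
                improved = True
        if not improved:
            break
    return z

def centroid_decomposition(X, k):
    """Return L (n x k) and R (m x k) such that L @ R.T approximates X."""
    X = np.array(X, dtype=float)     # work on a copy
    n, m = X.shape
    L = np.zeros((n, k))
    R = np.zeros((m, k))
    for j in range(k):
        z = sign_vector(X)
        c = X.T @ z                  # centroid vector
        norm = np.linalg.norm(c)
        if norm == 0.0:              # nothing left to decompose
            break
        R[:, j] = c / norm           # relevance vector
        L[:, j] = X @ R[:, j]        # loading vector
        X -= np.outer(L[:, j], R[:, j])
    return L, R

def recover(X, k=1, tol=1e-6, max_iter=100):
    """Iteratively replace NaN entries with their rank-k CD approximation."""
    X = np.array(X, dtype=float)
    missing = np.isnan(X)
    t = np.arange(X.shape[0])
    # Assumption: start from a per-series linear interpolation of the gaps.
    for j in range(X.shape[1]):
        gap = missing[:, j]
        if gap.any():
            X[gap, j] = np.interp(t[gap], t[~gap], X[~gap, j])
    for _ in range(max_iter):
        L, R = centroid_decomposition(X, k)
        approx = L @ R.T
        delta = np.linalg.norm(approx[missing] - X[missing])
        X[missing] = approx[missing]  # observed values are never overwritten
        if delta < tol:
            break
    return X
```

In this sketch, calling recover(X, k=1) on a matrix whose columns are strongly correlated series fills each missing block using the other series, which is the correlation-aware behaviour listed above.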

Code:

The source code is available on GitHub. It contains compilation and installation instructions as well as sample recovery queries written as SQL Python UDFs.

Graphical UI:

Description:

RecovDB is also available as a GUI through the ReVival tool. The latter allows users to perform batch and online recovery of missing blocks on real-world time series data. Users can select one or multiple time series from a particular dataset, drop a percentage of data from the selected time series, and then recover these missing values.
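
The drop-then-recover workflow described above can be mimicked outside the GUI with a small Python sketch. The helper names (drop_block, interpolate, rmse_on_block) are hypothetical and not part of ReVival or RecovDB. The sketch assumes the dropped data form one contiguous block, and it uses plain linear interpolation only as a stand-in for the recovery step, so that the example stays self-contained.

```python
import numpy as np

def drop_block(series, fraction, start):
    """Return a copy of `series` with a contiguous block set to NaN.

    The block covers round(fraction * len(series)) values beginning at `start`.
    """
    out = np.array(series, dtype=float)
    length = int(round(fraction * out.size))
    if length < 1 or start < 0 or start + length > out.size:
        raise ValueError("block does not fit inside the series")
    out[start:start + length] = np.nan
    return out

def interpolate(damaged):
    """Stand-in recovery (NOT RecovDB's algorithm): linear interpolation."""
    out = np.array(damaged, dtype=float)
    gap = np.isnan(out)
    t = np.arange(out.size)
    out[gap] = np.interp(t[gap], t[~gap], out[~gap])
    return out

def rmse_on_block(truth, recovered, damaged):
    """Root-mean-square error measured only on the dropped positions."""
    gap = np.isnan(np.asarray(damaged, dtype=float))
    diff = np.asarray(truth, dtype=float)[gap] - np.asarray(recovered, dtype=float)[gap]
    return float(np.sqrt(np.mean(diff ** 2)))

# Usage: drop 30% of a series, recover it, and score the recovered block.
series = np.linspace(0.0, 9.0, 10)
damaged = drop_block(series, fraction=0.3, start=4)
recovered = interpolate(damaged)
error = rmse_on_block(series, recovered, damaged)
```

Scoring only the dropped positions keeps the error from being diluted by values that were never missing.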

Example:

Input: Three water discharge time series, each of which has a missing block.

Query: Recover all the incomplete time series in one pass.

Result: The following figure illustrates the result of the recovery. The missing values are shown as dashed lines, while the recovered blocks are shown as red dashed lines.

Research papers:

Talks:

  • Efficient and Accurate Time Series Imputation using Correlation, at CWI 2018 (Amsterdam, the Netherlands).

Research Team: