Check out all news in the archive


July 2024 – Full paper accepted at MICRO 2024

June 2024 – Two papers presented at SDS2024 and VLDB Demo

  • Two short papers presented at SDS2024: Cleaning Semi-Structured Errors in Open Data Using Large Language Models (preprint pdf) and Leveraging Pre-Trained Extreme Multi-Label Classifiers for Zero-Shot Learning (preprint pdf)
  • and SEER: An End-to-End Toolkit for Benchmarking Time Series Database Systems in Monitoring Applications accepted at VLDB 2024 (pdf coming soon)

May 2024 – One Paper Presented at WWW and One Vision Paper Accepted at VLDB

April 2024 – Two articles published in TKDE

Two new research articles published in Transactions on Knowledge and Data Engineering:

  • Schema-Aware Hyper-Relational Knowledge Graph Embeddings for Link Prediction (preprint pdf)
  • Fast and Slow Thinking: A Two-Step Schema-Aware Approach for Instance Completion in Knowledge Graphs (preprint pdf)

October 2023 – CIDR Paper Accepted

August 2023 – VLDB Paper Accepted

May 2023 – VLDB Tutorial Accepted

April 2023 – WWW 2023 Paper

March 2023 – SIGMOD Paper Accepted

December 2022 – 3 Papers Presented at IEEE Big Data

September 2022 – Two TKDE Papers

July 2022 – DiBB Distributed Black-Box Optimization Framework

  • Our groundbreaking Distributed Black-Box Optimization framework, DiBB, was presented at GECCO 2022 in Boston! [PDF] [github]

May 2022 – VLDB

  • DBMS Annihilator: A High-Performance Database Workload Generator in Action, demo paper accepted at the Forty-Eighth International Conference on Very Large Data Bases [VLDB 2022]

Full pdfs (& more) soon on

March 2022 – SIGMOD

  • X-SSD: A Storage System with Native Support for Database Logging and Replication, research paper accepted at the ACM SIGMOD/PODS International Conference on Management of Data [SIGMOD 2022] [PDF]

December 2021 – CIDR

  • D-RDMA: Bringing Zero-Copy RDMA to Database Systems, research paper accepted at the Twelfth Conference on Innovative Data Systems Research [CIDR 2022] [PDF]

April 2021 – VLDB and SIGMOD

  • In-Network Support for Transaction Triaging, research paper accepted at the Forty-Seventh International Conference on Very Large Data Bases [VLDB 2021], and
  • Not Your Grandpa’s SSD: The Era of Co-Designed Storage Devices tutorial accepted at the ACM SIGMOD/PODS International Conference on Management of Data [SIGMOD 2021]

Full pdfs (& more) soon on

January 2021 – three full papers accepted at WWW 2021!

Three full research papers accepted at TheWebConf (WWW2021)! (acceptance 20%)

  1. Peer Grading the Peer Reviews: A Dual-Role Approach for Lightening Scholarly Paper Review Processes,
  2. RETA: A Schema-Aware, End-to-End Solution for Instance Completion in Knowledge Graphs, and
  3. Wiki2Prop: A Multi-Modal Approach for Predicting Wikidata Properties from Wikipedia.

Full pdfs (& more) soon on

December 2020 – VLDB & AAAI

  • ORBITS: Online Recovery of Missing Values in Multiple Time Series Streams [pdf] accepted at the Forty-Seventh International Conference on Very Large Data Bases [VLDB 2021]
  • and MARTA: Integrating Human Rationales for Explainable Text Classification [pdf] accepted at the Thirty-Fifth AAAI Conference on Artificial Intelligence [AAAI 2021] (acceptance: 21%).

October 2020 – BigData & CIKM

  • The Best of Both Worlds: Context-Powered Word Embedding Combinations for Longitudinal Text Analysis [pdf] as well as Hydra: Cancer Detection Leveraging Multiple Heads and Heterogeneous Datasets [pdf] accepted at IEEE BigData 2020
  • and the ACM International Conference on Information and Knowledge Management [CIKM 2020] is taking place for the very first time online! (Prof. Cudré-Mauroux is PC Chair of the conference this year).

September 2020 – Most Reproducible Paper Award

We are honored to have been granted the Most Reproducible Paper Award at the VLDB 2020 Conference. Congratulations Mourad Khayati, Alberto Lerner, Zakhar Tymchenko, and Philippe Cudré-Mauroux. Our paper is available here.

June 2020 – Best Paper Award

We won the Best Paper Award at the Swiss Data Science Conference (SDS 2020)! Thumbs up to our master's student Yasamin Eslahi, my two Ph.D. students Paolo Rosso and Akansha Bhardwaj, and our great collaborator Kurt Stockinger. Our paper is available here.

May 2020 – Paper published in TKDE

LBSN2Vec++: Heterogeneous Hypergraph Embedding for Location-Based Social Networks [PDF] published in IEEE Transactions on Knowledge and Data Engineering (TKDE).

April 2020 – WWW, SDS & IJCAI

  • Beyond Triplets: Hyper-Relational Knowledge Graph Embedding for Link Prediction [PDF] [slides] and OpenCrowd: A Human-AI Collaborative Approach for Finding Social Influencers via Open-Ended Answers Aggregation [PDF] [video] presented at WWW 2020;
  • Annotating Web Tables through Knowledge Bases: A Context-Based Approach accepted at SDS 2020, and Location Prediction over Sparse User Mobility Traces using RNNs accepted at IJCAI 2020 (acceptance: 12%). Congrats everyone!

March 2020 – IEEE Data Engineering Bulletin

Networking and Storage: The Next Computing Elements in Exascale Systems? [PDF] published in the March 2020 Data Management at Exascale issue of the IEEE Data Engineering Bulletin.

February 2020 – AAAI 2020

A Human-AI Loop Approach for Joint Keyword Discovery and Expectation Estimation in Micropost Event Detection [preprint pdf] presented at AAAI 2020 in New York.

January 2020 – The Web Conf. and CIDR

Two full research papers accepted at The Web Conf. (WWW2020) (acceptance: 19%):

  • OpenCrowd: A Human-AI Collaborative Approach for Finding Social Influencers via Open-Ended Answers Aggregation [preprint pdf] and
  • Beyond Triplets: Hyper-Relational Knowledge Graph Embedding for Link Prediction [preprint pdf].

And It Takes Two: Instrumenting the Interaction between In-Memory Databases and Solid-State Drives presented at CIDR 2020 [pdf] [slides].

December 2019 – BigData and VLDB

November 2019 – ICTAI, KAIS, AAAI

October 2019 – CIDR, TKDE & ISWC

August 2019 – ISWC, KDD & Graph Embeddings

May 2019 – Best Paper Award & WWW2019

April 2019 – KDD, TSC & AAMAS

March 2019 – 3 New Journal Papers

February 2019 – Neurons, Clusters & Accelerated Queries

January 2019 – 3 research papers accepted at WWW2019

3 papers accepted at the Web Conf (WWW2019)! ActiveLink: Deep Active Learning for Link Prediction in Knowledge Graphs, Revisiting User Mobility and Social Relationships in LBSNs: A Hypergraph Embedding Approach, and Scalpel-CD: Leveraging Crowdsourcing and Deep Probabilistic Modeling for Debugging Noisy Training Data, all accepted as full research papers [acceptance: 18%]. PDFs coming soon. See you in San Francisco!

November 2018 – Paper Presentations

A Force-Directed Approach for Offline GPS Trajectory Map Matching presented at SIGSPATIAL 2018 [pdf][ppt] (joint work w/ heig-vd). And Clubmark: a Parallel Isolation Framework for Benchmarking and Profiling Clustering Algorithms on NUMA Architectures presented at ICDM 2018 [pdf].

October 2018 – Two New Papers

Are Meta-Paths Necessary? Revisiting Heterogeneous Graph Embeddings [pdf][ppt] presented at CIKM 2018 (acceptance 17%) and The Case For Network Accelerated Query Processing [pdf] accepted at CIDR 2019.

September 2018 – Accepted Papers

Back with more monthly news: A Force-Directed Approach for Offline GPS Trajectory Map Matching accepted at SIGSPATIAL 2018. And Clubmark: a Parallel Isolation Framework for Benchmarking and Profiling Clustering Algorithms on NUMA Architectures accepted at ICDM 2018. Full info and PDFs coming soon…

September 2017 – Two New Papers & Two New Members

Two new papers accepted: HistoSketch: Fast Similarity-Preserving Sketching of Streaming Histograms (ICDM 2017 [acceptance 9%]) and Efficient Document Filtering Using Vector Space Topic Expansion (CIKM 2017 [acceptance 21%]). Also, Julien and Akansha join our lab; welcome!

June 2017 – XI Alumni Are Taking Over the (IT) World

A new batch of XI graduates is leaving our lab: Djellel joins the Data Science team at NYU, Michele goes to Stanford, Alberto left for the brand new Swisscom AI Lab and Ruslan joins Apple (GraphLab team!). Good luck to everyone!

April 2017 – Journal Papers & New Members

Two new journal papers accepted: Storing, Tracking, and Querying Provenance Linked Data (Transactions on Knowledge and Data Engineering) and Managing Big Interval Data with CINTIA the Checkpoint INTerval Array (Transactions on Big Data). Also, Rana, Inès and Giuse join our lab; welcome!

March 2017 – Ph.D. & Postdoc Positions in Big Data

We have a number of new open positions, both for Ph.D. Candidates and for Postdoctoral Researchers. Interested in building the future of Big Data & Data Science? Please apply!

January 2017 – CIDR & WWW 2017

Dependency-driven analytics: a compass for uncharted data oceans [pdf] [slides] [Morning Paper summary] presented at CIDR 2017. And Predicting the Success of Online Petitions Leveraging Multidimensional Time-Series accepted at WWW 2017 (acceptance rate 17%). Hurray!

December 2016 – PhD Award for Roman

Roman wins the JMCS PhD Award for his dissertation. Congrats!

November 2016 – Alisa & Paolo

Alisa and Paolo join our team—welcome!

October 2016 – VoldemortKG presented @ISWC

VoldemortKG: Mapping schema.org and Web Entities to Linked Open Data [pdf] [slides] presented at ISWC

September 2016 – PrivCheck presented @UbiComp

PrivCheck: Privacy-Preserving Check-in Data Publishing for Personalized Location Based Services [pdf] presented at UBICOMP

August 2016 – POIsketch presented @IJCAI

POIsketch: Semantic Place Labeling over User Activity Streams [pdf] presented at IJCAI in New York

April 2016 – IJCAI Paper, WWW Presentation, and Open Positions!

POIsketch Semantic Place Labeler accepted at IJCAI [preprint pdf], slides from our WWW presentation on Crowd Scheduling, and a number of open Ph.D. positions!

February 2016 – ERC Grant

Phil wins a 2-million-euro ERC Grant, the much-coveted pan-European grant for frontier research! Press releases: fr/de/en.

January 2016 – New Paper Accepted @ JWS; MSc and PhD Awards!

Contextualized Ranking of Entity Types Based on Knowledge Graphs accepted for publication @ JWS, Special Issue on Knowledge Graphs [preprint pdf]. Djellel wins a prize for his PhD thesis on Hybrid Crowdsourcing Systems and Laura wins 2 prizes for her MSc on Anomaly Detection using Spark. Congrats!

December 2015 – New Paper Accepted @ WWW2016 and New Website

Scheduling Human Intelligence Tasks In Multi-Tenant Crowd-Powered Systems accepted for publication @ WWW2016 [acceptance: 16%] [pdf]. And we have a brand new website! Check it out here:

October 2015 – DiploCloud Accepted @ TKDE and 2 Big Events

DiploCloud (our RDF cloud solution) accepted for publication in TKDE [preprint pdf]! And two upcoming events not to be missed: the Swissnex Big Data Day in Fribourg and the International Conference on Web Engineering in Lugano.

September 2015 – Two Papers Accepted @ BigData

August 2015 – Continuous Evaluation & Keynote

June 2015 – Papers Accepted @ ISWC & VLDB; Best Thesis Award for Dingqi

May 2015 – Big Splash @ WWW2015 and Time Series

April 2015 – LDOW Papers and Big Bio Data

Uduvudu: an Adaptive UI Engine for Linked Data (pdf) and Fixing Linked Data by Context Disambiguation (pdf) accepted at LDOW 2015. And we make our first foray into Big Bio Data with 3DBG accepted for publication in Nucleic Acids Research (joint work w/ McGill University)!

March 2015 – Managing URIs on the Web of Data & Stonebraker's Award

February 2015 – BenchPress & New Member

BenchPress (joint work w/ CMU & Microsoft) demo accepted at SIGMOD; Phil is Area Editor for JWS and Group Leader @ SIGMOD 2016; and Dr. Mourad Khayati joins us from UZH. Welcome!

January 2015 – Two Papers Accepted at WWW 2015

Best way to start 2015: two research papers accepted at the 24th International World Wide Web Conference [acceptance: 14%]! Executing Provenance-Enabled Queries over Web Data and The Dynamics of Micro-Task Crowdsourcing – The Case of Amazon MTurk. Full PDFs coming soon…

October 2014 – Fully Funded Postdoc Position

The eXascale Infolab (U. of Fribourg–Switzerland) is hiring! We are looking for a highly qualified postdoctoral researcher in Computer Science interested in designing and developing novel information infrastructures to manage big data. See full job description here.

September 2014 – Paper Accepted at BigData 2014

New Smarter Cities paper: TRISTAN: Real-Time Analytics on Massive Time Series Using Sparse Dictionary Compression accepted at IEEE BigData 2014! [acceptance rate: 18%]. Joint work w/ IBM Research. Details here:

August 2014 – Papers Accepted at CIKM and VLDB

Our paper on fixing grammatical errors using large N-grams corpora and preposition ranking has been accepted at CIKM (IR track)! Also, TransactiveDB has been accepted at PVLDB. PDFs coming soon…

July 2014 – Scaling-up the Crowd Accepted at HCOMP

June 2014 – Verisign Distinguished Speaker Series and Journal Papers

Phil was invited for a second time to Verisign’s Distinguished Speaker Series. Also, two journal papers accepted: B-hist: Entity-centric search over personal web browsing history (Journal of Web Semantics) and the Entity Registry System (ERCIM News).

May 2014 – Big Data Slides

The slides from our CUSO Seminar on Big Data are now available online (see our blog post for details)!

April 2014 – WWW Presentations

Our WWW2014 presentations are now available online: NER, TripleProv, and Transactive Search. Also, we wrote a blog post with some of the highlights of the conference.

February 2014 – Transactive Search paper accepted at WWW... and new blog

Our Transactive Search paper has also been accepted at the International World Wide Web Conference (WWW2014 Web Science track)! Also, we now have a blog, check it out here:

January 2014 – Two Papers Accepted at WWW

December 2013 – Verisign Labs Distinguished Speakers Series

Slides from Prof. Cudré-Mauroux’s talk on ERS (the Entity Registry System) are online on slideshare [Verisign Labs Distinguished Speakers Series, Internet Infrastructures Grant Winner]

October 2013 – OLTP Benchmark & TRank

OLTPBench (joint work w/ CMU and Microsoft) accepted at PVLDB [pdf]! Also, TRank goes open-source and is nominated for best-paper at ISWC [pdf, ppt, github].

August 2013 – Google Award, Big Data & Ruslan

We won a Google Faculty Research Award [Press release: French / German]. Also, our joint paper w/ Verisign on High Velocity Streams has been accepted at IEEE BigData 2013. Last but not least, Ruslan joins us from Yandex. Welcome!

July 2013 – Two Papers Accepted at ISWC

Two new papers accepted at ISWC 2013: TRank: Ranking Entity Types Using the Web of Data [Research Track] and NoSQL Databases for RDF: An Empirical Evaluation [Evaluation Track]. See you soon in Sydney!

June 2013 – New VLDB J. and Internet Computing Articles

Our new articles on Large-Scale Linked Data Integration Using Probabilistic Reasoning and Crowdsourcing and on Scalable Anomaly Detection for Smart City Infrastructure Networks [joint work w/ the IBM Research Smarter Cities Centre] have been accepted for publication by the VLDB Journal and by IEEE Internet Computing.

May 2013 – Heidelberg Laureate Forum & Crowdsourcing Tutorial

Both Gianluca and Martin got selected to attend the Heidelberg Laureate Forum. Thirty-eight Abel, Fields and Turing award winners will [also] be there!

Also, Gianluca’s tutorial slides on Crowdsourcing for the Semantic Web are available.

February 2013 – Paper Accepted at WWW2013

New paper accepted at WWW2013 (acceptance: 15%): Pick-A-Crowd: Tell Me What You Like, and I’ll Tell You What to Do.

January 2013 – CIDR Presentation

Gianluca presents his CrowdQ paper at CIDR; Blog Post about it, ppt slides.

December 2012 – Paper Accepted at ECIR

New ScienceWise paper on Ontology-Based Word Sense Disambiguation for Scientific Literature accepted at ECIR 2013.

November 2012 – New XI Member

Dr. Martin Grund from HPI joins us, supported by a generous research grant from SAP. Welcome Martin!

September 2012 – Swiss Computer Science Prize

Prof. Cudré-Mauroux won the Swiss National Center 2001-2012 Research in Computer Science award. Link to the MIC event this September.

August 2012 – Upcoming Events

Two exciting events on the horizon: ISWC 2012 in Boston and DESWEB 2013 in Brisbane (Prof. Cudré-Mauroux co-chairs both of them). Don’t miss them!

July 2012 – Verisign Research Grant

We are ecstatic to have won one of the two global research grants from Verisign Inc. Press release here.

June 2012 – Swiss Transportation Android App

Living in Switzerland? Then don’t miss Roman’s Android app offering timetables for Swiss public transportation. Available for free on Google Play.

April 2012 – For a Few Papers More

Combining Inverted Indices and Structured Search for Ad-hoc Object Retrieval accepted at SIGIR. An overview of HYRISE in IEEE Data Eng. Bull. Downscaling Entity Registries with VUA and Verisign at DOWNSCALE. Graph Data Management Techniques for the large-scale deployment of Semantic Web technologies invited paper at GDM.

March 2012 – Presentations

Fresh arrival, slide decks on the menu: Entity-Centric Data Management (presentation given at MIT CSAIL, DERI, and IBM Research Smarter Cities Centre); tutorial on Semantic Search (by Demartini et al. at ECIR); tutorial on Scalability in Semantic Data Management (Schloss Dagstuhl).

February 2012 – OLTPBench

Want to benchmark relational or cloud databases? Here is our one-stop, open-source solution: OLTPBench. We hope you’ll like it as much as we do! This is joint work w/ Carlo Curino [Yahoo! Research] and Andy Pavlo [Brown].

January 2012 – ZenCrowd Accepted at WWW

Our latest foray into Online Entity territory was accepted at the World Wide Web conference! [acceptance rate: 12%]

ZenCrowd: Leveraging Probabilistic Reasoning and Crowdsourcing Techniques for Large-Scale Entity Linking


We tackle the problem of entity linking for large collections of online pages. Our system, ZenCrowd, identifies entities from natural language text using state-of-the-art techniques and automatically connects them to the Linked Open Data cloud. We show how one can take advantage of human intelligence to improve the quality of the links by dynamically generating micro-tasks on an online crowdsourcing platform. We develop a probabilistic framework to make sensible decisions about candidate links and to identify unreliable human workers. We evaluate ZenCrowd in a real deployment and show how a combination of both probabilistic reasoning and crowdsourcing techniques can significantly improve the quality of the links, while limiting the amount of work performed by the crowd.

Gianluca Demartini, Djellel Eddine Difallah, Philippe Cudré-Mauroux
21st International World Wide Web Conference (WWW2012), Lyon (France), April 16-20, 2012.

Links to the ZenCrowd page and to the pdf of the paper.