News

July 2025 – ImputeGAP is out

ImputeGAP library v1.1.1 is officially released! [docs]

2025-07-03

April 2025 – ICDE Paper

A full research paper on model selection for time series data accepted to ICDE 2025 [pdf]

2025-04-02

December 2024 – VLDB Journal Paper

A survey of multimodal event detection based on data fusion published at VLDBJ 2024 [PDF]

2024-12-15

August 2024 – VLDB Demo

SEER: An End-to-End Toolkit for Benchmarking Time Series Database Systems in Monitoring Applications presented at VLDB 2024 [PDF] [GUI]

2024-08-30

July 2024 – Full paper accepted at MICRO 2024

Paper accepted at the 57th IEEE/ACM International Symposium on Microarchitecture: BABOL: A Software-Defined NAND Flash Controller (PDF coming soon)

2024-07-17

June 2024 – Two papers presented at SDS2024 and VLDB Demo

Two short papers presented at SDS2024: Cleaning Semi-Structured Errors in Open Data Using Large Language Models (preprint pdf) and Leveraging Pre-Trained Extreme Multi-Label Classifiers for Zero-Shot Learning (preprint pdf)
and SEER: An End-to-End Toolkit for Benchmarking Time Series Database Systems in Monitoring Applications accepted to VLDB 2024

2024-06-01

May 2024 – One Paper Presented at WWW and One Vision Paper Accepted at VLDB

Research Paper presented at the 33rd International World Wide Web Conference: Follow the Path: Hierarchy-Aware Extreme Multi-Label Completion for Semantic Text Tagging pdf
Vision Paper accepted at the 50th International Conference on Very Large Data Bases: CXL and the Return of Scale-Up Database Engines pdf

2024-05-15

April 2024 – Two articles published in TKDE

Two new research articles published in Transactions on Knowledge and Data Engineering:

Schema-Aware Hyper-Relational Knowledge Graph Embeddings for Link Prediction (preprint pdf)
Fast and Slow Thinking: A Two-Step Schema-Aware Approach for Instance Completion in Knowledge Graphs (preprint pdf)

2024-04-15

October 2023 – CIDR Paper Accepted

Research Paper accepted to the 14th International Conference on Innovative Data Systems Research (CIDR’24): Database Kernels: Seamless Integration of Database Systems and Fast Storage via CXL pdf

2023-10-06

August 2023 – VLDB Paper Accepted

Research Paper accepted to the 49th International Conference on Very Large Data Bases: TSM-Bench: Benchmarking Time Series Database Systems for Monitoring Applications pdf

2023-08-02

May 2023 – VLDB Tutorial Accepted

Tutorial in partnership with Intel, Microsoft, Google, and TU Darmstadt accepted for publication in the 49th International Conference on Very Large Data Bases: Databases on Modern Networks: A Decade of Research that now comes into Practice. pdf

2023-05-28

April 2023 – WWW 2023 Paper

TaxoComplete: Self-Supervised Taxonomy Completion Leveraging Position-Enhanced Semantic Matching presented at the ACM Web Conference 2023 in Austin [preprint pdf]

2023-04-30

March 2023 – SIGMOD Paper Accepted

Paper accepted for publication in ACM SIGMOD/PODS International Conference on Management of Data: GraphINC: Graph Pattern Mining at Network Speed. preprint pdf

2023-03-28

December 2022 – 3 Papers Presented at IEEE Big Data

3 papers presented at IEEE Big Data 2022 in Japan: Typhon: Parallel Transfer on Heterogeneous Datasets for Cancer Detection, ParaGraph: Mapping Wikidata Tail Entities to Wikipedia Paragraphs, and Leveraging Knowledge Graph Embeddings to Disambiguate Author Names in Scientific Data!

2022-12-02

September 2022 – Two TKDE Papers

Two papers accepted for publication in Transactions on Knowledge and Data Engineering: Nessy: a Neuro-Symbolic System for Label Noise Reduction [preprint pdf] and Human-in-the-Loop Rule Discovery for Micropost Event Detection [preprint pdf]. Congrats Alisa & Akansha!

2022-09-28

July 2022 – DiBB Distributed Black-Box Optimization Framework

Our groundbreaking Distributed Black-Box Optimization framework, DiBB, was presented at GECCO 2022 in Boston! [PDF] [github]

2022-07-07

May 2022 – VLDB

DBMS Annihilator: A High-Performance Database Workload Generator in Action, demo paper accepted at the Forty-Eighth International Conference on Very Large Data Bases [VLDB 2022]

Full pdfs (& more) soon on https://exascale.info.

2022-05-30

March 2022 – SIGMOD

X-SSD: A Storage System with Native Support for Database Logging and Replication, research paper accepted at the ACM SIGMOD/PODS International Conference on Management of Data [SIGMOD 2022] [PDF]

2022-03-08

December 2021 – CIDR

D-RDMA: Bringing Zero-Copy RDMA to Database Systems, research paper accepted at the Twelfth Conference on Innovative Data Systems Research [CIDR 2022] [PDF]

2021-01-01

April 2021 – VLDB and SIGMOD

In-Network Support for Transaction Triaging, research paper accepted at the Forty-Seventh International Conference on Very Large Data Bases [VLDB 2021], and
Not Your Grandpa’s SSD: The Era of Co-Designed Storage Devices tutorial accepted at the ACM SIGMOD/PODS International Conference on Management of Data [SIGMOD 2021]

Full pdfs (& more) soon on https://exascale.info.

2021-01-01

January 2021 – three full papers accepted at WWW 2021!

Three full research papers accepted at TheWebConf (WWW2021)! (acceptance 20%)

Peer Grading the Peer Reviews: A Dual-Role Approach for Lightening Scholarly Paper Review Processes,
RETA: A Schema-Aware, End-to-End Solution for Instance Completion in Knowledge Graphs, and
Wiki2Prop: A Multi-Modal Approach for Predicting Wikidata Properties from Wikipedia.

Full pdfs (& more) soon on https://exascale.info.

2021-01-01

December 2020 – VLDB & AAAI

ORBITS: Online Recovery of Missing Values in Multiple Time Series Streams [pdf] accepted at the Forty-Seventh International Conference on Very Large Data Bases [VLDB 2021]
and MARTA: Integrating Human Rationales for Explainable Text Classification [pdf] accepted at the Thirty-Fifth AAAI Conference on Artificial Intelligence [AAAI 2021] (acceptance: 21%).

2020-12-01

October 2020 – BigData & CIKM

The Best of Both Worlds: Context-Powered Word Embedding Combinations for Longitudinal Text Analysis [pdf] as well as Hydra: Cancer Detection Leveraging Multiple Heads and Heterogeneous Datasets [pdf] accepted at IEEE BigData 2020
and the ACM International Conference on Information and Knowledge Management [CIKM 2020] is taking place for the very first time online! (Prof. Cudré-Mauroux is PC Chair of the conference this year).

2020-10-01

September 2020 – Most Reproducible Paper Award

We are honored to have been granted the Most Reproducible Paper Award at the VLDB 2020 Conference. Congratulations Mourad Khayati, Alberto Lerner, Zakhar Tymchenko, and Philippe Cudré-Mauroux. Our paper is available here.

2020-09-01

June 2020 – Best Paper Award

We won the Best Paper Award at the Swiss Data Science Conference (SDS 2020)! Thumbs up to our master's student Yasamin Eslahi, my two Ph.D. students Paolo Rosso and Akansha Bhardwaj, and our great collaborator Kurt Stockinger. Our paper is available here.

2020-06-01

May 2020 – Paper published in TKDE

LBSN2Vec++: Heterogeneous Hypergraph Embedding for Location-Based Social Networks [PDF] published in IEEE Transactions on Knowledge and Data Engineering (TKDE).

2020-05-01

April 2020 – WWW, SDS & IJCAI

Beyond Triplets: Hyper-Relational Knowledge Graph Embedding for Link Prediction [PDF] [slides] and OpenCrowd: A Human-AI Collaborative Approach for Finding Social Influencers via Open-Ended Answers Aggregation [PDF] [video] presented at WWW 2020;
Annotating Web Tables through Knowledge Bases: A Context-Based Approach accepted at SDS 2020, and Location Prediction over Sparse User Mobility Traces using RNNs accepted at IJCAI 2020 (acceptance: 12%). Congrats everyone!

2020-03-01

March 2020 – IEEE Data Engineering Bulletin

Networking and Storage: The Next Computing Elements in Exascale Systems? [PDF] published in the March 2020 Data Management at Exascale issue of the IEEE Data Engineering Bulletin.

2020-03-01

February 2020 – AAAI 2020

A Human-AI Loop Approach for Joint Keyword Discovery and Expectation Estimation in Micropost Event Detection [preprint pdf] presented at AAAI 2020 in New York.

2020-02-01

January 2020 – The Web Conf. and CIDR

Two full research papers accepted at The Web Conf. (WWW2020) (acceptance: 19%):

OpenCrowd: A Human-AI Collaborative Approach for Finding Social Influencers via Open-Ended Answers Aggregation [preprint pdf] and
Beyond Triplets: Hyper-Relational Knowledge Graph Embedding for Link Prediction [preprint pdf].

And It Takes Two: Instrumenting the Interaction between In-Memory Databases and Solid-State Drives presented at CIDR 2020 [pdf] [slides].

2020-01-01

December 2019 – BigData and VLDB

Four papers presented at IEEE BigData 2019:

2019-12-01

November 2019 – ICTAI, KAIS, AAAI

Fusing Vector Space Models for Domain-Specific Applications presented at ICTAI 2019
Scalable recovery of missing blocks in time series with high and low cross-correlations published in Knowledge and Information Systems and
A Human-AI Loop Approach for Joint Keyword Discovery and Expectation Estimation in Micropost Event Detection accepted at AAAI 2020 [acceptance: 20%]

2019-11-01

October 2019 – CIDR, TKDE & ISWC

It Takes Two: Instrumenting the Interaction between In-Memory Databases and Solid-State Drives accepted at CIDR 2020
Event Detection on Microposts: a Comparison of Four Approaches published in TKDE and
Non-Parametric Class Completeness Estimators for Knowledge Graphs presented at ISWC 2019

2019-10-01

August 2019 – ISWC, KDD & Graph Embeddings

Non-Parametric Class Completeness Estimators for Knowledge Graphs accepted at ISWC 2019 [acceptance: 21%]
NodeSketch: Highly-Efficient Graph Embeddings via Recursive Sketching presented at KDD 2019 [acceptance: 14%]
And slides from our recent graph embeddings invited talk uploaded.

2019-08-01

May 2019 – Best Paper Award & WWW2019

Our 6 Neurons paper wins the Best Paper Award at AAMAS2019!
And 3 papers presented at WWW2019 in San Francisco: Scalpel-CD for Debugging Noisy Training Data, Deep Active Learning for Link Prediction, and Hypergraph Embeddings for LBSNs.

2019-05-01

April 2019 – KDD, TSC & AAMAS

NodeSketch: Highly-Efficient Graph Embeddings via Recursive Sketching accepted at KDD 2019 [acceptance: 14%]
Deadline-Aware Fair Scheduling for Multi-Tenant Crowd-Powered Systems published in ACM TSC and
Playing Atari with Six Neurons nominated for best paper award at AAMAS 2019!

2019-04-01

March 2019 – 3 New Journal Papers

Three new journal papers accepted:

2019-03-01

February 2019 – Neurons, Clusters & Accelerated Queries

Playing Atari with Six Neurons accepted at AAMAS 2019. Accuracy Evaluation of Overlapping and Multi-Resolution Clustering Algorithms on Large Datasets presented at BigComp 2019, and The Case For Network-Accelerated Query Processing presented at CIDR 2019.

2019-02-01

January 2019 – 3 research papers accepted at WWW2019

3 papers accepted at the Web Conf (WWW2019)! ActiveLink: Deep Active Learning for Link Prediction in Knowledge Graphs, Revisiting User Mobility and Social Relationships in LBSNs: A Hypergraph Embedding Approach, and Scalpel-CD: Leveraging Crowdsourcing and Deep Probabilistic Modeling for Debugging Noisy Training Data, all accepted as full research papers [acceptance: 18%]. PDFs coming soon. See you in San Francisco!

2019-01-01

November 2018 – Paper Presentations

A Force-Directed Approach for Offline GPS Trajectory Map Matching presented at SIGSPATIAL 2018 [pdf][ppt] (joint work w/ heig-vd). And Clubmark: a Parallel Isolation Framework for Benchmarking and Profiling Clustering Algorithms on NUMA Architectures presented at ICDM 2018 [pdf].

2018-11-01

October 2018 – Two New Papers

Are Meta-Paths Necessary? Revisiting Heterogeneous Graph Embeddings [pdf][ppt] presented at CIKM 2018 (acceptance 17%) and The Case For Network Accelerated Query Processing [pdf] accepted at CIDR 2019

2018-10-01

September 2018 – Accepted Papers

Back with more monthly news: A Force-Directed Approach for Offline GPS Trajectory Map Matching accepted at SIGSPATIAL 2018. And Clubmark: a Parallel Isolation Framework for Benchmarking and Profiling Clustering Algorithms on NUMA Architectures accepted at ICDM 2018. Full info and PDFs coming soon…

2018-09-01

September 2017 – Two New Papers & Two New Members

Two new papers accepted: HistoSketch: Fast Similarity-Preserving Sketching of Streaming Histograms (ICDM 2017 [acceptance 9%]) and Efficient Document Filtering Using Vector Space Topic Expansion (CIKM 2017 [acceptance 21%]). Also, Julien and Akansha join our lab; welcome!

2017-09-01

June 2017 – XI Alumni Are Taking Over the (IT) World

A new batch of XI graduates is leaving our lab: Djellel joins the Data Science team at NYU, Michele goes to Stanford, Alberto left for the brand new Swisscom AI Lab and Ruslan joins Apple (GraphLab team!). Good luck to everyone!

2017-06-28

April 2017 – Journal Papers & New Members

Two new journal papers accepted: Storing, Tracking, and Querying Provenance Linked Data (Transactions on Knowledge and Data Engineering) and Managing Big Interval Data with CINTIA the Checkpoint INTerval Array (Transactions on Big Data). Also, Rana, Inès and Giuse join our lab; welcome!

2017-04-01

March 2017 – Ph.D. & Postdoc Positions in Big Data

We have a number of new open positions, both for Ph.D. Candidates and for Postdoctoral Researchers. Interested in building the future of Big Data & Data Science? Please apply!

2017-03-01

January 2017 – CIDR & WWW 2017

Dependency-driven analytics: a compass for uncharted data oceans [pdf] [slides] [Morning Paper summary] presented at CIDR 2017. And Predicting the Success of Online Petitions Leveraging Multidimensional Time-Series accepted at WWW 2017 (acceptance rate 17%). Hurray!

2017-01-01

December 2016 – PhD Award for Roman

Roman wins the JMCS PhD Award for his dissertation. Congrats!

2016-12-01

November 2016 – Alisa & Paolo

Alisa and Paolo join our team—welcome!

2016-11-01

October 2016 – VoldemortKG presented @ISWC

VoldemortKG: Mapping Schema.org and Web Entities to Linked Open Data [pdf] [slides] presented at ISWC

2016-10-01

September 2016 – PrivCheck presented @UbiComp

PrivCheck: Privacy-Preserving Check-in Data Publishing for Personalized Location Based Services [pdf] presented at UBICOMP

2016-09-01

August 2016 – POIsketch presented @IJCAI

POIsketch: Semantic Place Labeling over User Activity Streams [pdf] presented at IJCAI in New York

2016-08-01

April 2016 – IJCAI Paper, WWW Presentation, and Open Positions!

POIsketch Semantic Place Labeler accepted at IJCAI [preprint pdf], slides from our WWW presentation on Crowd Scheduling, and a number of open Ph.D. positions!

2016-04-01

February 2016 – ERC Grant

Phil wins a 2-million-euro ERC Grant, the much-coveted pan-European grant for frontier research! Press releases: fr/de/en.

2016-02-01

January 2016 – New Paper Accepted @ JWS; MSc and PhD Awards!

Contextualized Ranking of Entity Types Based on Knowledge Graphs accepted for publication @ JWS, Special Issue on Knowledge Graphs [preprint pdf]. Djellel wins a prize for his PhD thesis on Hybrid Crowdsourcing Systems and Laura wins 2 prizes for her MSc on Anomaly Detection using Spark. Congrats!

2016-01-01

December 2015 – New Paper Accepted @ WWW2016 and New Website

Scheduling Human Intelligence Tasks In Multi-Tenant Crowd-Powered Systems accepted for publication @ WWW2016 [acceptance: 16%] [pdf]. And we have a brand new website! Check it out here: exascale.info.

2015-12-01

October 2015 – DiploCloud Accepted @ TKDE and 2 Big Events

DiploCloud (our RDF cloud solution) accepted for publication in TKDE [preprint pdf]! And two upcoming events not to be missed: the Swissnex Big Data Day in Fribourg and the International Conference on Web Engineering in Lugano.

2015-10-01

September 2015 – Two Papers Accepted @ BigData

Two new papers accepted at IEEE BigData 2015: Online Anomaly Detection over Big Data Streams [pdf] and CINTIA: a Distributed, Low-Latency Index for Big Interval Data [pdf] [acceptance: 17%].

2015-09-01

August 2015 – Continuous Evaluation & Keynote

Pooling-Based Continuous Evaluation of Information Retrieval Systems accepted for publication in Information Retrieval; Phil’s keynote @ ICDAR 2015 on Entity-Centric Data Management is now available.

2015-08-01

June 2015 – Papers Accepted @ ISWC & VLDB; Best Thesis Award for Dingqi

SANAPHOR: Ontology-Based Coreference Resolution accepted @ ISWC 2015 (acceptance 22%); A Demonstration of TripleProv accepted @ VLDB 2015; and our newest member, Dingqi, wins the Best Thesis Award @ Samovar/Télécom SudParis–congrats!

2015-06-01

May 2015 – Big Splash @ WWW2015 and Time Series

XI makes a big splash at WWW2015 (blog post, slides, slides, slides, slides)! Also, Using Lowly Correlated Time Series to Recover Missing Values in Time Series accepted at SSTD 2015 (joint work w/ UZH; pdf).

2015-05-01

April 2015 – LDOW Papers and Big Bio Data

Uduvudu: an Adaptive UI Engine for Linked Data (pdf) and Fixing Linked Data by Context Disambiguation (pdf) accepted at LDOW 2015. And we make our first foray into Big Bio Data with 3DBG accepted for publication in Nucleic Acids Research (joint work w/ McGill University)!

2015-04-01

March 2015 – Managing URIs on the Web of Data & Stonebraker's Award

A Comparison of Data Structures to Manage URIs on the Web of Data accepted at ESWC 2015 (acceptance 23%). And our friend from MIT Mike Stonebraker wins the Turing Award! Huge.

2015-03-01

February 2015 – BenchPress & New Member

BenchPress (joint work w/ CMU & Microsoft) demo accepted at SIGMOD; Phil is Area Editor for JWS and Group Leader @ SIGMOD 2016; and Dr. Mourad Khayati joins us from UZH. Welcome!

2015-02-01

January 2015 – Two Papers Accepted at WWW 2015

Best way to start 2015: two research papers accepted at the 24th International World Wide Web Conference [acceptance: 14%]! Executing Provenance-Enabled Queries over Web Data and The Dynamics of Micro-Task Crowdsourcing – The Case of Amazon MTurk. Full PDFs coming soon…

2015-01-01

October 2014 – Fully Funded Postdoc Position

The eXascale Infolab (U. of Fribourg–Switzerland) is hiring! We are looking for a highly qualified postdoctoral researcher in Computer Science interested in designing and developing novel information infrastructures to manage big data. See full job description here.

2014-10-01

September 2014 – Paper Accepted at BigData 2014

New Smarter Cities paper: TRISTAN: Real-Time Analytics on Massive Time Series Using Sparse Dictionary Compression accepted at IEEE BigData 2014! [acceptance rate: 18%]. Joint work w/ IBM Research. Details here: https://exascale.info/node/286

2014-09-01

August 2014 – Papers Accepted at CIKM and VLDB

Our paper on fixing grammatical errors using large N-grams corpora and preposition ranking has been accepted at CIKM (IR track)! Also, TransactiveDB has been accepted at PVLDB. PDFs coming soon…

2014-08-01

July 2014 – Scaling-up the Crowd Accepted at HCOMP

Scaling-up the Crowd: Micro-Task Pricing Schemes for Worker Retention and Latency Improvement accepted at HCOMP 2014. See you in Pittsburgh this Fall!

2014-07-01

June 2014 – Verisign Distinguished Speaker Series and Journal Papers

Phil was invited for a second time to Verisign’s Distinguished Speaker Series. Also, two journal papers accepted: B-hist: Entity-centric search over personal web browsing history (Journal of Web Semantics) and the Entity Registry System (ERCIM News).

2014-06-01

May 2014 – Big Data Slides

The slides from our CUSO Seminar on Big Data are now available online (see our blog post for details)!

2014-05-01

April 2014 – WWW Presentations

Our WWW2014 presentations are now available online: NER, TripleProv, and Transactive Search. Also, we wrote a blog post with some of the highlights of the conference.

2014-04-01

February 2014 – Transactive Search paper accepted at WWW... and new blog

Our Transactive Search paper has also been accepted at the International World Wide Web Conference (WWW2014 Web Science track)! Also, we now have a blog, check it out here: exascale.info/blog/.

2014-02-01

January 2014 – Two Papers Accepted at WWW

Best way to start this new year: two XI papers accepted at WWW 2014! Effective Named Entity Recognition for Idiosyncratic Web Collections, and TripleProv: Efficient Processing of Lineage Queries over a Native RDF Store [acceptance rate 12.9%].

2014-01-01

December 2013 – Verisign Labs Distinguished Speakers Series

Slides from Prof. Cudré-Mauroux’s talk on ERS (the Entity Registry System) are online on slideshare [Verisign Labs Distinguished Speakers Series, Internet Infrastructures Grant Winner]

2013-12-01

October 2013 – OLTP Benchmark & TRank

OLTPBench (joint work w/ CMU and Microsoft) accepted at PVLDB [pdf]! Also, TRank goes open-source and is nominated for best-paper at ISWC [pdf, ppt, github].

2013-10-01

August 2013 – Google Award, Big Data & Ruslan

We won a Google Faculty Research Award [Press release: French / German]. Also, our joint paper w/ Verisign on High Velocity Streams has been accepted at IEEE BigData 2013. Last but not least, Ruslan joins us from Yandex. Welcome!

2013-08-01

July 2013 – Two Papers Accepted at ISWC

Two new papers accepted at ISWC 2013: TRank: Ranking Entity Types Using the Web of Data [Research Track] and NoSQL Databases for RDF: An Empirical Evaluation [Evaluation Track]. See you soon in Sydney!

2013-07-01

June 2013 – New VLDB J. and Internet Computing Articles

Our new articles on Large-Scale Linked Data Integration Using Probabilistic Reasoning and Crowdsourcing and on Scalable Anomaly Detection for Smart City Infrastructure Networks [joint work w/ the IBM Research Smarter Cities Centre] have been accepted for publication by the VLDB Journal and by IEEE Internet Computing.

2013-06-01

May 2013 – Heidelberg Laureate Forum & Crowdsourcing Tutorial

Both Gianluca and Martin got selected to attend the Heidelberg Laureate Forum. Thirty-eight Abel, Fields and Turing award winners will [also] be there!

Also, Gianluca’s tutorial slides on Crowdsourcing for the Semantic Web are available.

2013-05-01

February 2013 – Paper Accepted at WWW2013

New paper accepted at WWW2013 (acceptance:15%): Pick-A-Crowd: Tell Me What You Like, and I’ll Tell You What to Do.

2013-02-01

January 2013 – CIDR Presentation

Gianluca presents his CrowdQ paper at CIDR; Blog Post about it, ppt slides.

2013-01-01

December 2012 – Paper Accepted at ECIR

New ScienceWise paper on Ontology-Based Word Sense Disambiguation for Scientific Literature accepted at ECIR 2013.

2012-12-01

November 2012 – New XI Member

Dr. Martin Grund from HPI joins us, supported by a generous research grant from SAP. Welcome Martin!

2012-11-01

September 2012 – Swiss Computer Science Prize

Prof. Cudré-Mauroux won the Swiss National Center 2001-2012 Research in Computer Science award. Link to the MIC event this September.

2012-09-01

August 2012 – Upcoming Events

Two exciting events on the horizon: ISWC 2012 in Boston and DESWEB 2013 in Brisbane (Prof. Cudré-Mauroux co-chairs both of them). Don’t miss them!

2012-08-01

July 2012 – Verisign Research Grant

We are ecstatic to have won one of the two global research grants from Verisign Inc. Press release here.

2012-07-01

June 2012 – Swiss Transportation Android App

Living in Switzerland? Then don’t miss Roman’s Android app offering timetables for Swiss public transportation. Available for free on Google Play.

2012-06-01

April 2012 – For a Few Papers More

Combining Inverted Indices and Structured Search for Ad-hoc Object Retrieval accepted at SIGIR. An overview of HYRISE in IEEE Data Eng. Bull. Downscaling Entity Registries with VUA and Verisign at DOWNSCALE. Graph Data Management Techniques for the large-scale deployment of Semantic Web technologies invited paper at GDM.

2012-04-01

March 2012 – Presentations

Fresh arrival, slide decks on the menu: Entity-Centric Data Management (presentation given at MIT CSAIL, DERI, and IBM Research Smarter Cities Centre); tutorial on Semantic Search (by Demartini et al. at ECIR); tutorial on Scalability in Semantic Data Management (Schloss Dagstuhl).

2012-03-01

February 2012 – OLTPBench

Want to benchmark relational or cloud databases? Here is our one-stop, open-source solution: OLTPBench. We hope you’ll like it as much as we do! This is joint work w/ Carlo Curino [Yahoo! Research] and Andy Pavlo [Brown].

2012-02-01

January 2012 – ZenCrowd Accepted at WWW

Our latest foray into Online Entity territory was accepted at the World Wide Web conference! [acceptance rate: 12%]

ZenCrowd: Leveraging Probabilistic Reasoning and Crowdsourcing Techniques for Large-Scale Entity Linking

Abstract:

We tackle the problem of entity linking for large collections of online pages; our system, ZenCrowd, identifies entities from natural language text using state-of-the-art techniques and automatically connects them to the Linked Open Data cloud. We show how one can take advantage of human intelligence to improve the quality of the links by dynamically generating micro-tasks on an online crowdsourcing platform. We develop a probabilistic framework to make sensible decisions about candidate links and to identify unreliable human workers. We evaluate ZenCrowd in a real deployment and show how a combination of both probabilistic reasoning and crowdsourcing techniques can significantly improve the quality of the links, while limiting the amount of work performed by the crowd.

Gianluca Demartini, Djellel Eddine Difallah, Philippe Cudré-Mauroux
21st International World Wide Web Conference (WWW2012), Lyon (France), April 16-20, 2012.

Links to the ZenCrowd page and to the pdf of the paper.

2012-01-01