News
August 2024 – VLDB Demo
July 2024 – Full paper accepted at MICRO 2024
- Paper accepted at the 57th IEEE/ACM International Symposium on Microarchitecture: BABOL: A Software-Defined NAND Flash Controller (PDF coming soon)
June 2024 – Two papers presented at SDS2024 and VLDB Demo
- Two short papers presented at SDS2024: Cleaning Semi-Structured Errors in Open Data Using Large Language Models (preprint pdf) and Leveraging Pre-Trained Extreme Multi-Label Classifiers for Zero-Shot Learning (preprint pdf)
- and SEER: An End-to-End Toolkit for Benchmarking Time Series Database Systems in Monitoring Applications accepted to VLDB 2024
May 2024 – One Paper Presented at WWW and One Vision Paper Accepted at VLDB
- Research Paper presented at the 33rd International World Wide Web Conference: Follow the Path: Hierarchy-Aware Extreme Multi-Label Completion for Semantic Text Tagging pdf
- Vision Paper accepted at the 50th International Conference on Very Large Data Bases: CXL and the Return of Scale-Up Database Engines pdf
April 2024 – Two articles published in TKDE
Two new research articles published in Transactions on Knowledge and Data Engineering:
- Schema-Aware Hyper-Relational Knowledge Graph Embeddings for Link Prediction (preprint pdf)
- Fast and Slow Thinking: A Two-Step Schema-Aware Approach for Instance Completion in Knowledge Graphs (preprint pdf)
October 2023 – CIDR Paper Accepted
- Research Paper accepted to the 14th International Conference on Innovative Data Systems Research (CIDR’24): Database Kernels: Seamless Integration of Database Systems and Fast Storage via CXL pdf
August 2023 – VLDB Paper Accepted
- Research Paper accepted to the 49th International Conference on Very Large Data Bases: TSM-Bench: Benchmarking Time Series Database Systems for Monitoring Applications pdf
May 2023 – VLDB Tutorial Accepted
- Tutorial in partnership with Intel, Microsoft, Google, and TU Darmstadt accepted for publication in the 49th International Conference on Very Large Data Bases: Databases on Modern Networks: A Decade of Research that now comes into Practice. pdf
April 2023 – WWW 2023 Paper
- TaxoComplete: Self-Supervised Taxonomy Completion Leveraging Position-Enhanced Semantic Matching presented at the ACM Web Conference 2023 in Austin [preprint pdf]
March 2023 – SIGMOD Paper Accepted
- Paper accepted for publication in ACM SIGMOD/PODS International Conference on Management of Data: GraphINC: Graph Pattern Mining at Network Speed. preprint pdf
December 2022 – 3 Papers Presented at IEEE Big Data
September 2022 – Two TKDE Papers
- Two papers accepted for publication in Transactions on Knowledge and Data Engineering: Nessy: a Neuro-Symbolic System for Label Noise Reduction [preprint pdf] and Human-in-the-Loop Rule Discovery for Micropost Event Detection [preprint pdf]. Congrats Alisa & Akansha!
July 2022 – DiBB Distributed Black-Box Optimization Framework
- Our groundbreaking Distributed Black-Box Optimization framework, DiBB, was presented at GECCO 2022 in Boston! [PDF] [github]
May 2022 – VLDB
- DBMS Annihilator: A High-Performance Database Workload Generator in Action, demo paper accepted at the Forty-Eighth International Conference on Very Large Data Bases [VLDB 2022]
Full pdfs (& more) soon on https://exascale.info.
March 2022 – SIGMOD
- X-SSD: A Storage System with Native Support for Database Logging and Replication, research paper accepted at the ACM SIGMOD/PODS International Conference on Management of Data [SIGMOD 2022] [PDF]
December 2021 – CIDR
April 2021 – VLDB and SIGMOD
- In-Network Support for Transaction Triaging, research paper accepted at the Forty-Seventh International Conference on Very Large Data Bases [VLDB 2021], and
- Not Your Grandpa’s SSD: The Era of Co-Designed Storage Devices tutorial accepted at the ACM SIGMOD/PODS International Conference on Management of Data [SIGMOD 2021]
Full pdfs (& more) soon on https://exascale.info.
January 2021 – three full papers accepted at WWW 2021!
Three full research papers accepted at TheWebConf (WWW2021)! (acceptance 20%)
- Peer Grading the Peer Reviews: A Dual-Role Approach for Lightening Scholarly Paper Review Processes,
- RETA: A Schema-Aware, End-to-End Solution for Instance Completion in Knowledge Graphs, and
- Wiki2Prop: A Multi-Modal Approach for Predicting Wikidata Properties from Wikipedia.
Full pdfs (& more) soon on https://exascale.info.
December 2020 – VLDB & AAAI
- ORBITS: Online Recovery of Missing Values in Multiple Time Series Streams [pdf] accepted at the Forty-Seventh International Conference on Very Large Data Bases [VLDB 2021]
- and MARTA: Integrating Human Rationales for Explainable Text Classification [pdf] accepted at the Thirty-Fifth AAAI Conference on Artificial Intelligence [AAAI 2021] (acceptance: 21%).
October 2020 – BigData & CIKM
- The Best of Both Worlds: Context-Powered Word Embedding Combinations for Longitudinal Text Analysis [pdf] as well as Hydra: Cancer Detection Leveraging Multiple Heads and Heterogeneous Datasets [pdf] accepted at IEEE BigData 2020
- and the ACM International Conference on Information and Knowledge Management [CIKM 2020] is taking place for the very first time online! (Prof. Cudré-Mauroux is PC Chair of the conference this year).
September 2020 – Most Reproducible Paper Award
We are honored to have been granted the Most Reproducible Paper Award at the VLDB 2020 Conference. Congratulations Mourad Khayati, Alberto Lerner, Zakhar Tymchenko, and Philippe Cudré-Mauroux. Our paper is available here.
June 2020 – Best Paper Award
We won the Best Paper Award at the Swiss Data Science Conference (SDS 2020)! Thumbs up to our master's student Yasamin Eslahi, my two Ph.D. students Paolo Rosso and Akansha Bhardwaj, and our great collaborator Kurt Stockinger. Our paper is available here.
May 2020 – Paper published in TKDE
LBSN2Vec++: Heterogeneous Hypergraph Embedding for Location-Based Social Networks [PDF] published in IEEE Transactions on Knowledge and Data Engineering (TKDE).
April 2020 – WWW, SDS & IJCAI
- Beyond Triplets: Hyper-Relational Knowledge Graph Embedding for Link Prediction [PDF] [slides] and OpenCrowd: A Human-AI Collaborative Approach for Finding Social Influencers via Open-Ended Answers Aggregation [PDF] [video] presented at WWW 2020;
- Annotating Web Tables through Knowledge Bases: A Context-Based Approach accepted at SDS 2020, and Location Prediction over Sparse User Mobility Traces using RNNs accepted at IJCAI 2020 (acceptance: 12%). Congrats everyone!
March 2020 – IEEE Data Engineering Bulletin
Networking and Storage: The Next Computing Elements in Exascale Systems? [PDF] published in the March 2020 Data Management at Exascale issue of the IEEE Data Engineering Bulletin.
February 2020 – AAAI 2020
A Human-AI Loop Approach for Joint Keyword Discovery and Expectation Estimation in Micropost Event Detection [preprint pdf] presented at AAAI 2020 in New York.
January 2020 – The Web Conf. and CIDR
Two full research papers accepted at The Web Conf. (WWW2020) (acceptance: 19%):
- OpenCrowd: A Human-AI Collaborative Approach for Finding Social Influencers via Open-Ended Answers Aggregation [preprint pdf] and
- Beyond Triplets: Hyper-Relational Knowledge Graph Embedding for Link Prediction [preprint pdf].
And It Takes Two: Instrumenting the Interaction between In-Memory Databases and Solid-State Drives presented at CIDR 2020 [pdf] [slides].
December 2019 – BigData and VLDB
Four papers presented at IEEE BigData 2019:
- DAOC: Stable Clustering of Large Networks
- Revisiting Text and Knowledge Graph Joint Embeddings: The Amount of Shared Information Matters!
- CORAD: Correlation-Aware Compression of Massive Time Series using Sparse Dictionary Coding and
- Bridging the Gap between Community and Node Representations: Graph Embedding via Community Detection
- Mind the Gap: An Experimental Evaluation of Imputation of Missing Values Techniques in Time Series accepted at VLDB 2020
November 2019 – ICTAI, KAIS, AAAI
- Fusing Vector Space Models for Domain-Specific Applications presented at ICTAI 2019
- Scalable recovery of missing blocks in time series with high and low cross-correlations published in Knowledge and Information Systems and
- A Human-AI Loop Approach for Joint Keyword Discovery and Expectation Estimation in Micropost Event Detection accepted at AAAI 2020 [acceptance: 20%]
October 2019 – CIDR, TKDE & ISWC
- It Takes Two: Instrumenting the Interaction between In-Memory Databases and Solid-State Drives accepted at CIDR 2020
- Event Detection on Microposts: a Comparison of Four Approaches published in TKDE and
- Non-Parametric Class Completeness Estimators for Knowledge Graphs presented at ISWC 2019
August 2019 – ISWC, KDD & Graph Embeddings
- Non-Parametric Class Completeness Estimators for Knowledge Graphs accepted at ISWC 2019 [acceptance: 21%]
- NodeSketch: Highly-Efficient Graph Embeddings via Recursive Sketching presented at KDD 2019 [acceptance: 14%]
- And slides from our recent graph embeddings invited talk uploaded.
May 2019 – Best Paper Award & WWW2019
- Our 6 Neurons paper wins the Best Paper Award at AAMAS2019!
- And 3 papers presented at WWW2019 in San Francisco: Scalpel-CD for Debugging Noisy Training Data, Deep Active Learning for Link Prediction, and Hypergraph Embeddings for LBSNs.
April 2019 – KDD, TSC & AAMAS
- NodeSketch: Highly-Efficient Graph Embeddings via Recursive Sketching accepted at KDD 2019 [acceptance: 14%]
- Deadline-Aware Fair Scheduling for Multi-Tenant Crowd-Powered Systems published in ACM TSC and
- Playing Atari with Six Neurons nominated for best paper award at AAMAS 2019!
March 2019 – 3 New Journal Papers
Three new journal papers accepted:
February 2019 – Neurons, Clusters & Accelerated Queries
January 2019 – 3 research papers accepted at WWW2019
3 papers accepted at the Web Conf (WWW2019)! ActiveLink: Deep Active Learning for Link Prediction in Knowledge Graphs, Revisiting User Mobility and Social Relationships in LBSNs: A Hypergraph Embedding Approach, and Scalpel-CD: Leveraging Crowdsourcing and Deep Probabilistic Modeling for Debugging Noisy Training Data, all accepted as full research papers [acceptance: 18%]. PDFs coming soon. See you in San Francisco!
November 2018 – Paper Presentations
October 2018 – Two New Papers
September 2018 – Accepted Papers
Back with more monthly news: A Force-Directed Approach for Offline GPS Trajectory Map Matching accepted at SIGSPATIAL 2018. And Clubmark: a Parallel Isolation Framework for Benchmarking and Profiling Clustering Algorithms on NUMA Architectures accepted at ICDM 2018. Full info and PDFs coming soon…
September 2017 – Two New Papers & Two New Members
Two new papers accepted: HistoSketch: Fast Similarity-Preserving Sketching of Streaming Histograms (ICDM 2017 [acceptance 9%]) and Efficient Document Filtering Using Vector Space Topic Expansion (CIKM 2017 [acceptance 21%]). Also, Julien and Akansha join our lab; welcome!
June 2017 – XI Alumns Are Taking Over the (IT) World
April 2017 – Journal Papers & New Members
Two new journal papers accepted: Storing, Tracking, and Querying Provenance Linked Data (Transactions on Knowledge and Data Engineering) and Managing Big Interval Data with CINTIA the Checkpoint INTerval Array (Transactions on Big Data). Also, Rana, Inès and Giuse join our lab; welcome!
March 2017 – Ph.D. & Postdoc Positions in Big Data
We have a number of new open positions, both for Ph.D. Candidates and for Postdoctoral Researchers. Interested in building the future of Big Data & Data Science? Please apply!
January 2017 – CIDR & WWW 2017
Dependency-driven analytics: a compass for uncharted data oceans [pdf] [slides] [Morning Paper summary] presented at CIDR 2017. And Predicting the Success of Online Petitions Leveraging Multidimensional Time-Series accepted at WWW 2017 (acceptance rate 17%). Hurray!
December 2016 – PhD Award for Roman
Roman wins the JMCS PhD Award for his dissertation. Congrats!
November 2016 – Alisa & Paolo
October 2016 – VoldemortKG presented @ISWC
September 2016 – PrivCheck presented @UbiComp
August 2016 – POIsketch presented @IJCAI
April 2016 – IJCAI Paper, WWW Presentation, and Open Positions!
POIsketch Semantic Place Labeler accepted at IJCAI [preprint pdf], slides from our WWW presentation on Crowd Scheduling, and a number of open Ph.D. positions!
February 2016 – ERC Grant
January 2016 – New Paper Accepted @ JWS; MSc and PhD Awards!
Contextualized Ranking of Entity Types Based on Knowledge Graphs accepted for publication @ JWS, Special Issue on Knowledge Graphs [preprint pdf]. Djellel wins a prize for his PhD thesis on Hybrid Crowdsourcing Systems and Laura wins 2 prizes for her MSc on Anomaly Detection using Spark. Congrats!
December 2015 – New Paper Accepted @ WWW2016 and New Website
Scheduling Human Intelligence Tasks In Multi-Tenant Crowd-Powered Systems accepted for publication @ WWW2016 [acceptance: 16%] [pdf]. And we have a brand new website! Check it out here: exascale.info.
October 2015 – DiploCloud Accepted @ TKDE and 2 Big Events
DiploCloud (our RDF cloud solution) accepted for publication in TKDE [preprint pdf]! And two upcoming events not to be missed: the Swissnex Big Data Day in Fribourg and the International Conference on Web Engineering in Lugano.
September 2015 – Two Papers Accepted @ BigData
Two new papers accepted at IEEE BigData 2015: Online Anomaly Detection over Big Data Streams [pdf] and CINTIA: a Distributed, Low-Latency Index for Big Interval Data [pdf] [acceptance: 17%].
August 2015 – Continuous Evaluation & Keynote
Pooling-Based Continuous Evaluation of Information Retrieval Systems accepted for publication in Information Retrieval; Phil’s keynote @ ICDAR 2015 on Entity-Centric Data Management is now available.
June 2015 – Papers Accepted @ ISWC & VLDB; Best Thesis Award for Dingqi
SANAPHOR: Ontology-Based Coreference Resolution accepted @ ISWC 2015 (acceptance 22%); A Demonstration of TripleProv accepted @ VLDB 2015; and our newest member, Dingqi, wins the Best Thesis Award @ Samovar/Télécom SudParis–congrats!
May 2015 – Big Splash @ WWW2015 and Time Series
April 2015 – LDOW Papers and Big Bio Data
Uduvudu: an Adaptive UI Engine for Linked Data (pdf) and Fixing Linked Data by Context Disambiguation (pdf) accepted at LDOW 2015. And we make our first foray into Big Bio Data with 3DBG accepted for publication in Nucleic Acids Research (joint work w/ McGill University)!
March 2015 – Managing URIs on the Web of Data & Stonebraker's Award
A Comparison of Data Structures to Manage URIs on the Web of Data accepted at ESWC 2015 (acceptance 23%). And our friend from MIT Mike Stonebraker wins the Turing Award! Huge.
February 2015 – BenchPress & New Member
BenchPress (joint work w/ CMU & Microsoft) demo accepted at SIGMOD; Phil is Area Editor for JWS and Group Leader @ SIGMOD 2016; and Dr. Mourad Khayati joins us from UZH. Welcome!
January 2015 – Two Papers Accepted at WWW 2015
Best way to start 2015: two research papers accepted at the 24th International World Wide Web Conference [acceptance: 14%]! Executing Provenance-Enabled Queries over Web Data and The Dynamics of Micro-Task Crowdsourcing – The Case of Amazon MTurk. Full PDFs coming soon…
October 2014 – Fully Funded Postdoc Position
The eXascale Infolab (U. of Fribourg–Switzerland) is hiring! We are looking for a highly qualified postdoctoral researcher in Computer Science interested in designing and developing novel information infrastructures to manage big data. See full job description here.
September 2014 – Paper Accepted at BigData 2014
New Smarter Cities paper: TRISTAN: Real-Time Analytics on Massive Time Series Using Sparse Dictionary Compression accepted at IEEE BigData 2014! [acceptance rate: 18%]. Joint work w/ IBM Research. Details here: https://exascale.info/node/286
August 2014 – Papers Accepted at CIKM and VLDB
Our paper on fixing grammatical errors using large N-grams corpora and preposition ranking has been accepted at CIKM (IR track)! Also, TransactiveDB has been accepted at PVLDB. PDFs coming soon…
July 2014 – Scaling-up the Crowd Accepted at HCOMP
Scaling-up the Crowd: Micro-Task Pricing Schemes for Worker Retention and Latency Improvement accepted at HCOMP 2014. See you in Pittsburgh this Fall!
June 2014 – Verisign Distinguished Speaker Series and Journal Papers
Phil was invited for a second time to Verisign’s Distinguished Speaker Series. Also, two journal papers accepted: B-hist: Entity-centric search over personal web browsing history (Journal of Web Semantics) and the Entity Registry System (ERCIM News).
May 2014 – Big Data Slides
The slides from our CUSO Seminar on Big Data are now available online (see our blog post for details)!
April 2014 – WWW Presentations
Our WWW2014 presentations are now available online: NER, TripleProv, and Transactive Search. Also, we wrote a blog post with some of the highlights of the conference.
February 2014 – Transactive Search paper accepted at WWW... and new blog
Our Transactive Search paper has also been accepted at the International World Wide Web Conference (WWW2014 Web Science track)! Also, we now have a blog, check it out here: exascale.info/blog/.
January 2014 – Two Papers Accepted at WWW
Best way to start this new year: two XI papers accepted at WWW 2014! Effective Named Entity Recognition for Idiosyncratic Web Collections, and TripleProv: Efficient Processing of Lineage Queries over a Native RDF Store [acceptance rate 12.9%].
December 2013 – Verisign Labs Distinguished Speakers Series
Slides from Prof. Cudré-Mauroux’s talk on ERS (the Entity Registry System) are online on slideshare [Verisign Labs Distinguished Speakers Series, Internet Infrastructures Grant Winner]
October 2013 – OLTP Benchmark & TRank
August 2013 – Google Award, Big Data & Ruslan
We won a Google Faculty Research Award [Press release: French / German]. Also, our joint paper w/ Verisign on High Velocity Streams has been accepted at IEEE BigData 2013. Last but not least, Ruslan joins us from Yandex. Welcome!
July 2013 – Two Papers Accepted at ISWC
Two new papers accepted at ISWC 2013: TRank: Ranking Entity Types Using the Web of Data [Research Track] and NoSQL Databases for RDF: An Empirical Evaluation [Evaluation Track]. See you soon in Sydney!
June 2013 – New VLDB J. and Internet Computing Articles
Our new articles on Large-Scale Linked Data Integration Using Probabilistic Reasoning and Crowdsourcing and on Scalable Anomaly Detection for Smart City Infrastructure Networks [joint work w/ the IBM Research Smarter Cities Centre] have been accepted for publication by the VLDB Journal and by IEEE Internet Computing.
May 2013 – Heidelberg Laureate Forum & Crowdsourcing Tutorial
Both Gianluca and Martin got selected to attend the Heidelberg Laureate Forum. Thirty-eight Abel, Fields and Turing award winners will [also] be there!
Also, Gianluca’s tutorial slides on Crowdsourcing for the Semantic Web are available.
February 2013 – Paper Accepted at WWW2013
New paper accepted at WWW2013 (acceptance:15%): Pick-A-Crowd: Tell Me What You Like, and I’ll Tell You What to Do.
January 2013 – CIDR Presentation
Gianluca presents his CrowdQ paper at CIDR; Blog Post about it, ppt slides.
December 2012 – Paper Accepted at ECIR
New ScienceWise paper on Ontology-Based Word Sense Disambiguation for Scientific Literature accepted at ECIR 2013.
November 2012 – New XI Member
Dr. Martin Grund from HPI joins us, supported by a generous research grant from SAP. Welcome Martin!
September 2012 – Swiss Computer Science Prize
Prof. Cudré-Mauroux won the Swiss National Center 2001-2012 Research in Computer Science award. Link to the MIC event this September.
August 2012 – Upcoming Events
Two exciting events on the horizon: ISWC 2012 in Boston and DESWEB 2013 in Brisbane (Prof. Cudré-Mauroux co-chairs both of them). Don’t miss them!
July 2012 – Verisign Research Grant
We are ecstatic to have won one of the two global research grants from Verisign Inc. Press release here.
June 2012 – Swiss Transportation Android App
Living in Switzerland? Then don’t miss Roman’s Android app offering timetables for Swiss public transportation. Available for free on Google Play.
April 2012 – For a Few Papers More
Combining Inverted Indices and Structured Search for Ad-hoc Object Retrieval accepted at SIGIR. An overview of HYRISE in IEEE Data Eng. Bull. Downscaling Entity Registries with VUA and Verisign at DOWNSCALE. Graph Data Management Techniques for the large-scale deployment of Semantic Web technologies invited paper at GDM.
March 2012 – Presentations
Fresh arrival, slide decks on the menu: Entity-Centric Data Management (presentation given at MIT CSAIL, DERI, and IBM Research Smarter Cities Centre); tutorial on Semantic Search (by Demartini et al. at ECIR); tutorial on Scalability in Semantic Data Management (Schloss Dagstuhl).
February 2012 – OLTPBench
Want to benchmark relational or cloud databases? Here is our one-stop, open-source solution: OLTPBench. We hope you’ll like it as much as we do! This is joint work w/ Carlo Curino [Yahoo! Research] and Andy Pavlo [Brown].
January 2012 – ZenCrowd Accepted at WWW
Our latest foray into Online Entity territory was accepted at the World Wide Web conference! [acceptance rate: 12%]
ZenCrowd: Leveraging Probabilistic Reasoning and Crowdsourcing Techniques for Large-Scale Entity Linking
Abstract:
We tackle the problem of entity linking for large collections of online pages. Our system, ZenCrowd, identifies entities from natural language text using state-of-the-art techniques and automatically connects them to the Linked Open Data cloud. We show how one can take advantage of human intelligence to improve the quality of the links by dynamically generating micro-tasks on an online crowdsourcing platform. We develop a probabilistic framework to make sensible decisions about candidate links and to identify unreliable human workers. We evaluate ZenCrowd in a real deployment and show how a combination of both probabilistic reasoning and crowdsourcing techniques can significantly improve the quality of the links, while limiting the amount of work performed by the crowd.
Gianluca Demartini, Djellel Eddine Difallah, Philippe Cudré-Mauroux
21st International World Wide Web Conference (WWW2012), Lyon (France), April 16-20, 2012.
Links to the ZenCrowd page and to the pdf of the paper.