- Beyond Triplets: Hyper-Relational Knowledge Graph Embedding for Link Prediction [PDF] [slides] and OpenCrowd: A Human-AI Collaborative Approach for Finding Social Influencers via Open-Ended Answers Aggregation [PDF] [video] presented at WWW 2020;
- Annotating Web Tables through Knowledge Bases: A Context-Based Approach accepted at SDS 2020, and Location Prediction over Sparse User Mobility Traces using RNNs accepted at IJCAI 2020 (acceptance: 12%). Congrats everyone!
Networking and Storage: The Next Computing Elements in Exascale Systems? [PDF] published in the March 2020 Data Management at Exascale issue of the IEEE Data Engineering Bulletin.
Two full research papers accepted at The Web Conf. (WWW2020) (acceptance: 19%):
- OpenCrowd: A Human-AI Collaborative Approach for Finding Social Influencers via Open-Ended Answers Aggregation [preprint pdf] and
- Beyond Triplets: Hyper-Relational Knowledge Graph Embedding for Link Prediction [preprint pdf].
Four papers presented at IEEE BigData 2019:
- DAOC: Stable Clustering of Large Networks
- Revisiting Text and Knowledge Graph Joint Embeddings: The Amount of Shared Information Matters!
- CORAD: Correlation-Aware Compression of Massive Time Series using Sparse Dictionary Coding and
- Mind the Gap: An Experimental Evaluation of Imputation of Missing Values Techniques in Time Series accepted at VLDB 2020
- Fusing Vector Space Models for Domain-Specific Applications presented at ICTAI 2019
- Scalable recovery of missing blocks in time series with high and low cross-correlations published in Knowledge and Information Systems and
- A Human-AI Loop Approach for Joint Keyword Discovery and Expectation Estimation in Micropost Event Detection accepted at AAAI 2020 [acceptance: 20%]
- It Takes Two: Instrumenting the Interaction between In-Memory Databases and Solid-State Drives accepted at CIDR 2020
- Event Detection on Microposts: a Comparison of Four Approaches published in TKDE and
- Non-Parametric Class Completeness Estimators for Knowledge Graphs presented at ISWC 2019
- Non-Parametric Class Completeness Estimators for Knowledge Graphs accepted at ISWC 2019 [acceptance: 21%]
- NodeSketch: Highly-Efficient Graph Embeddings via Recursive Sketching presented at KDD 2019 [acceptance: 14%]
- And slides from our recent graph embeddings invited talk uploaded.
- NodeSketch: Highly-Efficient Graph Embeddings via Recursive Sketching accepted at KDD 2019 [acceptance: 14%]
- Deadline-Aware Fair Scheduling for Multi-Tenant Crowd-Powered Systems published in ACM TSC and
- Playing Atari with Six Neurons nominated for best paper award at AAMAS 2019!
Three new journal papers accepted:
3 papers accepted at the Web Conf (WWW2019)! ActiveLink: Deep Active Learning for Link Prediction in Knowledge Graphs, Revisiting User Mobility and Social Relationships in LBSNs: A Hypergraph Embedding Approach, and Scalpel-CD: Leveraging Crowdsourcing and Deep Probabilistic Modeling for Debugging Noisy Training Data, all accepted as full research papers [acceptance: 18%]. PDFs coming soon. See you in San Francisco!
Two new papers accepted: HistoSketch: Fast Similarity-Preserving Sketching of Streaming Histograms (ICDM 2017 [acceptance 9%]) and Efficient Document Filtering Using Vector Space Topic Expansion (CIKM 2017 [acceptance 21%]). Also, Julien and Akansha join our lab; welcome!
Two new journal papers accepted: Storing, Tracking, and Querying Provenance Linked Data (Transactions on Knowledge and Data Engineering) and Managing Big Interval Data with CINTIA the Checkpoint INTerval Array (Transactions on Big Data). Also, Rana, Inès and Giuse join our lab; welcome!
Two new papers accepted at IEEE BigData 2015: Online Anomaly Detection over Big Data Streams [pdf] and CINTIA: a Distributed, Low-Latency Index for Big Interval Data [pdf] [acceptance: 17%].
Pooling-Based Continuous Evaluation of Information Retrieval Systems accepted for publication in Information Retrieval; Phil’s keynote @ ICDAR 2015 on Entity-Centric Data Management is now available.
A Comparison of Data Structures to Manage URIs on the Web of Data accepted at ESWC 2015 (acceptance 23%). And our friend from MIT Mike Stonebraker wins the Turing Award! Huge.
Best way to start 2015: two research papers accepted at the 24th International World Wide Web Conference [acceptance: 14%]! Executing Provenance-Enabled Queries over Web Data and The Dynamics of Micro-Task Crowdsourcing – The Case of Amazon MTurk. Full PDFs coming soon…
The eXascale Infolab (U. of Fribourg–Switzerland) is hiring! We are looking for a highly qualified postdoctoral researcher in Computer Science interested in designing and developing novel information infrastructures to manage big data. See full job description here.
New Smarter Cities paper: TRISTAN: Real-Time Analytics on Massive Time Series Using Sparse Dictionary Compression accepted at IEEE BigData 2014! [acceptance rate: 18%]. Joint work w/ IBM Research. Details here: https://exascale.info/node/286
Our paper on fixing grammatical errors using large N-grams corpora and preposition ranking has been accepted at CIKM (IR track)! Also, TransactiveDB has been accepted at PVLDB. PDFs coming soon…
Scaling-up the Crowd: Micro-Task Pricing Schemes for Worker Retention and Latency Improvement accepted at HCOMP 2014. See you in Pittsburgh this Fall!
Best way to start this new year: two XI papers accepted at WWW 2014! Effective Named Entity Recognition for Idiosyncratic Web Collections, and TripleProv: Efficient Processing of Lineage Queries over a Native RDF Store [acceptance rate 12.9%].
Our new articles on Large-Scale Linked Data Integration Using Probabilistic Reasoning and Crowdsourcing and on Scalable Anomaly Detection for Smart City Infrastructure Networks [joint work w/ the IBM Research Smarter Cities Centre] have been accepted for publication by the VLDB Journal and by IEEE Internet Computing.
Also, Gianluca’s tutorial slides on Crowdsourcing for the Semantic Web are available.
New paper accepted at WWW2013 (acceptance: 15%): Pick-A-Crowd: Tell Me What You Like, and I’ll Tell You What to Do.
New ScienceWise paper on Ontology-Based Word Sense Disambiguation for Scientific Literature accepted at ECIR 2013.
We are ecstatic to have won one of the two global research grants from Verisign Inc. Press release here.
Living in Switzerland? Then don’t miss Roman’s Android app offering timetables for Swiss public transportation. Available for free on Google Play.
Combining Inverted Indices and Structured Search for Ad-hoc Object Retrieval accepted at SIGIR. An overview of HYRISE in IEEE Data Eng. Bull. Downscaling Entity Registries with VUA and Verisign at DOWNSCALE. Graph Data Management Techniques for the large-scale deployment of Semantic Web technologies invited paper at GDM.
Want to benchmark relational or cloud databases? Here is our one-stop, open-source solution: OLTPBench. We hope you’ll like it as much as we do! This is joint work w/ Carlo Curino [Yahoo! Research] and Andy Pavlo [Brown].
Our latest foray into Online Entity territory was accepted at the World Wide Web conference! [acceptance rate: 12%]
ZenCrowd: Leveraging Probabilistic Reasoning and Crowdsourcing Techniques for Large-Scale Entity Linking
We tackle the problem of entity linking for large collections of online pages. Our system, ZenCrowd, identifies entities from natural language text using state-of-the-art techniques and automatically connects them to the Linked Open Data cloud. We show how one can take advantage of human intelligence to improve the quality of the links by dynamically generating micro-tasks on an online crowdsourcing platform. We develop a probabilistic framework to make sensible decisions about candidate links and to identify unreliable human workers. We evaluate ZenCrowd in a real deployment and show how a combination of both probabilistic reasoning and crowdsourcing techniques can significantly improve the quality of the links, while limiting the amount of work performed by the crowd.
Gianluca Demartini, Djellel Eddine Difallah, Philippe Cudré-Mauroux
21st International World Wide Web Conference (WWW2012), Lyon (France), April 16-20, 2012.