Understanding semantic datasets is crucial to using them properly. Unfortunately, most published semantic datasets lack type information to some extent. For example, DBpedia entities typically have only ~64% of their types defined. However, some of the missing types can be inferred from other entities by analysing the properties they share. In addition, new types can be discovered by identifying groups of objects with similar properties.
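To make the idea of inferring a missing type from shared properties concrete, here is a minimal Python sketch. It is not the StaTIX method: it assumes a simple nearest-neighbour rule based on Jaccard similarity of property sets, and the names `jaccard`, `infer_types`, the `threshold` parameter, and the toy entities are all hypothetical.

```python
def jaccard(a, b):
    """Jaccard similarity of two sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def infer_types(properties, known_types, threshold=0.5):
    """Propose a type for each untyped entity from its most similar typed entity.

    properties:  entity name -> set of property names (hypothetical toy input)
    known_types: entity name -> type, for the entities that already have one
    threshold:   minimum similarity required to propose a type (an assumption)
    """
    inferred = {}
    for entity, props in properties.items():
        if entity in known_types:
            continue  # this entity already has a type
        best_type, best_score = None, 0.0
        for other, other_type in known_types.items():
            score = jaccard(props, properties.get(other, set()))
            if score > best_score:
                best_type, best_score = other_type, score
        if best_type is not None and best_score >= threshold:
            inferred[entity] = best_type
    return inferred


# Toy data: two typed entities and two entities with missing types.
properties = {
    "Berlin": {"population", "country", "mayor"},
    "Paris": {"population", "country", "mayor", "area"},
    "Einstein": {"birthDate", "birthPlace", "field"},
    "Curie": {"birthDate", "birthPlace", "field", "spouse"},
}
known_types = {"Berlin": "City", "Einstein": "Person"}

print(infer_types(properties, known_types))
# → {'Paris': 'City', 'Curie': 'Person'}
```

In this toy setting, Paris shares three of four properties with Berlin and none with Einstein, so it is proposed as a City. Discovering new types would instead require grouping the untyped entities themselves, for example by clustering, which this sketch does not attempt.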

In this project, the student will extend our Statistical Type Inference framework (StaTIX) with semantic type inference that takes into account the semantics of entity attributes and entailment rules. The project offers the opportunity to work on Big Data and to contribute to the Open Knowledge community by refining existing Linked Open Data.