Virtual Knowledge Graphs for Data Integration

Course at the
6th International Winter School on Big Data (BigDat 2020)
Ancona, Italy, 13-17 January 2020

Diego Calvanese

Research Centre for Knowledge and Data (KRDB)
Free University of Bozen-Bolzano

Department of Computing Science
Umeå University, Sweden

Slides of the course


Recently, semantic technologies have been successfully deployed to overcome the typical difficulties in accessing and integrating data stored in different kinds of legacy sources. In particular, knowledge graphs are being used as a mechanism to provide a uniform representation of heterogeneous information. In such graphs, data are represented in the RDF format and complemented by an ontology, and the resulting graph can be queried using the standard SPARQL language. The RDF graph is often obtained by materializing source data, following the traditional extract-transform-load workflow. Alternatively, the sources are declaratively mapped to the ontology, and the RDF graph is kept virtual. In this approach, usually called Virtual Knowledge Graphs (VKG), queries are answered by transforming them into queries over the sources, using sophisticated query transformation techniques. In this tutorial:
  1. we provide a general introduction to relevant semantic technologies;
  2. we illustrate the principles underlying the VKG approach to data integration, providing insights into its theoretical foundations, and describing well-established algorithms, techniques, and tools;
  3. we discuss relevant use cases of VKGs;
  4. we provide a hands-on experience with the state-of-the-art VKG system Ontop.
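
To make the idea of a virtual RDF graph more concrete, the following is a minimal sketch in Python of how a query over ontology terms could be unfolded into SQL through declarative mappings. It is only an illustration under simplifying assumptions, not the algorithm used by Ontop: the table `emp`, the terms `:Researcher` and `:name`, the IRI template, and the helper functions are all hypothetical, and ontology reasoning (query rewriting) is left out entirely.

```python
import sqlite3

# Hypothetical relational source; table and column names are illustrative only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE emp (id INTEGER PRIMARY KEY, full_name TEXT, role TEXT);
    INSERT INTO emp VALUES (1, 'Ada', 'researcher'), (2, 'Bob', 'admin');
""")

# Assumed IRI template used to build subject IRIs from primary keys.
IRI_TEMPLATE = "http://example.org/person/{}"

# Declarative mappings: each ontology term is defined by a SQL query
# over the source. Column s is the subject key, column o the object value.
MAPPINGS = {
    ":Researcher": "SELECT id AS s FROM emp WHERE role = 'researcher'",
    ":name": "SELECT id AS s, full_name AS o FROM emp",
}


def unfold(class_term, property_term):
    """Unfold the pattern { ?x a class_term ; property_term ?v } into SQL.

    The mapping queries are joined on the subject key, so no RDF triples
    are ever materialized.
    """
    return (
        f"SELECT c.s, p.o FROM ({MAPPINGS[class_term]}) AS c "
        f"JOIN ({MAPPINGS[property_term]}) AS p ON c.s = p.s "
        "ORDER BY c.s"
    )


def answer(class_term, property_term):
    """Run the unfolded SQL and build IRIs for the answers on the fly."""
    sql = unfold(class_term, property_term)
    return [(IRI_TEMPLATE.format(s), o) for s, o in conn.execute(sql)]


results = answer(":Researcher", ":name")
for iri, name in results:
    print(iri, name)
```

In this sketch the data never leaves the database as triples: the query is answered by the source itself, and IRIs are constructed only for the returned rows. A real VKG system additionally takes the ontology axioms into account and optimizes the generated SQL.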


  1. Motivation
  2. Virtual Knowledge Graphs for Data Access
  3. VKG Framework
  4. VKG Systems and Use Cases
  5. Query Answering over VKGs
  6. Recent Developments and Future Plans
  7. Conclusions
  8. Hands-on Exercises


Prerequisite Knowledge

Basic knowledge of relational databases, first-order logic, and data modeling, as typically taught in BSc-level Computer Science courses. A background in logics for knowledge representation, description logics, and complexity theory may be useful for establishing cross-connections, but is not required to follow the course.

Course Duration

Three lectures of 1.5 hours each.

Last modified: Friday, 31-Jan-2020 2:41:02 CET