Cost-Driven Ontology-Based Data Access (Extended Version)

Davide Lanti, Guohui Xiao, and Diego Calvanese

Technical Report, arXiv.org e-Print archive. CoRR Technical Report arXiv:1707.06974, 2018. Available at https://arxiv.org/abs/1707.06974.

In ontology-based data access (OBDA), users are provided with a conceptual view of a (relational) data source that abstracts away details about data storage. This conceptual view is realized through an ontology that is connected to the data source through declarative mappings, and query answering is carried out by translating the user queries over the conceptual view into SQL queries over the data source. Standard translation techniques in OBDA try to transform the user query into a union of conjunctive queries (UCQ), following the heuristic argument that UCQs can be efficiently evaluated by modern relational database engines. In this work, we show that translating to UCQs is not always the best choice, and that, under certain conditions on the interplay between the ontology, the mappings, and the statistics of the data, alternative translations can be evaluated much more efficiently. To find the best translation, we devise a cost model together with a novel cardinality estimation that takes into account all such OBDA components. Our experiments confirm that (i) alternatives to the UCQ translation might produce queries that are orders of magnitude more efficient, and (ii) the cost model we propose is faithful to the actual query evaluation cost, and hence is well suited to select the best translation.


@techreport{Corr-2018-obda,
   title = "Cost-Driven Ontology-Based Data Access (Extended Version)",
   year = "2018",
   author = "Davide Lanti and Guohui Xiao and Diego Calvanese",
   institution = "arXiv.org e-Print archive",
   number = "arXiv:1707.06974",
   note = "Available at https://arxiv.org/abs/1707.06974",
}