Optimized Querying of Integrated Data over the Web

Andrea Calì and Diego Calvanese

Proc. of the IFIP WG8.1 Working Conference on Engineering Information Systems in the Internet Context (EISIC 2002). 2002.

Information integration is the problem of providing uniform access to multiple, heterogeneous data sources. The most common approach to this task, called global-as-view, consists in providing a global schema of the data, in which each relation is defined as a view over a set of data sources. Recent work deals with this problem in the case of limited source capabilities, where, in general, sources can be accessed only in ways that respect certain binding patterns for their attributes. In this case, the answer to a user query over the global schema cannot be computed by simply substituting the concepts appearing in the query with their definitions. Instead, it may require the evaluation of a suitable recursive Datalog program. In this paper we study the evaluation of conjunctive queries in the global-as-view approach with limited source capabilities. We first present an algorithm for optimizing query answering that takes into account the structure of the query together with the binding patterns in order to compute an optimized query plan. The optimization excludes from the query plan those sources that are not relevant to the answer. We then study online optimization of query answering by taking into account full inclusion dependencies and functional dependencies between sources. At a given step of the answering process, this optimization uses the dependencies together with the data retrieved so far to avoid unnecessary accesses to the sources.
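To illustrate why binding patterns can force a recursive evaluation, the following is a minimal sketch in Python, not the algorithm of the paper. It assumes two hypothetical sources, `s1` and `s2`, with made-up contents: `s1` has binding pattern `bf` (its first attribute must be bound on access) and `s2` has pattern `f` (no binding required). The names `SOURCES`, `access`, and `extract_all` are illustrative only. The sketch performs a naive fixpoint: constants retrieved by one access are reused as bindings for later accesses until nothing new is obtained.

```python
from itertools import product

# Hypothetical sources: name -> (binding pattern, stored tuples).
# 'b' marks an attribute that must be bound on access, 'f' a free one.
SOURCES = {
    "s1": ("bf", {("a", "b"), ("b", "c"), ("x", "y")}),
    "s2": ("f", {("a",)}),
}


def access(name, bound_values):
    """Simulate one source access: return the tuples matching the bound attributes."""
    pattern, tuples = SOURCES[name]
    bound_positions = [i for i, p in enumerate(pattern) if p == "b"]
    return {
        t
        for t in tuples
        if all(t[i] == v for i, v in zip(bound_positions, bound_values))
    }


def extract_all(initial_constants=()):
    """Naive fixpoint: access sources with known constants until nothing new appears."""
    known = set(initial_constants)
    retrieved = {name: set() for name in SOURCES}
    changed = True
    while changed:
        changed = False
        for name, (pattern, _) in SOURCES.items():
            n_bound = pattern.count("b")
            # Try every combination of known constants for the bound attributes.
            for binding in product(sorted(known), repeat=n_bound):
                for t in access(name, binding):
                    if t not in retrieved[name]:
                        retrieved[name].add(t)
                        known.update(t)  # new constants enable further accesses
                        changed = True
    return retrieved


result = extract_all()
```

In this toy setting, `s1` cannot be accessed at all until `s2` supplies the constant `a`; the tuple `("b", "c")` becomes reachable only after `("a", "b")` has been retrieved, and `("x", "y")` is never obtained because no access yields `x`. The sketch tries every known constant on every source, which is exactly the kind of unnecessary access that a query-aware plan is meant to avoid.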


@inproceedings{EISIC-2002,
   title = "Optimized Querying of Integrated Data over the Web",
   year = "2002",
   author = "Andrea Calì and Diego Calvanese",
   booktitle = "Proc. of the IFIP WG8.1 Working Conference on Engineering Information Systems in the Internet Context (EISIC 2002)",
   pages = "285--301",
   publisher = "Kluwer Academic Publishers",
}