Discovering Functional Dependencies for Multidimensional Design

Oscar Romero, Diego Calvanese, Alberto Abello, and Mariano Rodriguez-Muro

Proc. of the 12th ACM Int. Workshop on Data Warehousing and OLAP (DOLAP 2009). 2009.

It is now widely accepted that the data warehouse design task should be largely automated. Furthermore, the data warehouse conceptual schema must be structured according to the multidimensional model; as a consequence, the most common way to automatically look for subjects and dimensions of analysis is to discover functional dependencies over the data sources (as dimensions functionally depend on the fact). Most advanced methods for automating data warehouse design carry out this process on relational OLTP systems, assuming that an RDBMS is the most common kind of data source and taking a relational schema as the starting point. In contrast, we propose to rely on a conceptual representation of the domain of interest, formalized through a domain ontology expressed in the DL-Lite Description Logic. We propose an algorithm that discovers functional dependencies from the domain ontology by exploiting the inference capabilities of DL-Lite, thus fully taking into account the semantics of the domain. We also provide an evaluation of our approach in a real-world scenario.

