Representing and Reasoning on XML Documents: A Description Logic Approach

Diego Calvanese, Giuseppe De Giacomo, and Maurizio Lenzerini

Journal of Logic and Computation, 9(3):295-318, 1999.

Recent proposals to improve the quality of interaction with the World Wide Web suggest considering the Web as a huge semistructured database, so that information retrieval can be supported by database querying. Under this view, it is important to represent the form of both the network and the documents placed in its nodes. However, current proposals do not pay sufficient attention to representing document structures and reasoning about them. In this paper, we address these problems by providing a framework where Document Type Definitions (DTDs) expressed in the eXtensible Markup Language (XML) are formalized in an expressive Description Logic equipped with sound and complete inference algorithms. We provide methods for verifying conformance of a document to a DTD in polynomial time, and structural equivalence of DTDs in worst-case deterministic exponential time, improving on previously known algorithms for this problem, which were double exponential. We also deal with parametric versions of conformance and structural equivalence, and investigate other forms of reasoning on DTDs. Finally, we show how to take advantage of the reasoning capabilities of our formalism in order to perform several optimization steps in answering queries posed to a document base.
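
To make the notion of conformance concrete, the following is a minimal sketch of checking a document against a toy DTD. It is not the Description Logic encoding the paper develops, and it makes no claim about the polynomial bound stated above. It assumes a hypothetical representation in which each element type maps to a regular expression over the names of its children, and it ignores text content and attributes. The names TOY_DTD, conforms, and document_conforms are illustrative only.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical toy DTD (an assumption for illustration): each element type
# maps to a regular expression over the sequence of its children's names,
# where every name is followed by a single space.
TOY_DTD = {
    "book": r"(title )(author )+(chapter )*",
    "title": r"",
    "author": r"",
    "chapter": r"(title )(para )*",
    "para": r"",
}


def conforms(element, dtd):
    """Return True if `element` and all of its descendants match `dtd`."""
    content_model = dtd.get(element.tag)
    if content_model is None:
        return False  # element type is not declared in the DTD
    # Encode the child sequence as space-terminated names, e.g. "title para ".
    children = "".join(child.tag + " " for child in element)
    if re.fullmatch(content_model, children) is None:
        return False  # children do not match the declared content model
    return all(conforms(child, dtd) for child in element)


def document_conforms(xml_text, dtd, root_type):
    """Parse `xml_text` and check it against `dtd` with root `root_type`."""
    root = ET.fromstring(xml_text)
    return root.tag == root_type and conforms(root, dtd)
```

For example, under these assumptions a book with a title, two authors, and one chapter conforms to TOY_DTD, while a book whose author precedes its title does not. Structural equivalence of two DTDs is a harder question than this per-document check, since it concerns all documents at once.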

