For the purpose of this tutorial, an Ontology will be considered as a
Conceptual Schema expressed in a suitable conceptual data model (i.e.,
an Ontology Language). Good conceptual data models put their
emphasis on the correct and semantically rich representation of complex properties and relations that may exist between documents.
They should allow for an abstract representation of data which
resembles the way they are actually perceived and used in the real
world, thus shortening (with respect to the more traditional data
models) the semantic gap between the domain and its representation.
Conceptual (or Ontology) modelling deals with the question of how to
describe in a declarative and reusable way the domain information of
an application, its relevant vocabulary, and how to constrain the use
of the data, by understanding what can be drawn from it. Recently, a
number of conceptual and ontology modelling languages have emerged as
de facto standards; in particular, we mention Entity/Relationship (ER)
for the relational data model, UML and ODMG for the object oriented
data model, and XML, RDF and DAML+OIL for the web semi-structured data
model. Still, many such languages do not have a formal semantics
based on logic, or reasoners built upon them to support the designer.
Not surprisingly, conceptual modelling tasks have always been in the
mainstream of Knowledge Representation (KR) research - see, for example,
the research on Ontology representation and design - and can now be
considered one of the main applications of KR languages and reasoning
techniques [BB02]. Description Logics (DLs) can be considered a
unifying formalism, since they allow the logical reconstruction and
the extension of representational tools such as object-oriented data
models (e.g., UML and ODMG), semantic data models (e.g.,
Entity/Relationship and ORM), and frame-based ontology languages (e.g.,
OIL and DAML+OIL)
[CLN98,CLN99,CCDGL01]. In addition, given the high complexity of the
modelling task when complex data is involved, in the semantic web
field there is a demand for more sophisticated and expressive
languages than for normal information systems. Again, DL research is
very active in providing expressive ontology languages to capture
various aspects of the information
(see, e.g., [AF99,FS99,BKW02]).
In this tutorial I will present examples using a generic conceptual
data model. I will point out how it generalises both the
object-oriented data model based on UML class diagrams and the
extended Entity-Relationship (EER) semantic data model, and how it is
closely related to OIL and DAML+OIL. The ontology language includes
taxonomic relations to state containment assertions between
entities and between relationships, with the possibility of specifying
additional covering and disjointness constraints. The
most interesting feature of the modelling language is the ability to
completely define entities and relationships as views over
other entities and relationships of the
ontology [CLN98]. The adopted view language is DLR
[CGL+98], a Description Logic over unary and n-ary relationships. DLR is an interesting decidable fragment of
first-order logic: among other things, inclusion dependencies with DLR
views can express (a) unary inclusion dependencies, (b) typed
inclusion dependencies without projection, (c) existence dependencies,
(d) exclusion dependencies, and (e) full key dependencies. DLR is
powerful enough to encode the full EER, the UML class diagrams and
most of DAML+OIL. An informal introduction to the properties of the
DLR Description Logic will be given.
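As an informal illustration of the taxonomic constraints mentioned above, the following minimal sketch reads containment, covering and disjointness as set-theoretic conditions on one finite interpretation. The entity names (Person, Student, Employee), the instance names and all function names are hypothetical and chosen only for this example; the sketch checks a single given instance and does not perform DLR reasoning.

```python
# Hypothetical finite interpretation: each entity name maps to its set of instances.
extension = {
    "Person": {"ann", "bob", "carl"},
    "Student": {"ann"},
    "Employee": {"bob", "carl"},
}

def isa(sub, sup, ext):
    """Containment assertion: every instance of `sub` is an instance of `sup`."""
    return ext[sub] <= ext[sup]

def covering(parent, children, ext):
    """Covering constraint: the children together exhaust the parent."""
    union = set().union(*(ext[c] for c in children))
    return ext[parent] <= union

def disjoint(children, ext):
    """Disjointness constraint: no instance belongs to two children."""
    seen = set()
    for c in children:
        if seen & ext[c]:
            return False
        seen |= ext[c]
    return True

def satisfies_schema(ext):
    """Check the (assumed) schema: Student and Employee partition Person."""
    children = ["Student", "Employee"]
    return (all(isa(c, "Person", ext) for c in children)
            and covering("Person", children, ext)
            and disjoint(children, ext))
```

Calling `satisfies_schema(extension)` checks the three kinds of constraint against the sample data; adding "ann" to Employee as well would violate disjointness, and adding a Person who is neither Student nor Employee would violate covering.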
Two additional extensions to the conceptual data model will also be
considered. The first one is with multidimensional aggregations -
that is, the conceptual data model is able to represent the structure
of aggregated entities and of multiply hierarchically
organised dimensions. The ability to represent aggregations at
the conceptual level is crucial in modelling structured documents in
data warehouses, in the semantic web and in digital libraries. The
second one allows for the representation of standard temporal
operators for temporal conceptual modelling and of a large class of
temporal integrity constraints, useful to model the dynamics in the
semantic web.
At the end of this first part, a demo of the i.com
tool [FN00,JQC+00] -
which implements the above conceptual data model as UML class diagrams
or EER schemas - will be given. i.com allows for the specification
of multiple EER (or UML) diagrams and inter- and intra-schema
constraints. Complete logical reasoning is employed by the tool using
an underlying DL inference engine to verify the specification, infer
implicit facts and stricter constraints, and manifest any
inconsistencies during the conceptual modelling phase.
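To give a rough idea of what inferring implicit facts and detecting inconsistencies can mean, here is a toy sketch restricted to ISA links and pairwise disjointness. It is not the i.com implementation and not a DL inference engine; the schema and all names are hypothetical, and the procedure only covers this very small fragment.

```python
from itertools import product

def transitive_closure(isa_edges):
    """Return all (sub, sup) pairs implied by the stated ISA links."""
    closure = set(isa_edges)
    changed = True
    while changed:
        changed = False
        # Iterate over a snapshot so the set can grow inside the loop.
        for (a, b), (c, d) in product(list(closure), repeat=2):
            if b == c and (a, d) not in closure:
                closure.add((a, d))
                changed = True
    return closure

def unsatisfiable_entities(entities, isa_edges, disjoint_pairs):
    """Entities forced to be empty: they fall under two disjoint entities."""
    closure = transitive_closure(isa_edges)
    supers = {e: {e} | {sup for (sub, sup) in closure if sub == e}
              for e in entities}
    return {e for e in entities
            for (x, y) in disjoint_pairs
            if x in supers[e] and y in supers[e]}

# Hypothetical schema used only for illustration.
entities = ["Person", "Student", "Employee", "WorkingStudent", "PhDStudent"]
isa_edges = {
    ("Student", "Person"),
    ("Employee", "Person"),
    ("WorkingStudent", "Student"),
    ("WorkingStudent", "Employee"),
    ("PhDStudent", "WorkingStudent"),
}
disjoint_pairs = {("Student", "Employee")}

implied = transitive_closure(isa_edges) - isa_edges
empty = unsatisfiable_entities(entities, isa_edges, disjoint_pairs)
```

Here `implied` holds the ISA links that were not stated explicitly (for instance, that WorkingStudent is a Person), and `empty` holds the entities that can never have instances because they sit below both Student and Employee, which the schema declares disjoint.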