- Editors working draft.
- Live Draft - version:

$Revision: 1.1 $ of $Date: 02 November 2005$ - Editors:
- Enrico Franconi, Free University of
Bozen-Bolzano <>

Sergio Tessaris, Free University of Bozen-Bolzano - This version:
- http://www.inf.unibz.it/krdb/w3c/sparql/
- PDF document of this version (better):
- PDF document.

Copyright © 2005 Free University of Bozen-Bolzano, Faculty of Computer Science, KRDB Research Centre.

This document reflects the best effort of the editors to formalise the SPARQL semantics and incorporates input from various members of the WG, but is not yet endorsed by the WG as a whole.

This is a live document and is subject to change without notice. See also the change log.

See also: DAWG Working Group

Definition 1 (RDF graphs)

We call RDFT the set of RDF terms, namely the union of all RDF URI
references with the set of all literals in their canonical representation and
with all the bnode names; the canonical representation of a literal is a chosen
representative of all the literals associated to the same value, if the literal is non
ill-typed, otherwise it is the literal itself. RDF graphs are defined in the standard
way.

Definition 2 (Pattern Solution)

A pattern solution (aka variable substitution) is a partial function from a finite
subset, possibly empty, of the query variables alphabet to RDFT; optionally, we
may consider pattern solutions mapping to RDFT without bnodes.

Definition 3 (Answer Set)

An answer set is a set of pattern solutions. An answer set is non-redundant if
it does not contain any redundant tuple.

A tuple S AS is redundant in an answer set AS if L(AS) - the logical translation of the answer set - is logically equivalent to L(AS \ {S}) - the logical translation of the answer set diminished by the tuple.

The logic translation L(AS) of an answer set AS is a closed existential predicate logic formula, where URIs and literals (in their canonical representation) are constants and blank nodes are existential variables, and the body is a conjunction of atomic formulae of some predicate R in correspondence with each solution SAS. This correspondence is built as follows:

- the arity of R is equal to the cardinality of the set V
^{A}, which is the union of the active domain of each S; - there is a bijective correspondence between the positions of the arguments
of R and the query variables in V
^{A}; - an atomic formula in the conjunction is built for each S by putting in each position i of the predicate R either the value of S for the variable corresponding to i - if S is defined for it - or a special constant otherwise.

Definition 4 (RDF Merge)

The RDF merge
<G_{1}...G_{n}> of a sequence of graphs <G_{1}...G_{n}> (i.e., a dataset)
is the ordered merge union of the graphs, where repeated bnodes are substituted
with fresh ones, by keeping the names of the bnodes coming first in the sequence
order. Note that, w.r.t. the standard definition of RDF merge, if any of the
graphs contains variables, then those are not renamed (i.e., variables are treated
as URIs).

Definition 5 (Basic Graph Pattern)

A basic graph pattern is a finite RDF graph where query variables are allowed
to label nodes and edges, in alternative to elements of RDFT. Variables may be
annotated with a set of elements of RDFT- to restrict the possible assignments
of variables.

Definition 6 (Answer Set of a BGP query)

Given an RDF graph G and a basic graph pattern query Q, the answer set (resp.
non-redundant answer set) of Q for G is the maximum set (resp. maximum
non-redundant set) of pattern solutions such that for each such S:

- the domain of S is restricted to the query variables in Q;
- the bnodes in the range of S can only be among those in G;
- if a variable is annotated with a set of elements of RDFT then S must map that variable to one of these elements;
- Q [S] - i.e., the application of the substitution S to Q - is a well-typed RDF graph;
- G (G Q) [S]

where is a semantically well defined logical implication relation among RDF graphs, namely either simple entailment, or RDF entailment, or RDFS entailment, or OWL-DL entailment, or OWL-Full entailment.

Lemma 1

- Both the answer set and the non-redundant answer set of a query for a graph are unique.
- An answer set without substitutions to bnodes is always non-redundant.
- In the case of simple, RDF, and RDFS entailment, the answer set of any query for a ground graph (i.e., a graph without bnodes) is always non-redundant.
- In the case of simple entailment, the answer set corresponds to all the
substitutions defined as follows:
- consider all the query variables in Q as fresh bnodes;
- consider all subgraph isomorphisms between Q and G, where bnodes may be mapped to any node;
- a substitution is the projection over the bnodes in Q that were query variables of the above isomorphisms.

Definition 7 (SPARQL Algebra - still informal)

A SPARQL query expression constructs an answer set from an RDF graph, by
combining basic steps of the form of basic query patterns - constructing answer sets
from the RDF graph - with higher level algebraic operators which transform answer
sets. Algebraic operators are:

Optionally, a Construct operator may at the end transform an answer set in an RDF graph.