https://www.inf.unibz.it/~calvanese/teaching/23-24-dpi/
Free University of Bozen-Bolzano
Faculty of Engineering
Master in Computing for Data Science
Home page of
Data Preparation and Integration (standalone course)
Data Preparation and Integration module of Data Curation
A.Y. 2023/2024
Course Description for Data Curation (of which
"Data Preparation and Integration" is the first module).
Objectives. The Data Preparation and Integration module addresses a
variety of problems related to the integration of heterogeneous data
sources. It overviews the main issues in data integration, notably handling
different forms of heterogeneity, and presents the general architecture of
data integration systems. Foundational techniques for data integration are
covered, such as data matching, schema matching and mapping, and query
processing in data integration. A specific data integration approach relying
on the technology of Virtual Knowledge Graphs and semantic mappings is
presented in detail. The integration of both relational data sources and
other types of data sources accessed through data federation technology
is considered. By attending the course, students will learn how to design
and build a comprehensive data integration solution, possibly exploiting
existing data access and data federation technologies.
Prerequisites. Knowledge of relational databases, as taught in an
introductory course at the BSc level. Basic knowledge of first-order logic, as
taught in a BSc course in logic or discrete mathematics. Knowledge of Java or
Python for the project part.
Teaching material
- Data Integration. AnHai Doan, Alon Halevy, Zachary Ives.
Morgan Kaufmann, 2012.
Available at University Library Bozen: 13-Textbook Collection (ST 270 D631).
- Data Integration (Course Slides).
Diego Calvanese. 2023.
The slides will be made available during the course and can be downloaded
from the MS Team for the Data Integration course.
- Exercises solved in class.
The exercises will be assigned for the exercise hours, and the solutions
will be made available in the following week on MS Team for the Data
Integration course.
- Office hours
- Schedule: The course is taught in the 1st semester, from 2 October 2023
to 19 January 2024.
- Lectures:
- Monday 8:00-10:00 (typically in Lecture Room A3.17)
- Tuesday 10:00-12:00 (typically in Lecture Room A3.17)
- Exercises: Tuesday 14:00-16:00 (typically in Lecture Room A3.17)
See also the on-line timetable page for changes.
- Exam dates
- Winter session: TBD
- Summer session: TBD
- Autumn session: TBD
- Rules for the exam
- The final mark will be based on:
- a project [50% of mark], and
- a final oral exam [50% of mark].
The final mark is computed as the average of the oral exam mark (50%)
and the project mark (50%).
- The oral exam consists of a discussion of the project, and an
examination about the topics covered in the course.
- The discussion of the project will take between 15 and 20
minutes, and will include showing the functionalities of the
developed data integration application. Projects developed by two
students in collaboration will be discussed by the two students
together.
- For the examination, each student will be assigned three topics,
among which the student should choose two for the discussion in
oral form, in roughly 10 minutes each. The three topics will be
assigned to the student roughly 15 minutes in advance of the
discussion, so that the student has time to prepare
herself/himself. The student can (and actually is encouraged to)
prepare during these 15 minutes written notes that can aid her/him
during the discussion. No additional written material can be used
during the preparation of the notes or during the discussion.
- The exam is considered passed when both marks are valid, i.e., in the
range 18-30. Otherwise, the individual valid mark (if any) is kept for
all 3 regular exam sessions, until the other part is also completed
with a valid mark. After the 3 regular exam sessions, all marks become
invalid.
- Guidelines for the Data Curation Final Project
(joint project between Data Preparation and Integration
and Data Profiling)
teaching page of Diego Calvanese
Last modified: Wednesday, 2-Oct-2024 10:47:04 CEST