Advanced Data Management Technologies (ADMT)

Academic Year:2017/18, 1st semester
Lecturer:Johann Gamper Lab instructor:Anton Dignös
Lectures:WE 14:00-16:00, FR 08:30-10:30, Room E220
Labs:TU 14:00-16:00, Room E220
Office hours:WE 13:00-14:00, FR 11:00-12:00 or email arrangement Timetable: here

Home | Schedule & Lecture Notes | Project | Assessment

Objectives: Introduce advanced data management technologies that go beyond traditional (relational) database management systems and train students to evaluate the advantages and disadvantages of such technologies in different application contexts.

Prerequisites: Students should be familiar with basic concepts in databases (including relational databases, SQL, and relational algebra) and algorithms, as well as having good pogramming skills.

Syllabus:

References: There is no single textbook that covers the entire course. The course material is collected from various textbooks and research papers. They are indicated on the slides.

More details are in the course presentation form.

Schedule & Lecture Notes

Week Day Date Topic Slides
1 WE 04.10.2017 Introduction: administration and organisation, the DB field, new trends in computing, motivation and need for advanced ADMT pdf
FR 06.10.2017 Basic concepts of BI and DW: business intelligence, from data to information, BI pyramid, BI key problems, DW definition, DW architecture, data integration, multidimensional model, OLAP vs. OLTP pdf
2 WE 11.10.2017 Building a DW: methodological framework, DW design, top-down vs. bottom-up design, DW life-cycle, DW project management, BI competence center pdf
FR 13.10.2017 DW requirements analysis and conceptual design: user needs, conceptual design, facts, dimensions, measures, dimensional fact model, building dimensional fact schemata pdf
3 WE 18.10.2017 DW logical design: logical design, multidimensional model, star schema, snowflake schema, fact/event types, additivity of measures, different types of measures, DW applications, DW implementation pdf
FR 20.10.2017 DW case studies: grocery store example, 4-step DW design process, surrogate keys, degenerate dimensions, junk dimension pdf
4 WE 25.10.2017 DW case studies (contd.): inventory management, order management, OncoNet, and MEDAN case studies, DW bus architecture, DW bus matrix, conformed dimensions, role-playing, fact normalization, multi-valued dimensions, bridge tables
5 No lectures
6 WE 08.11.2017 DW changing dimensions: slowly changing dimensions, overwrite old values, versioning of rows, timestamping of rows, rapidly changing dimensions pdf
FR 10.11.2017 We visit the SFScon 2017: https://www.sfscon.it
7 WE 15.11.2017 Extract-transform-load: ETL process, ETL subsystems, building dimensions and fact tables, extract data, computing deltas, data transformation, normalization, type conversion, loading data, data quality, data cleansing. pdf
FR 17.11.2017 SQL OLAP extensions: SQL query specification; processing SQL queries; GROUP BY extensions: ROLLUP; CUBE; GROUPING SETS; GROUPING_ID. pdf, pdf, sql
8 WE 22.11.2017 Research talk by Natalija Kozmina, University of Latvia: Research projects in Data Warehousing and Business Intelligence pdf
9 WE 29.11.2017 SQL OLAP extensions: analytic/window functions, moving window aggregates, ranking, percentiles, nested aggregates, densification pdf, sql
FR 01.12.2017 DW pre-aggregation and view maintenance: selecting and using pre-aggregation, lattice framework, hierarchies and the lattice framework, cost model, greedy algorithm, optimal cases of the greedy algorithm, incremental view maintenance pdf
10 WE 06.12.2017 Bitmap indexes: bitmap indexing, bitmap compression, run-length encoding, bit-sliced index, bitmap-encoded index, bitmapped join index pdf
11 WE 13.12.2017 NoSQL: motivation, new requirements, CAP theorem, BASE properties, NoSQL taxonomy, key-values stores, column stores, document stores, graph databases pdf
FR 15.12.2017 Excursion to GKN Sinter Metals
12 WE 20.12.2017 MapReduce: programming model, architecture, map and reduce functions, MR extensions and optimizations, execution, limitations, MR implementations, Hadoop pdf, [Li et al. 2014]
13 WE 10.01.2018 MR Execution and error handling: task scheduling in MR, job tracker, task tracker, locality, fault tolerance in MR, stragglers, speculative execution; MR Design Patterns: common pitfalls in designing MR algorithms, design patterns, categorization of design patterns, numerical summarizations, filtering patterns (filtering, Bloom filtering, top ten, distinct), join patterns (reduce side join, replicated join, composite join, Cartesian Product) pdf
FR 12.01.2018 MR Design Patterns: common pitfalls in designing MR algorithms, design patterns, categorization of design patterns, numerical summarizations, filtering patterns (filtering, Bloom filtering, top ten, distinct), join patterns (reduce side join, replicated join, composite join, Cartesian Product) pdf
14 MO 15.01.2018 Distributed Systems: data independence, advantages of distributed systems, networking infrastructure, client-server, peer-to-peer networks, structured and unstructured P2P systems, data replication and consistency, replication policies, failure management pdf
TU 16.01.2018 Distributed hash tables: motivation, linear hashing, centralized linear hashing, distributed linear hashing, consistent hashing, finger tables, Chord pdf
WE 17.01.2018 Main memory databases: definition, concurrency control and commit, data representation and compression, t-tree index, SAP HANA, Oracle TimesTen pdf
FR 19.01.2018 Wrap up

Project

The exercise part of the course consists in the elaboration of a project, which shall be done in groups of 2 students. The project consists in the development and implementation of a Data Warehouse.

The project is documented by a project report and concluded by a final presentation and demo.

Sample project reports: sample project 1, sample project 2.

Milestones

Assessment

The assessment of the course consists of two parts:

Both parts must be positive to pass the exam. The final grade is the weighted average between the two parts. A positive project is a pre-requisite to do the exam.

Here are some of the old exams: 19.09.2018, 30.01.2018, 10.07.2017, 31.01.2017, 02.02.2016, 15.09.2015, 06.07.2015, 28.01.2015, 10.06.2014, 10.02.2014.