Advanced Data Management Technologies (ADMT)

Academic Year:2016/17, 1st semester
Lecturer:Johann Gamper Lab instructor:Anton Dignös
Lectures:TU, TH 08:30-10:30, Room E420
Labs:TU 14:00-16:00, Room E412
Office hours:TU, TH 13:00-14:00 or email arrangement

Home | Schedule & Lecture Notes | Project | Assessment

Objectives: Introduce advanced data management technologies that go beyond traditional (relational) database management systems and train students to evaluate the advantages and disadvantages of such technologies in different application contexts.

Prerequisites: Students should be familiar with basic concepts in databases (including relational databases, SQL, and relational algebra) and algorithms, as well as having good pogramming skills.

Syllabus:

References: There is no single textbook that covers the entire course. The course material is collected from various textbooks and research papers. They are indicated on the slides.

More details are in the course presentation form.

Schedule & Lecture Notes

1. TH 06.10.2016 Introduction: administration and organisation, the DB field, new trends in computing, motivation and need for advanced ADMT pdf
2. TU 11.10.2016 Basic concepts of BI and DW: business intelligence, from data to information, BI pyramid, BI key problems, DW definition, DW architecture, data integration, multidimensional model, OLAP vs. OLTP pdf
3. TH 13.10.2016 Building a DW: methodological framework, DW design, top-down vs. bottom-up design, DW life-cycle, DW project management, BI competence center pdf
4. MO 17.10.2016 DW requirements analysis and conceptual design: user needs, conceptual design, facts, dimensions, measures, dimensional fact model, building dimensional fact schemata pdf
5. TU 18.10.2016 DW logical design: logical design, multidimensional model, star schema, snowflake schema, fact/event types, additivity of measures, different types of measures, DW applications, DW implementation pdf
6. MO 24.10.2016 DW case studies: grocery store example, 4-step DW design process, surrogate keys, degenerate dimensions, junk dimension pdf
TU 25.10.2016 Lecture canceled
TH 27.10.2016 Lecture canceled
7. TH 03.11.2016 DW case studies (contd.): inventory management, order management, OncoNet, and MEDAN case studies, DW bus architecture, DW bus matrix, conformed dimensions, role-playing, fact normalization, multi-valued dimensions, bridge tables
8. TU 08.11.2016 Research talk by Natalija Kozmina, University of Latvia: Research projects in Data Warehousing and Business Intelligence pdf
9. TH 10.11.2016 DW changing dimensions: slowly changing dimensions, overwrite old values, versioning of rows, timestamping of rows, rapidly changing dimensions pdf
10. FR 11.11.2016 We visit the SFScon 16: https://www.sfscon.it
11. TU 15.11.2016 Extract-transform-load: ETL process, ETL subsystems, building dimensions and fact tables, extract data, computing deltas, data transformation, normalization, type conversion, loading data, data quality, data cleansing. pdf
12. TH 17.11.2016 SQL OLAP extensions: SQL query specification; processing SQL queries; GROUP BY extensions: ROLLUP; CUBE; GROUPING SETS; GROUPING_ID. pdf, pdf, sql
13. MO 21.11.2016 SQL OLAP extensions: analytic/window functions, moving window aggregates, ranking, percentiles, nested aggregates, densification pdf, sql
14. TU 22.11.2016 SQL OLAP extensions: (contd.)
15. MO 28.11.2016 GMD-join: Definition of GMD-Join, 2D cumulative aggregates, transformation into SQL, evaluation algorithms, transformation rules, incremental computation of aggregates. pdf [Akinde et al. 11] [Chatziantoniou et al. 01] [Akinde et al. 02] [sql]
16. TU 29.11.2016 DW pre-aggregation and view maintenance: selecting and using pre-aggregation, lattice framework, hierarchies and the lattice framework, cost model, greedy algorithm, optimal cases of the greedy algorithm, incremental view maintenance pdf
17. MO 05.12.2016 DW pre-aggregation and view maintenance: (contd.)
18. TU 06.12.2016 Bitmap indexes: bitmap indexing, bitmap compression, run-length encoding, bit-sliced index, bitmap-encoded index, bitmapped join index. pdf
19. TU 13.12.2016 NoSQL: motivation, new requirements, CAP theorem, BASE properties, NoSQL taxonomy, key-values stores, column stores, document stores, graph databases pdf
20. TH 15.12.2016 MapReduce: programming model, architecture, map and reduce functions, MR extensions and optimizations, execution, limitations, MR implementations, Hadoop pdf, [Li et al. 2014]
21. FR 16.12.2016 Excursion to GKN Sinter Metals
22. MO 19.12.2016 MR Execution and error handling: task scheduling in MR, job tracker, task tracker, locality, fault tolerance in MR, stragglers, speculative execution; MR Design Patterns: common pitfalls in designing MR algorithms, design patterns, categorization of design patterns, numerical summarizations, filtering patterns (filtering, Bloom filtering, top ten, distinct), join patterns (reduce side join, replicated join, composite join, Cartesian Product) pdf, pdf
23. TU 20.12.2016 Distributed Systems: data independence, advantages of distributed systems, networking infrastructure, client-server, peer-to-peer networks, structured and unstructured P2P systems, data replication and consistency, replication policies, failure management pdf
24. TU 10.01.2017 Distributed hash tables: motivation, linear hashing, centralized linear hashing, distributed linear hashing, consistent hashing, finger tables, Chord pdf
25. TH 12.01.2017 Main memory databases: definition, concurrency control and commit, data representation and compression, t-tree index, SAP HANA, Oracle TimesTen pdf
26. TU 17.01.2017 Wrap up

Project

The exercise part of the course consists in the elaboration of a project, which shall be done in groups of 2 students. The project consists in the development and implementation of a Data Warehouse.

The project is documented by a project report and concluded by a final presentation and demo.

Sample project reports: sample project 1, sample project 2.

Milestones

Assessment

The assessment of the course consists of two parts:

Both parts must be positive to pass the exam. The final grade is the weighted average between the two parts. A positive project is a pre-requisite to do the exam.

Here are some of the old exams: 02.02.2016, 15.09.2015, 06.07.2015, 28.01.2015, 10.06.2014, 10.02.2014.