Nikolaus Augsten, PhD

Position: Assistant professor ("RTD") in the Database and Information Systems Group, Free University of Bozen-Bolzano, Italy.

PhD: I received my PhD from Aalborg University, Denmark, in 2008. My supervisor was Prof. Michael Böhlen (University of Zurich); the assessment committee consisted of Prof. Christian S. Jensen (Aalborg University, Denmark), Prof. Peter Buneman (University of Edinburgh, UK), and Prof. Stefano Ceri (Politecnico di Milano, Italy).

Address
Center for Database and Information Systems (DIS)
Faculty of Computer Science
Free University of Bozen-Bolzano
Dominikanerplatz 3 (office 2.19)
39100 Bozen-Bolzano, Italy

Tel: +39-0471-016111

email: augsten_AT_inf_DOT_unibz_DOT_it

I visited Prof. Alfons Kemper at Technische Universität München (TUM, Munich, Germany) in 2010/2011. My TUM website is here.

Main Research Interests

My current research interests include data-centric applications in database and information systems with a particular focus on approximate matching techniques for complex data structures, efficient index structures for distance computations, and similarity search in massive data collections. My research is triggered by problems that arise in concrete applications, for example, e-government and XML search engines.

Selected Publications

  1. S. Helmer, N. Augsten, M. Böhlen. Information-theoretic approaches for measuring the structural similarity of semistructured documents. To be published in: The VLDB Journal (VLDBJ). (GRIN*: A)
  2. B. Gufler, N. Augsten, A. Reiser, A. Kemper. Load Balancing in MapReduce Based on Scalable Cardinality Estimates. In Proceedings of the International Conference on Data Engineering (ICDE-12), Washington, DC, USA. April 2012. IEEE Computer Society. To appear. [PDF] (GRIN*: A)
  3. M. Pawlik, N. Augsten. RTED: A Robust Algorithm for the Tree Edit Distance. In PVLDB 5(4):334-345, 2011 (VLDB-12). [PDF] (GRIN*: A)
  4. N. Augsten, M. Böhlen, C. Dyreson, and J. Gamper. Windowed pq-Grams for Approximate Joins of Data-Centric XML. To be published in: The VLDB Journal (VLDBJ). DOI: 10.1007/s00778-011-0254-6. [PDF] (GRIN*: A)
  5. N. Augsten, D. Barbosa, M. Böhlen, and T. Palpanas. Efficient top-k approximate subtree matching in small memory. In IEEE Transactions on Knowledge and Data Engineering (TKDE). 23(8): 1123-1137 (2011). [download] (GRIN*: A)
  6. N. Augsten, D. Barbosa, M. Böhlen, and T. Palpanas. TASM: Top-k approximate subtree matching. In Proceedings of the International Conference on Data Engineering (ICDE-10), pages 353-364, Long Beach, California, USA, March 2010. IEEE Computer Society. [PDF][all downloads] (acceptance rate: 12.9%, GRIN*: A)
    ICDE 2010 Best Paper Award
  7. N. Augsten, M. Böhlen, and J. Gamper. The pq-Gram Distance between Ordered Labeled Trees. In ACM Transactions on Database Systems (TODS), 35(1):1-36, 2010 [PDF][all downloads] (GRIN*: A)
  8. N. Augsten, M. Böhlen, C. Dyreson, and J. Gamper. Approximate Joins for Data Centric XML. In Proceedings of the International Conference on Data Engineering (ICDE-08), Cancún, Mexico, April 2008. IEEE Computer Society. [PDF][all downloads] (acceptance rate: 12.1%, GRIN*: A)
  9. N. Augsten, M. Böhlen, and J. Gamper. An incrementally maintainable index for approximate lookups in hierarchical data. In Proceedings of the 32nd International Conference on Very Large Data Bases (VLDB-06), pages 247-258, Seoul, Korea, Sep. 2006. ACM Press. [PDF][all downloads] (acceptance rate: 13.2%, GRIN*: A)
  10. N. Augsten, M. Böhlen, and J. Gamper. Approximate matching of hierarchical data using pq-grams. In Proceedings of the 31st International Conference on Very Large Data Bases (VLDB-05), pages 141-152, Trondheim, Norway, Aug.-Sep. 2005. ACM Press. [PDF][all downloads] (acceptance rate: 16.5%, GRIN*: A)

*GRIN: Journal and conference ranking by the Italian Association of Computer Scientists.

Please feel free to download the source code of our implementations here.

Teaching

Courses

Database Management and Tuning (SS 2011)
This course will give an in-depth understanding of the features that off-the-shelf database management systems offer, in particular with respect to system performance. This knowledge is used to tune the database system and its environment: dimension the hardware for the database system, write efficient queries, set effective indexes, communicate with the database efficiently, and diagnose performance problems.
(Course Evaluation: SS 2010, SS 2011)
Scalable Similarity Search Algorithms (Technische Universität München) (WS 2010)
(Course Evaluation: WS 2010)
Similarity Search (WS 2009)
This course will discuss similarity search techniques for flat strings and hierarchical data (for example, XML). Selected methods will be presented, and their effectiveness and efficiency will be discussed. Filtering techniques that improve efficiency will be introduced. The students will implement similarity joins in a relational database management system.
(Course Evaluation: WS 2009)
Approximation: Theory and Algorithms (SS 2009) — old study plan; will not be held in 2010!
This course will discuss approximate matching techniques for flat strings and hierarchical data. Selected methods will be presented, and their effectiveness and efficiency will be discussed. Filtering techniques that improve efficiency will be introduced. The students will implement approximate matching techniques in a relational database management system.
(Course Evaluation: SS 2007, SS 2008, SS 2009)

Thesis Proposals