This page describes projects for BSc/MSc internships and theses in the area of databases, in particular temporal and spatio-temporal databases. A project typically consists of the development, implementation, and evaluation of algorithmic solutions. Some projects are embedded in a collaboration with external partners. For more information or new project proposals, please contact me at gamper_at_inf.unibz.it or Anton Dignös at dignoes_at_inf.unibz.it.
Temporal data is ubiquitous, and interest in it has grown in recent years, e.g., in web data analytics and streaming data. In contrast, the support for processing such data in DBMSs is limited: the SQL standard and most DBMSs offer support for storing temporal data, but the support for query processing is very limited.
Project 1: Analysis of Temporal Features in SQL and DBMSs. The aim of this project is to study the temporal features in SQL and the most important DBMSs, such as DB2, Oracle, SQL Server, Teradata, and Postgres. The analysis also includes an experimental part, which aims to create instances of temporal databases using the above systems and to perform an experimental evaluation of basic query processing tasks.
Aggregation is an important yet time-consuming operation in temporal databases. The following paper describes an expressive temporal aggregation operator, termed TMDA, which generalizes a variety of previously proposed temporal aggregation operators: M. Böhlen, J. Gamper, and C.S. Jensen. Multi-dimensional aggregation for temporal data. In Proc. of EDBT-06, pages 257-275, 2006. The following student projects aim at improving the current TMDA algorithm:
Project 2: TMDA for MIN/MAX aggregates. The current version of TMDA works only for the SUM/COUNT/AVG aggregates. TMDA is essentially a sweep line algorithm that scans the tuples in chronological order. The so-called endpoint tree keeps the current tuples (already encountered but not yet finished) in main memory. This structure is not suitable for the MIN/MAX aggregates. This project aims to extend TMDA, and in particular the endpoint tree, for MIN and MAX aggregates and possibly other aggregates.
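To illustrate why MIN/MAX need a different structure, note that a running SUM or COUNT can be updated by subtracting a tuple when it ends, whereas a running MIN cannot be "undone" in the same way. The following is a minimal Python sketch of a sweep over the tuple endpoints that keeps the current tuples in a min-heap with lazy deletion. It is not the TMDA algorithm and not the endpoint tree; the function name, the tuple layout (start, end, value), and the half-open validity intervals [start, end) with start < end are assumptions made for this example.

```python
import heapq


def instant_min(tuples):
    """Return (start, end, min_value) for each interval between consecutive
    endpoints during which at least one tuple is valid.

    Assumption: each tuple is (start, end, value), valid on [start, end),
    with start < end.
    """
    by_start = sorted(tuples, key=lambda t: t[0])
    endpoints = sorted({t for s, e, _ in by_start for t in (s, e)})

    active = []   # min-heap of (value, end) for tuples already encountered
    result = []
    i = 0
    for left, right in zip(endpoints, endpoints[1:]):
        # Add all tuples that start at the current sweep position.
        while i < len(by_start) and by_start[i][0] == left:
            _, end, value = by_start[i]
            heapq.heappush(active, (value, end))
            i += 1
        # Lazy deletion: finished tuples are removed only once they
        # surface at the top of the heap.
        while active and active[0][1] <= left:
            heapq.heappop(active)
        if active:
            result.append((left, right, active[0][0]))
    return result
```

For example, instant_min([(1, 5, 10), (3, 8, 20), (4, 6, 5)]) reports the minimum 10 on [1, 3) and [3, 4), 5 on [4, 5) and [5, 6), and 20 on [6, 8). A MAX variant can be obtained by pushing negated values.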
Project 3: Aggregating almost sorted data. TMDA assumes the input data to be chronologically ordered, hence the data is first sorted. In many applications, however, data already arrive in chronological or almost chronological order, e.g., streaming data. The aim of this work is to avoid sorting the input data: tuples that are out of order are skipped, and an approximate solution is computed immediately. If more time is available, the precision of the approximate result can be improved by integrating the out-of-order tuples.
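The idea of skipping out-of-order tuples can be sketched as follows. This is only an illustration under stated assumptions, not a proposed solution: tuples are (start, end, value), "in order" means that the start time is not smaller than the largest start time seen so far, and the function names are hypothetical. The in-order tuples can be fed directly to a sweep-based aggregation to obtain the approximate result; the skipped tuples are merged in later without re-sorting the in-order part.

```python
import heapq


def split_in_order(stream):
    """Split a stream of (start, end, value) tuples into the tuples that
    arrive in chronological order of their start time and the late ones."""
    in_order, late = [], []
    latest_start = None
    for tup in stream:
        if latest_start is None or tup[0] >= latest_start:
            in_order.append(tup)
            latest_start = tup[0]
        else:
            late.append(tup)   # out of order: skipped for the first pass
    return in_order, late


def integrate_late(in_order, late):
    """Merge the late tuples back in, yielding all tuples sorted by start.

    Only the (typically small) list of late tuples is sorted.
    """
    by_start = lambda t: t[0]
    return list(heapq.merge(in_order, sorted(late, key=by_start), key=by_start))
```

The approximate answer is then the aggregation over in_order, and the refined answer is the aggregation over integrate_late(in_order, late).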
An isochrone for a given query point, q, is defined as the set of all points on a road network from which q is reachable in a given timespan. Isochrones can be used in reachability or catchment analysis, which have many applications. For instance, schools, airports, stores, and public services are all interested in determining the area from which clients can be attracted, i.e., from which they can reach the point in a given time. A prototype system for the computation of isochrones is available here. The following projects extend the computation of isochrones in various directions.
Project 1: Incremental Computation of Isochrones. The aim is to develop an efficient algorithm to incrementally compute isochrones when the values of one or more input parameters change, such as the arrival time or the timespan. For instance, if the arrival time changes in steps of 5 minutes (7am, 7:05am, 7:10am, ...), one could analyse how an isochrone changes during a day, depending on the frequency of buses. Instead of computing the isochrone from scratch for each new parameter, the proposed algorithm should work incrementally.
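As a starting point, the incremental idea can be sketched for the simplest case, a growing timespan. The sketch below runs Dijkstra's algorithm backwards from q and keeps the priority queue between calls, so that a larger timespan continues the expansion instead of restarting it. It rests on strong simplifying assumptions that the project would have to lift: travel times are static (no bus schedules or arrival times), only network nodes are reported rather than points along edges, and the timespan never decreases between calls. The class and method names are hypothetical.

```python
import heapq


class IncrementalIsochrone:
    """Isochrone of query node q over directed edges (u, v, travel_time)."""

    def __init__(self, edges, q):
        # Index edges by their target, since the search runs towards q.
        self.incoming = {}
        for u, v, w in edges:
            self.incoming.setdefault(v, []).append((u, w))
        self.tentative = {q: 0}   # best known travel time to q
        self.settled = {}         # final travel times of expanded nodes
        self.heap = [(0, q)]      # frontier, kept between calls

    def extend(self, timespan):
        """Expand up to the given timespan and return {node: time to q}.

        Assumption: timespan is not smaller than in any previous call.
        """
        while self.heap and self.heap[0][0] <= timespan:
            d, node = heapq.heappop(self.heap)
            if node in self.settled:
                continue          # stale queue entry
            self.settled[node] = d
            for pred, w in self.incoming.get(node, []):
                nd = d + w
                if pred not in self.settled and nd < self.tentative.get(pred, float("inf")):
                    self.tentative[pred] = nd
                    heapq.heappush(self.heap, (nd, pred))
        return dict(self.settled)
```

Calling extend(5) and then extend(8) on the same object expands only the nodes whose travel time to q lies between 5 and 8 in the second call.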
Project 2: Average Reachability. When the public transport system is considered, the size of an isochrone, and hence the reachability of the query point, depends on the time of day. This project aims to compute the average reachability of a query point during a day. The work first requires a definition of average reachability and then the implementation of an efficient algorithm to compute it.
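One candidate definition, given here only as an assumption to make the task concrete, is the mean isochrone size over a set of sampled arrival times. The sketch below takes a hypothetical callable isochrone_size(t) that returns the size of the isochrone for arrival time t (e.g., the number of reachable nodes); how this size is measured and computed efficiently is exactly what the project has to work out.

```python
def average_reachability(isochrone_size, arrival_times):
    """Mean of isochrone_size(t) over the sampled arrival times.

    Assumption: isochrone_size is a user-supplied function; this naive
    version recomputes every isochrone independently.
    """
    times = list(arrival_times)
    if not times:
        raise ValueError("at least one arrival time is required")
    return sum(isochrone_size(t) for t in times) / len(times)
```

For arrival times sampled every 5 minutes, arrival_times could be range(7 * 60, 8 * 60, 5), with times expressed as minutes since midnight.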
Missing citation detection in the bibliographic database Scopus.
BSc/MSc Project, FUB Library, contacts: Dr. Karin Karlics,
The aim of this project is the development of a program that (semi-)automatically identifies missing references in the Scopus database. The program gets as input a scientific article (or a set of articles) and has to retrieve other articles in Scopus that cite it but whose citations are not recorded in the database.
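The matching step can be sketched as follows. This is a minimal illustration, not a description of the intended program: it does not access Scopus at all, and it assumes that the reference strings of candidate articles are already available as a dictionary. All function names, the input layout, and the 0.9 threshold are hypothetical.

```python
import re
from difflib import SequenceMatcher


def normalize(text):
    """Lower-case the text and keep only ASCII alphanumeric words."""
    return " ".join(re.findall(r"[a-z0-9]+", text.lower()))


def cites(reference, title, threshold=0.9):
    """Heuristic: does the reference string contain (most of) the title?"""
    ref, ttl = normalize(reference), normalize(title)
    if not ttl:
        return False
    if ttl in ref:
        return True
    matcher = SequenceMatcher(None, ref, ttl, autojunk=False)
    match = matcher.find_longest_match(0, len(ref), 0, len(ttl))
    return match.size / len(ttl) >= threshold


def candidate_missing_citations(title, recorded_citing_ids, articles):
    """Return ids of articles that appear to cite `title` but are not
    among the recorded citing articles.

    Assumption: `articles` maps an article id to its reference strings.
    """
    return sorted(
        art_id
        for art_id, refs in articles.items()
        if art_id not in recorded_citing_ids
        and any(cites(ref, title) for ref in refs)
    )
```

The result is a list of candidates for manual inspection, which corresponds to the semi-automatic mode mentioned above.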
Organization and configuration of a raster database for
operational integrated environmental monitoring. MSc Project,
EURAC research, contacts: Dr. Roberto Monsorno, +39 0471 055932,
In light of the ever-rising amount of data from an increasing number of earth observation satellites, we need an efficient data organization for our research and operational products at EURAC. We produce different products and maps for the Alpine area, such as biophysical parameters or snow maps.
Automatic pattern recognition and data integration model for UAV
imagery. MSc Project, EURAC research, contacts: Dr. Roberto
Monsorno, +39 0471 055932, firstname.lastname@example.org
Are you interested in the use and application of new technologies based on Unmanned Aerial Vehicles (UAV/drones)? Do you have strong programming skills, such as the ability to write algorithms? Do you want to develop feature recognition software that finds patterns in hundreds of images in order to create high-resolution spatial maps? The Institute of Applied Remote Sensing at EURAC research is looking for a student who wants to write his/her master's thesis in the field of computer vision and pattern recognition.
Aggregating Millions of Data per Second in Real-time.
MSc project, Würth Phönix, http://www.wuerth-phoenix.com,
contacts: Georg Kostner (email@example.com), Lina
The aim of this MSc thesis project is to develop an efficient data analysis platform based on Naiad, which is a distributed system for executing data-parallel, cyclic dataflow programs. It offers the high throughput of batch processors, the low latency of stream processors, and the ability to perform iterative and incremental computations.
Query and visualize GIS data. MSc project, Hydro Safety
The aim of the thesis is the implementation of a utility tool to query and visualize specific GIS data inside a report. The GIS data represent survey data collected during the inspection of galleries in hydropower plant infrastructures.
If you are interested in one of the above projects, please contact me (gamper_at_inf.unibz.it) or Anton Dignös (dignoes_at_inf.unibz.it) or any of the persons indicated in the project.