Mini-Projects of the ATDB course
This web page presents information related to the mini-projects of the ATDB course. Each group has two options: choose one of the mini-project proposals listed below, or come up with its own mini-project proposal.
Mini-project proposals:
- Queries on the histogram-based representation of moving objects probability distributions
Moving objects databases contain data about positions of moving objects. In many cases, this data is probabilistic. For example, traffic management applications need to predict the number of cars in the streets in the near future (e.g., in 15 minutes). Since the future speed and direction of cars are uncertain, it makes sense for a prediction algorithm to come up with a probability distribution for the number of cars.
An efficient way of representing probability distributions is based on histograms. For example, the histogram-based representation of a probability distribution for the number of cars on street X may look as follows: P = {([0;10], 0.2), ([11;20], 0.3), ([21;35], 0.5)}. An interesting type of query on histogram-based probability distributions is the probability query. These are queries of the form "Given a probability distribution, P, for the number of cars on street X, what is the probability that the number of cars is greater than 15?". Since histogram-based probability distributions are approximate, there are several types of answers to probability queries: pessimistic (lower bound), optimistic (upper bound), and weighted ("average").
The goal of this mini-project is to implement a data type for the histogram-based representation of probability distributions and to efficiently implement operations for pessimistic, optimistic, and weighted answers to probability queries.
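As an illustration only, the three answer types can be sketched in Python for the example query "number of cars greater than 15". The sketch rests on assumptions that the proposal itself does not fix: bucket bounds are inclusive integers, the pessimistic answer counts only buckets lying entirely above the threshold, the optimistic answer also counts buckets that partly overlap it, and the weighted answer assumes values are uniformly distributed inside a bucket. The function name `prob_greater_than` is hypothetical, and the actual mini-project would implement this as a SECONDO algebra module rather than in Python.

```python
from typing import List, Tuple

# A bucket is ((low, high), probability); bounds are inclusive integers (assumption).
Bucket = Tuple[Tuple[int, int], float]


def prob_greater_than(hist: List[Bucket], threshold: int) -> Tuple[float, float, float]:
    """Return (pessimistic, optimistic, weighted) answers to P(X > threshold)."""
    pessimistic = optimistic = weighted = 0.0
    for (low, high), p in hist:
        if low > threshold:
            # Every value in the bucket satisfies the predicate.
            pessimistic += p
            optimistic += p
            weighted += p
        elif high > threshold:
            # Only part of the bucket satisfies the predicate.
            optimistic += p
            satisfying = high - threshold      # integers threshold+1 .. high
            width = high - low + 1             # integers low .. high
            weighted += p * satisfying / width  # uniform-within-bucket assumption
        # Buckets with high <= threshold contribute nothing.
    return pessimistic, optimistic, weighted


# The example histogram from the text.
P = [((0, 10), 0.2), ((11, 20), 0.3), ((21, 35), 0.5)]
pess, opt, wgt = prob_greater_than(P, 15)
print(pess, opt, wgt)
```

Under these assumptions, only the bucket [21;35] counts toward the lower bound, the bucket [11;20] is added for the upper bound, and the weighted answer takes half of the probability of [11;20] because 5 of its 10 values exceed 15.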
Literature:
- I. Timko, C. E. Dyreson, and T. B. Pedersen. Extending OLAP with Probabilistic Measures. Submitted to the VLDB Journal.
- Evaluation of precision of the histogram-based representation of moving objects probability distributions
Moving objects databases contain data about positions of moving objects. In many cases, this data is probabilistic. For example, traffic management applications need to predict the number of cars in the streets in the near future (e.g., in 15 minutes). Since the future speed and direction of cars are uncertain, it makes sense for a prediction algorithm to come up with a probability distribution for the number of cars.
An efficient way of representing probability distributions is based on histograms. For example, the histogram-based representation of a probability distribution for the number of cars on street X may look as follows: P = {([0;10], 0.2), ([11;20], 0.3), ([21;35], 0.5)}. Since histogram-based probability distributions are approximate, it is important to evaluate their precision. This can be done by computing the Kullback-Leibler divergence between the "true" distribution and its histogram-based representation.
The goal of this mini-project is to implement a data type for the histogram-based representation of probability distributions and to efficiently implement operations for computing the Kullback-Leibler divergence between distributions.
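For orientation, the Kullback-Leibler divergence of an approximation Q from a "true" discrete distribution P is D(P || Q) = sum over x of P(x) * log(P(x) / Q(x)). The Python sketch below shows one possible way to compute it for a histogram. It assumes, beyond what the proposal states, that the true distribution is given as a mapping from integer car counts to probabilities and that a histogram spreads each bucket's probability uniformly over the integers in the bucket. The names `expand` and `kl_divergence` and the toy distribution are hypothetical, and a real implementation would live in a SECONDO algebra module.

```python
import math
from typing import Dict, List, Tuple

# A bucket is ((low, high), probability); bounds are inclusive integers (assumption).
Bucket = Tuple[Tuple[int, int], float]


def expand(hist: List[Bucket]) -> Dict[int, float]:
    """Spread each bucket's probability uniformly over its integer values."""
    q: Dict[int, float] = {}
    for (low, high), p in hist:
        width = high - low + 1
        for x in range(low, high + 1):
            q[x] = p / width
    return q


def kl_divergence(true_dist: Dict[int, float], hist: List[Bucket]) -> float:
    """D(P || Q) in nats, where Q is the expanded histogram."""
    q = expand(hist)
    total = 0.0
    for x, px in true_dist.items():
        if px == 0.0:
            continue  # terms with P(x) = 0 contribute nothing
        qx = q.get(x, 0.0)
        if qx == 0.0:
            return math.inf  # Q gives zero mass where P does not
        total += px * math.log(px / qx)
    return total


# Toy data for illustration only: a two-value "true" distribution
# approximated by a single bucket covering both values.
toy_true = {0: 0.7, 1: 0.3}
toy_hist = [((0, 1), 1.0)]
print(kl_divergence(toy_true, toy_hist))
```

A divergence of 0 means the histogram reproduces the true distribution exactly under the uniform-within-bucket assumption; larger values indicate a coarser approximation.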
Literature:
- I. Timko, C. E. Dyreson, and T. B. Pedersen. Extending OLAP with Probabilistic Measures. Submitted to the VLDB Journal.
- Wikipedia. Kullback-Leibler divergence.
SECONDO system
The common goal of the mini-projects of the MS course is to extend the data model and query language of the SECONDO database management system. The extension will take the form of a collection of related data types with associated operations. In SECONDO, such a collection is called an algebra module.
This is a list of important documentation about SECONDO (you can find more on the SECONDO homepage):
- Overview of SECONDO
- User Manual
- For programmers:
It is recommended to install SECONDO on your computer. Useful links:
How to run SECONDO in the computer lab:
- Start the SECONDO server:
  - Start a new shell.
  - Type: source /opt/secondo/.secondorc
  - Type: cd /opt/secondo/bin
  - Type: ./SecondoMonitor
  - After some messages, type: startup
- Start the query optimizer:
  - Start a new shell.
  - Type: source /opt/secondo/.secondorc
  - Type: cd /opt/secondo/Optimizer/
  - Type: ./StartOptServer
- Start the Java GUI:
  - Start a new shell.
  - Type: source /opt/secondo/.secondorc
  - Type: cd /opt/secondo/Javagui/
  - Type: ./sgui
