Bachelor and Master Thesis Projects on Recommender Systems

Supervisor: Francesco Ricci
(October 3, 2008)

These are just some project proposals. Please contact me for updated information.
If you are a Bachelor student, you may also consider the following proposals for internships at www.ectrlsolutions.com.
Moreover, if you are interested in "A mobile peer-to-peer recommender system", please look at this master thesis project to be developed at The Mobile Life Centre, Stockholm University (Sweden).

Recommending Generalized Products in Collaborative Filtering (Master)

In the classical collaborative filtering (CF) recommendation approach, rating prediction is based on the similarity between the active user, to whom a recommendation has to be made, and the other users. This similarity is computed by comparing the ratings that the two users have given to a common set of products. In many cases, however, two users have co-rated only a small set of products. This is a major problem for the CF algorithm, because the reliability of the similarity assessment depends strongly on the number of co-rated products. Two users may not have rated exactly the same products, but they could have rated products that are similar. If the user-to-user similarity computation could exploit this, then the number of users whose similarity with the active user can be computed would increase. The first objective of this thesis is to investigate this issue and to determine an effective new similarity function that better exploits the available user profile information (ratings), with the final aim of increasing the prediction accuracy of the CF method. A second objective of this research is to identify extensions of the CF method that make it possible to recommend generalized products. Once a concept of product similarity has been defined, the products can be grouped according to this similarity (hence forming clusters of products), and the goal of the recommender would be to identify which clusters could be recommended to a user.
Recommending a cluster of products, e.g., products sharing some common characteristic, could help the recommendation process in several ways: the user can better understand the rationale of a recommendation, as the recommended cluster is characterized by a small set of common features; the user can better perceive the richness of the catalogue without extensive browsing; and the user is not pushed with a non-negotiable recommendation, but has room to choose the best option among a set of suggested products, i.e., those belonging to the cluster.
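
The co-rating problem described above can be illustrated with a minimal sketch. It assumes that user profiles are stored as dictionaries mapping product identifiers to ratings, and it uses the Pearson correlation restricted to co-rated products, which is one common choice in classical CF and not necessarily the function to be developed in the thesis. The function name and the sample data are hypothetical.

```python
from math import sqrt


def pearson_similarity(ratings_a, ratings_b):
    """Pearson correlation computed on co-rated products only.

    Returns a pair (similarity, overlap), where overlap is the number of
    co-rated products. A small overlap means the similarity is unreliable.
    """
    common = sorted(set(ratings_a) & set(ratings_b))
    n = len(common)
    if n < 2:
        # Fewer than two co-rated products: the correlation is undefined.
        return 0.0, n

    mean_a = sum(ratings_a[p] for p in common) / n
    mean_b = sum(ratings_b[p] for p in common) / n

    numerator = sum(
        (ratings_a[p] - mean_a) * (ratings_b[p] - mean_b) for p in common
    )
    norm_a = sqrt(sum((ratings_a[p] - mean_a) ** 2 for p in common))
    norm_b = sqrt(sum((ratings_b[p] - mean_b) ** 2 for p in common))
    if norm_a == 0 or norm_b == 0:
        # One of the users gave identical ratings to all co-rated products.
        return 0.0, n
    return numerator / (norm_a * norm_b), n


# Hypothetical profiles: product id -> rating on a 1-5 scale.
alice = {"p1": 5, "p2": 3, "p3": 4}
bob = {"p1": 4, "p2": 2, "p3": 3, "p4": 5}
carol = {"p5": 1}

print(pearson_similarity(alice, bob))    # three co-rated products
print(pearson_similarity(alice, carol))  # no co-rated products
```

In this sketch the pair (alice, carol) has no co-rated products, so no similarity can be computed even if p5 happened to be very similar to p1. This is the case that a similarity function exploiting product similarity would address.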


Evaluating Trip@dvice ranking (Bachelor)
Project to be developed in collaboration with www.ectrlsolutions.com

Trip@dvice is a recommendation methodology based on Case-Based Reasoning (Ricci et al., 2006). Trip@dvice exploits content features, user preferences and the choices made by users in the past to select and rank, in a personalized way, the most suitable tourist products. Systems relying on the Case-Based Reasoning methodology represent the knowledge necessary to support recommendation functionalities as a set of cases, where each case is a hierarchical XML document containing all the relevant information acquired during an interaction session of the user, such as the user's preferences and selected products. This technology has been applied in several operational web sites, including www.visiteurope.com and www.atl.biella.it.

In the past we performed only simple evaluations of the quality of the recommendations, and it would be interesting to measure how the quality of the ranking produced by the system is influenced by the quality and quantity of the case base. The goal of this project is to correlate various measures that characterize the case base with the quality of the recommendation. The case base can be measured with respect to, among others: the number of cases, the average similarity of cases, the average size of the cases, the diversity of cases, and the quality of cases (as assessed by an external judge). The quality of the recommendation can be measured with subjective measures (surveys) and with objective measures: time to complete the task, number of page views, rate of success, or the position of the selected items in the displayed ranked list.

Hence, to complete this project the student must identify the important measures that characterize the case base and develop procedures to extract these measures from the log data of the web application. Then these measures must be correlated with the performance of the recommender system by running some evaluation sessions. The outcome must be a report describing the state of the cases and the impact of the cases on the system performance. The report should be understandable by the service provider, so that the provider can gain insight into the behavior of the system, its users and their preferences. The project will use real data coming from two portals and will be developed in collaboration with www.ectrlsolutions.com.
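
One of the objective measures mentioned above, the position of the selected items in the displayed ranked list, could be summarized as sketched below. The sketch assumes a hypothetical log format in which each session is reduced to a pair (case base size, 1-based position of the selected item), and it uses the mean reciprocal rank as the summary statistic. Both the log format and the choice of statistic are assumptions made for illustration; the real log data of the portals will look different.

```python
from collections import defaultdict


def mean_reciprocal_rank(selected_positions):
    """Average of 1/position over sessions (positions are 1-based).

    The value is 1.0 when the selected item is always ranked first and
    decreases as selected items appear lower in the list.
    """
    if not selected_positions:
        raise ValueError("at least one session is required")
    return sum(1.0 / p for p in selected_positions) / len(selected_positions)


def quality_by_case_base_size(sessions):
    """Group sessions by case base size and compute the MRR of each group.

    `sessions` is an iterable of (case_base_size, selected_position) pairs,
    a hypothetical simplification of the web application log.
    """
    groups = defaultdict(list)
    for case_base_size, position in sessions:
        groups[case_base_size].append(position)
    return {size: mean_reciprocal_rank(pos) for size, pos in groups.items()}


# Hypothetical sessions recorded with two different case base sizes.
sessions = [(100, 4), (100, 2), (500, 1), (500, 2)]
print(quality_by_case_base_size(sessions))  # {100: 0.375, 500: 0.75}
```

The same grouping could be repeated for any other case base measure (average similarity, diversity, and so on) in place of the case base size.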


Wifi and RFID localization (Bachelor)

The goal of this project is to design and develop a software component supporting localization services using Wi-Fi signals and RFID technologies. GPS is a well-known technique used nowadays for providing location information to mobile users, but it suffers from several limitations; the most notable is that it is not available inside buildings, where the satellite signals cannot be detected by the mobile device. In this project the student must provide a technological overview and assessment of some recent technologies that exploit Wi-Fi signals transmitted by hot spots and RFID-tagged objects in order to recognize the user location and provide the required location-based service.

This localization functionality will be exploited in an eHealth information system that aims to provide mobile information access and activity support to a patient in a day-hospital context (Meran Hospital). The patient will use a mobile device to receive information about the next steps in his day-hospital workflow and to input various kinds of data, including an assessment of his psychological status.

In the software design and implementation part of this project the student must extend the Location API (JSR 179) of J2ME. Hence a good knowledge of the Java 2 Micro Edition environment is required. The component will be used by a larger application, so the student must discuss the API with the other members of the team. In order to test the developed technologies the student will also develop a system prototype that will work inside the university building, providing routing information and other location-based services to the students. The target device for testing the system is a Nokia S60 3rd edition phone.

Push Service for Mobile Devices (Bachelor and Master)

Most of the information services available nowadays are based on a pull model: the user is supposed to ask for the required information when he needs it. But some interesting opportunities for mobile information services come from a different model, where the service provider pushes information to the user when the provider believes that it would be beneficial for the user (and for the service provider) to obtain it. A typical example of this model is location-based advertising, where ads are pushed to the user as he approaches a shop or a bar, suggesting that he enter the shop and buy a recommended item. Push models have the advantage that the user does not have to request the information explicitly, but they have the disadvantage that the user may easily judge the unsolicited information as an intrusion in his life or as not appropriate for that particular moment.

The goal of this project is to design and test a methodology for learning the right user-context state for delivering an unsolicited request or message to a mobile user. The approach should be based on a model of the context of the user and on a model of the request. The relevant state features should be identified (these can be application-dependent). Then the student must identify and elaborate a learning method, such as a multi-armed bandit, that lets the system learn from the replies of the user the best situation for delivering the message. The student must develop a prototype, using these methodologies, for an eHealth application. The goal here is to obtain reliable information on the state of the patient, so the learning system must optimize both the quantity of information that the user returns and the reliability of this information. A major problem that the student must face is the difficulty, for the system, of identifying the user context without explicitly asking the user. So a context understanding module should be developed that guesses the user context from other data, such as the user position (GPS), whether the user is moving or not (accelerometer), the time of the day, the activity he is doing (e.g., accessing his calendar), whether he is making a phone call, etc.
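
As an illustration of the kind of learning method mentioned above, the following is a minimal sketch of an epsilon-greedy multi-armed bandit. It makes several simplifying assumptions that are not part of the proposal: each arm is a candidate context state in which the message could be delivered, the reward is 1 if the user replies and 0 otherwise, and the context names are hypothetical. A real solution would also have to model the request and the reliability of the returned information.

```python
import random


class EpsilonGreedyBandit:
    """Epsilon-greedy bandit over a fixed set of arms (context states)."""

    def __init__(self, arms, epsilon=0.1, rng=None):
        self.arms = list(arms)
        self.epsilon = epsilon
        self.rng = rng or random.Random()
        self.counts = {arm: 0 for arm in self.arms}    # pulls per arm
        self.values = {arm: 0.0 for arm in self.arms}  # mean reward per arm

    def select(self):
        """Explore a random arm with probability epsilon, else exploit."""
        if self.rng.random() < self.epsilon:
            return self.rng.choice(self.arms)
        return max(self.arms, key=lambda arm: self.values[arm])

    def update(self, arm, reward):
        """Incrementally update the mean reward observed for `arm`."""
        self.counts[arm] += 1
        self.values[arm] += (reward - self.values[arm]) / self.counts[arm]


# Hypothetical context states and a simulated user who replies more often
# when the request arrives while waiting than while walking or on a call.
reply_probability = {"walking": 0.2, "waiting": 0.8, "on_call": 0.05}
rng = random.Random(42)
bandit = EpsilonGreedyBandit(reply_probability, epsilon=0.1, rng=rng)

for _ in range(1000):
    context = bandit.select()
    replied = 1.0 if rng.random() < reply_probability[context] else 0.0
    bandit.update(context, replied)

print(bandit.values)  # estimated reply rate for each context state
```

In the real application the system cannot choose the context freely: it can only decide whether to deliver the message in the context it currently observes, which is closer to a contextual bandit formulation.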


Mobile Travel Planner (Bachelor)
Project to be developed in collaboration with www.ectrlsolutions.com

Trip@dvice is a recommendation methodology based on Case-Based Reasoning (Ricci et al., 2006). Trip@dvice exploits content features, user preferences and the choices made by users in the past to select and rank, in a personalized way, the most suitable tourist products. Systems relying on the Case-Based Reasoning methodology represent the knowledge necessary to support recommendation functionalities as a set of cases, where each case is a hierarchical XML document containing all the relevant information acquired during an interaction session of the user, such as the user's preferences and selected products. This technology has been applied in several operational web sites, including www.visiteurope.com and www.atl.biella.it.

MapMobyRek is a conversational mobile recommender system (J2ME-based) that integrates a preference acquisition technology based on “critiquing” with map visualization technologies. It can effectively and intuitively support travelers in finding their desired products and services. This system has been developed and tested in previous projects.

The goal of this project is to provide some of the Trip@dvice and MapMobyRek functionalities to a mobile user. The major constraint is that the client of this new service (VisitFinland) wants to offer the service to visitors of Finland on the largest possible number of mobile phone types. Hence the system designer decided to rely on a WAP/XHTML application architecture. This means that the new service will be implemented server-side (in a Java-based web application server) and the (thin) client must deal mainly with visualization. The student must understand both the Trip@dvice and MapMobyRek recommendation methodologies and software technologies, and work side-by-side with the main software architect to design this new mobile service, including its functionality and GUI. The most important functions to be developed are: visualization of the travel plan, completion of the travel plan, and revision of the travel plan in relation to new context-dependent events.


Rank Aggregation and Case-Based Recommender Systems (Master)

Case-based recommender systems rank items/cases using a similarity function that assigns a single numeric score to each case in the library, given a partially defined case as input, i.e., a query case. Research has focused on similarity learning, i.e., methods to adapt the similarity function to obtain a better retrieval set, that is, a set of top-ranked items that satisfy as much as possible the user preferences modeled in the query case. A seemingly unrelated line of research has investigated methods to combine/aggregate a collection of rankings on a set of common items (typically web pages) to produce a common ranking that is as close as possible to the individual rankings. In fact the two problems are closely related, and one can view similarity-based ranking as an instance of the general problem of rank aggregation. The goal of this research project is to experimentally compare similarity-based ranking and rank aggregation in some real recommendation problems. The hypothesis is that rank aggregation can increase user satisfaction, as the user can manually control the aggregation of the rankings produced by different conditions (features), and that better conversational systems can be built using this approach. The rank aggregation problem we are addressing is similar to that used to cope with the word association problem, where the goal is to retrieve (sort) the documents that are associated with the largest number of query keywords. To measure the quality of the different rankings we shall use the method proposed by Joachims, which relies on the comparison of the clicks received by items in two rankings presented to the user in a merged form.
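
To make the notion of rank aggregation concrete, the following sketch combines the rankings produced by individual features using the Borda count. The Borda count is only one simple, well-known aggregation rule, chosen here for illustration; the proposal does not prescribe a specific method. The item names and feature rankings are hypothetical.

```python
def borda_aggregate(rankings):
    """Aggregate several rankings of the same items with the Borda count.

    Each ranking is a list of items, best first. An item in position i
    (0-based) of a ranking with n items receives n - i - 1 points. Items
    are returned by decreasing total score; ties are broken by item name.
    """
    scores = {}
    for ranking in rankings:
        n = len(ranking)
        for position, item in enumerate(ranking):
            scores[item] = scores.get(item, 0) + (n - position - 1)
    return sorted(scores, key=lambda item: (-scores[item], item))


# Hypothetical rankings of three hotels, one ranking per query feature.
by_price = ["hotel_a", "hotel_b", "hotel_c"]
by_distance = ["hotel_b", "hotel_a", "hotel_c"]
by_category = ["hotel_a", "hotel_c", "hotel_b"]

print(borda_aggregate([by_price, by_distance, by_category]))
# ['hotel_a', 'hotel_b', 'hotel_c']
```

Because each feature contributes its own ranking, a user could in principle control the aggregation, for instance by removing the ranking of a feature he does not care about. This is harder to expose when all features are folded into a single similarity score.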

Generation of Semi-Synthetic Context Enriched Rating Data (Bachelor)

Recommender systems are powerful tools helping on-line users to overcome information overload. One way to improve the accuracy of the system is to exploit contextual information related to the user and the item. Contextual data may include information such as location, time, weather, needs and preferences, traffic condition, etc. Contextual data varies greatly according to the type of items.
The goal of this project is to design, develop and validate a component for generating semi-synthetic context-enriched rating data for a travel planning recommender system. The component would combine content-based and knowledge-based recommender system approaches to generate precise ratings and therefore enable the recommender system to achieve good accuracy and scalability. The users of the system will specify their preferences about the features of some Places of Interest (POIs) in a given context, and this information will be used to generate (predict) their ratings for POIs that they have not yet experienced, in different contexts. Here, the challenge is to create a meaningful way to model the dependency of ratings on as yet unseen contexts and to integrate expert knowledge into the prediction process.
The generated data will be used to bootstrap a POI recommender system for the city of Bolzano. This is an ongoing project, which aims to build a technology for real-time revision of the recommendation list in the tourism domain. Moreover, the data will also be used for benchmarking context-aware Collaborative Filtering (CF) systems.
The outcome of the thesis would be i) a context-sensitive rating prediction model for POIs, ii) a web-based system for collecting on-line user preferences, and iii) the analysis and evaluation of the data generation (rating prediction) procedure, comprising both a user study and off-line experiments on system scalability and flexibility.