MOUNA: Mining Opinions to Unveil Neglected Arguments
The goal of this project is to diversify the search results for queries about any potentially controversial topic.
Examples include looking for different versions of the same breaking news, or any topic on which people can hold different viewpoints, such as "Greece bailout", "Obama's Second Term", "Abortion" or "human evolution".
Users seeking this type of knowledge are not necessarily interested in one specific way of looking at a given issue; they may instead want to learn about the topic and find all the different viewpoints on it.
Thus, it is important to focus on how best to produce a set of diversified results that covers different sentiments and arguments and shows the reasons behind each viewpoint.
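To make the idea of a diversified result set concrete, here is a minimal sketch in Python. It is not the method used in MOUNA: it assumes each result already carries a viewpoint label and a relevance score (both hypothetical inputs), and simply cycles through the viewpoints so that each one is represented before any viewpoint gets a second slot.

```python
from collections import defaultdict


def diversify(results, k):
    """Pick up to k results, cycling through viewpoints in turn.

    results: list of (doc_id, viewpoint, relevance) tuples.
    This tuple format is an assumption made for illustration.
    """
    by_view = defaultdict(list)
    for doc_id, viewpoint, relevance in results:
        by_view[viewpoint].append((relevance, doc_id))

    # Within each viewpoint, most relevant results come first.
    for group in by_view.values():
        group.sort(reverse=True)

    # Visit viewpoints in order of their best result's relevance.
    order = sorted(by_view, key=lambda v: by_view[v][0][0], reverse=True)

    picked = []
    rank = 0
    while len(picked) < k:
        added = False
        for viewpoint in order:
            group = by_view[viewpoint]
            if rank < len(group) and len(picked) < k:
                picked.append(group[rank][1])
                added = True
        if not added:  # every viewpoint is exhausted
            break
        rank += 1
    return picked


# Hypothetical results for one controversial query.
results = [
    ("a", "pro", 0.9),
    ("b", "pro", 0.8),
    ("c", "con", 0.7),
    ("d", "neutral", 0.4),
    ("e", "con", 0.3),
]
top3 = diversify(results, 3)
```

A ranking by relevance alone would return two "pro" results in the top three; the round-robin selection returns one result per viewpoint instead.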
Links: Talk at Politics, Elections and Data Workshop, PLEAD, CIKM2012, Demo
RARE: Reducing Antimicrobial Resistance in Bolzano
This project is funded by the Province of Bolzano. It addresses the problem of resistance to antimicrobial agents, also called antibiotics, which is a growing public health concern worldwide. When a person is infected with an antibiotic-resistant microbe, not only is the treatment of that patient more difficult, but the antibiotic-resistant microbe may also spread to other people. This problem calls for immediate action to define the challenges of antimicrobial resistance and to develop solutions that improve the quality of the therapies prescribed to patients. The goal of this project is to use data mining techniques to analyze the growing volume of available data about past therapies and to discover possible correlations among data instances that can be used to improve the effectiveness of antibiotics. The project is carried out in close collaboration with the Antimicrobial Management Program at Bolzano Hospital. It involves three main phases: (1) acquire knowledge about the data and understand the issues of antimicrobial resistance, (2) analyze patient and therapy information using data mining tools, and (3) offer potential solutions to improve the quality of antibiotic prescriptions.
K2: Knowledge Kaleidoscope
I was involved in the K2 project, which takes place at the Max Planck Institute for Informatics in Saarbruecken, Germany. With the proliferation of photo and video footage on the Web, a knowledge base would not be complete without multimodal data on individual entities (people, places, etc.) and important events (concerts, award ceremonies, soccer matches, etc.). While photos of celebrities are abundant on the Internet, photos are much harder to retrieve for less popular entities such as notable computer scientists or regionally interesting churches. Querying the entity names in image search engines yields large candidate lists, but these often have low precision and unsatisfactory recall. Moreover, even for more prominent targets, it is desirable to have a diverse collection of photos (e.g., from different time periods), some of which might be rare and difficult to locate using search engines. In some cases, the ambiguity of the entity name dilutes the search engine results. An example is the Berkeley professor and former ACM president David Patterson. None of the top-20 Google image or Bing image results (as of August 2009) show him; most show the governor of New York (whose name is actually David Paterson). An approach to overcome these problems is presented in our work: Gathering and Ranking Photos of Named Entities with High Precision, High Recall, and Diversity. It is based on knowledge-driven query expansions and weighted ensemble voting on the results.
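The combination of query expansion and weighted voting can be sketched roughly as follows. This is an illustration only, not the scoring scheme from the paper: the expanded queries, their weights, and the image identifiers are all hypothetical, and each image simply collects the weight of every expanded query that returned it.

```python
from collections import defaultdict


def weighted_vote(result_lists, weights):
    """Rank images by the summed weight of the queries that returned them.

    result_lists: expanded query -> list of image ids it returned.
    weights: expanded query -> vote weight (assumed to be given).
    """
    scores = defaultdict(float)
    for query, images in result_lists.items():
        for image in set(images):  # one vote per query and image
            scores[image] += weights[query]
    # Highest score first; ties broken by image id for a stable order.
    return sorted(scores, key=lambda image: (-scores[image], image))


# Hypothetical expansions of an ambiguous entity name.
result_lists = {
    "David Patterson": ["img1", "img2", "img3"],
    "David Patterson Berkeley": ["img2", "img4"],
    "David Patterson ACM": ["img2", "img4", "img5"],
}
weights = {
    "David Patterson": 0.2,           # ambiguous name alone: weak evidence
    "David Patterson Berkeley": 0.5,  # expansion with a known affiliation
    "David Patterson ACM": 0.3,
}
ranking = weighted_vote(result_lists, weights)
```

Images returned only by the ambiguous name query end up at the bottom of the ranking, while images confirmed by several knowledge-driven expansions rise to the top.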
SAPIR: Search In Audio Visual Content Using Peer-to-peer IR
I was involved in SAPIR, a European project that extends the power of web searches beyond centralized text and metadata searches to include distributed audio-visual content. Today, Web searches are dominated by search giants such as Google, Yahoo, or MSN that deploy a centralized approach to indexing and utilize text-only indexes enriched by page rank algorithms. Supporting real content-based, audio-visual search requires media-specific understanding and extremely high CPU utilization, which would not scale in today's centralized solutions. SAPIR aims at breaking this technological barrier by developing a large-scale, distributed P2P architecture that will make it possible to search audio-visual content using the query-by-example paradigm. Our vision is to conduct innovative research that leads to a technology where end-users are peers that can produce audio-visual content from their mobile devices. This content is indexed by super-peers across a scalable P2P network to enable content searches in real time, while respecting IPR and protecting against spam. To this end, SAPIR brings together experts in audio-visual content understanding in the areas of text, audio, image, video, and music analysis. A common framework for feature extraction from all media content is developed for similarity search and ranking across all supported media. To address scalability issues, we developed a P2P architecture where features can be extracted in one peer and pushed to an indexing peer. The P2P architecture provides a scalable indexing structure that can be used for multi-feature search.
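The split between a peer that extracts features and a peer that indexes them can be illustrated with a toy, single-process sketch. Nothing here reflects the actual SAPIR code: the class names, the two numeric features, and the distance function (a sum of per-feature Euclidean distances) are assumptions chosen to keep the example short, and the network transport between peers is replaced by a direct method call.

```python
import math


def extract_features(samples):
    """Toy feature extractor (assumption): two named features per item."""
    mean = sum(samples) / len(samples)
    spread = max(samples) - min(samples)
    return {"mean": [mean], "spread": [spread]}


class IndexingPeer:
    """Stores pushed features and answers multi-feature similarity queries."""

    def __init__(self):
        self.index = {}  # content_id -> {feature_name: vector}

    def push(self, content_id, features):
        self.index[content_id] = features

    def search(self, query_features, k=3):
        def distance(features):
            # Combine features by summing their Euclidean distances.
            return sum(
                math.dist(query_features[name], features[name])
                for name in query_features
            )

        ranked = sorted(
            self.index, key=lambda cid: (distance(self.index[cid]), cid)
        )
        return ranked[:k]


class ProducerPeer:
    """Extracts features locally and pushes only those to the indexing peer."""

    def __init__(self, indexing_peer):
        self.indexing_peer = indexing_peer

    def publish(self, content_id, samples):
        self.indexing_peer.push(content_id, extract_features(samples))


index = IndexingPeer()
phone = ProducerPeer(index)
phone.publish("clip-a", [1, 2, 3])
phone.publish("clip-b", [10, 20, 30])
phone.publish("clip-c", [2, 3, 4])

# Query by example: extract features from the example, then search.
nearest = index.search(extract_features([1, 2, 4]), k=2)
```

Only the extracted features travel to the indexing peer, so the costly media analysis stays on the producing peer while the index answers query-by-example searches over several features at once.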