European Masters Program in Language and Communication Technologies


Language and Communication Technologies Colloquia

The Colloquia will take place every other Thursday from December 2004 to June 2005, according to the calendar below. All seminars are held in English unless noted otherwise.

For more information, please contact Raffaella Bernardi or Paolo Dongilli.


December  January   February   March   April   May   June  

Date | Speaker | Affiliation | Title
December, 16th | Paolo Dongilli | KRDB, FUB | Rule-based part-of-speech tagging
Winter Break
January, 13th | Judith Knapp | EURAC | The application of computational linguistics tools for computer assisted language learning: Experiences with WordManager
January, 27th | Marcello Federico | ITC-IRST | The Statistical Approach to Machine Translation
February, 10th | Gianni Lazzari | ITC-IRST | Spoken Language technologies: from research to market
February, 24th | Emanuele Pianta | ITC-IRST | Natural Language Generation
March, 10th | Diego Giuliani and Fabio Brugnara | ITC-IRST | Automatic Speech Recognition
March, 24th | Bernardo Magnini | ITC-IRST | Web as a Corpus
April, 7th | Fabio Pianesi | ITC-IRST | User-centred design of an affective computing interaction paradigm for an adaptive multimedia mobile guide for museums
April, 21st | Luca Dini | CELI s.r.l. | Information Extraction: Market Demand and Technological Syncretism
May, 26th | Alessandro Lenci | Università di Pisa | Lexical Resources: design and acquisition
June, 9th | Guenther Neumann | LT-Lab DFKI | Cross-lingual Open-domain Question-Answering from Unstructured Texts
June, 16th | Fabio Pianesi | ITC-IRST | Ethnography of small groups for the design of a co-located multimodal system to support meetings



December ^


16th December, 16:00-17:00 - Faculty of Computer Science, FUB, Seminar Room (first floor left)

Rule-based part-of-speech tagging
Paolo Dongilli, KRDB, FUB


Part-of-speech tagging is the area of Computational Linguistics that deals with the automatic assignment of appropriate grammatical descriptors to words in a text. There are various approaches to tagging and usually statistical techniques have been more successful than rule-based methods. The aim of this seminar is to present a simple rule-based part-of-speech tagger showing how it automatically acquires its rules and tags with accuracy comparable to stochastic taggers.

January ^


13th January, 16:00-17:00 - Faculty of Computer Science, FUB, Seminar Room (first floor left)

The application of computational linguistics tools for computer assisted language learning: Experiences with WordManager
Judith Knapp, EURAC


Computer-assisted language learning (CALL) is a research field which explores the use of computational methods and techniques for language learning and teaching. Recently, an increasing number of language learning systems have been developed which adopt Computational Linguistics (CL) tools. I will start with a short review of how these tools are used and to what extent they have been successful. Then I will present the ELDIT program. The ELDIT project started at the European Academy of Bolzano in 1999. ELDIT is an electronic learners' dictionary for the Italian and German languages and also includes the texts for the exams in bilingualism. Interactive exercises and a tandem learning feature are planned for the future. ELDIT includes CL tools to provide educational content in an innovative and meaningful way. In particular, I will speak about WordManager, a reusable morphological database developed at the University of Basel. Thanks to WordManager we can, for instance, provide the entire inflection paradigm (declension or conjugation) for each dictionary entry. Realizing such features requires close co-operation between linguists, computational linguists, language teachers and computer scientists. I will conclude the talk by outlining some experiences with this interdisciplinary collaboration.


27th January, 16:00-17:00 - Faculty of Computer Science, FUB, Seminar Room (first floor left)

The Statistical Approach to Machine Translation
Marcello Federico, ITC-IRST


Machine Translation (MT) is one of the oldest and most ambitious challenges taken on by computer science. In this lecture, I will introduce the problem of MT and review the history and main approaches developed during the last 50 years. Then, I will focus on the currently most successful approach, which is based on statistical models developed in the early 1990s at IBM. Statistical MT exploits so-called word-alignment models, which can be automatically trained from parallel corpora, i.e., texts provided with human translations. Finally, I will give an overview of performance evaluation methods for MT and report results of recent international evaluation campaigns.

February ^


10th February, 16:00-17:00 - Faculty of Computer Science, FUB, Seminar Room (first floor left)

Spoken Language technologies: from research to market
Gianni Lazzari, ITC-IRST


Spoken language technologies are already used in very useful applications such as automatic telephone services, dictation machines, and data entry for medical and professional reporting. New technology is almost ready to tackle more challenging applications such as broadcast news transcription and spoken document access.

In the talk, some of the aforementioned applications will be presented. Moreover, the typical problems encountered and the methods used during the development of an application, together with the key roles played in the business chain, will be discussed.




24th February, 16:00-17:00 - Faculty of Computer Science, FUB, Seminar Room (first floor left)

Natural Language Generation
Emanuele Pianta, ITC-IRST


After a short overview of the main issues at stake in the field of automatic Natural Language Generation (NLG), we will introduce a more specific topic, namely how to enforce robustness in NLG systems. Robustness is usually not considered an issue for NLG. This is explained by the fact that, unlike what happens with natural language analysis systems, e.g., parsers, the input to NLG systems is supposed to be some formal and well-formed representation, such as a semantic representation or the content of a database or knowledge base. We will show that this assumption may be wrong for a generation component embedded in an interlingua-based machine translation system. We will illustrate how the robustness issue is tackled by XIG, an Interlingua-to-Italian generation component used in the NESPOLE! machine translation project.

March ^


10th March, 16:00-17:00 - Faculty of Computer Science, FUB, Seminar Room (first floor left)

Automatic Speech Recognition
Diego Giuliani, Fabio Brugnara, ITC-IRST


The seminar will give an overview of the different aspects of automatic speech recognition, from signal processing to probabilistic modeling and decoding. After posing the problem and motivating the statistical approach, the overall architecture of a speech recognition system will be presented. For each component, the basic elements will be described, so as to clarify its role in a complete speech recognition system.

The most popular methods adopted for modeling and decoding the speech signal will be briefly introduced, together with some elements of phonetics. In particular, some principles of acoustic modeling, based on Hidden Markov Models, and of statistical language modeling will be presented.




24th March, 16:00-17:00 - Faculty of Computer Science, FUB, Seminar Room (first floor left)

Web as a Corpus
Bernardo Magnini, ITC-IRST


The Web is an immense, multilingual, freely available corpus. As with other large new corpora, computational linguists have been stimulated by its presence.

Recently there has been a growing interest in techniques that take advantage of Web redundancy for automatically extracting linguistic information in order to model language variability. As an example, approaches have been proposed to collect, on a large scale, topic signatures, examples for word senses, entailment relations and paraphrases. In addition, the Web has been exploited as a huge repository of world knowledge: named entities, semantic relations among entities, facts about a certain topic are first mined and then organized and made available for further processing.

The seminar will give an overview of the main approaches that use the Web as a corpus for different aims in computational linguistics. Both positive experiences and open problems will be addressed.



April ^


7th April, 16:00-17:00 - Faculty of Computer Science, FUB, Seminar Room (first floor left)

User-centred design of an affective computing interaction paradigm for an adaptive multimedia mobile guide for museums
Fabio Pianesi, ITC-IRST


In this seminar, we will present a case study of the user-centered design of a personalized mobile guide. The design involved a team comprising graphic designers, engineers and psychologists. A first attitudinal study was performed to assess the impact of some adaptivity dimensions for a mobile guide in a museum setting, namely location awareness, the degree of control over follow-ups, content adaptation with respect to user interests, and content adaptation with respect to the history of interaction. The study involved forty subjects who were exposed to simulations of different situations in a museum setting and then asked to score the simulated systems along a number of dimensions. The scores were then correlated with some personality traits of the subjects, with the aim of assessing which of these traits influence the acceptance or rejection of the different dimensions of adaptivity. Then, a first prototype of a personalized guide was designed and implemented. The first stage of user evaluation revealed that the particular implementation of the degree-of-control-over-follow-ups dimension was misleading. A small user study was therefore run in order to inform the subsequent re-design phase. It used a technique called the Action-retrospection protocol, which consists of an interview with the subject while he or she is using the system. The re-design of the interface led to a completely new paradigm based on the broad notion of affective interaction. We posited that an interaction based on expressing an affective attitude toward the service provided by the system may improve the usability of an interface, in particular when, as in museums, the technology should not hinder the "real" experience. This paradigm required the design and testing of a new widget, called the like-o-meter. During the re-design phase, a number of other small studies using this technique were performed. Finally, a new prototype was implemented, and a larger user study has now been started.


21st April, 16:00-17:00 - Faculty of Computer Science, FUB, Seminar Room (first floor left)

Information Extraction: Market Demand and Technological Syncretism
Luca Dini, CELI s.r.l.


The talk will describe the basic goals of the wide range of technologies that are nowadays classified under the label of "Information Extraction". It will become evident that what characterises this technology is not a set of algorithms or techniques, but rather the market demand for the transition from unstructured to structured information, i.e., from textual documents to some relational representation of certain concepts. From this perspective, we will show how, in real-life applications, the traditional framework of finite state automata and finite state transducers is usually enriched by "external" technologies ranging from deep grammar processing to statistical methods.

May ^


26th May, 16:00-17:00 - Faculty of Computer Science, FUB, Seminar Room (first floor left)

Lexical Resources: design and acquisition
Alessandro Lenci, CNR, Pisa


In language technologies, the task of providing the basic semantic description of words is entrusted to lexical resources, which aim at making word content machine-understandable. Since meanings live and arise in linguistic contexts, it is necessary to take into account how semantic information emerges from the actual textual data, and how the latter contribute to meaning formation and change. Accordingly, computational lexicons cannot be conceived merely as repositories of semantic descriptions, but rather as, at most, a core set of meanings that need to be customized, tuned and adapted to different domains, applications and texts. The talk will discuss the impact of these issues on the design and development of lexical resources. Special emphasis will be put on the connections between computational lexicons, terminology repositories and ontologies, as well as on automatic methods to acquire lexical information from texts.


June ^


9th June, 16:00-17:00 - Faculty of Computer Science, FUB, Seminar Room (first floor left)

Cross-lingual Open-domain Question-Answering from Unstructured Texts
Guenther Neumann, LT-Lab DFKI


Open-domain Question-Answering (ODQA) is a very active research field in which methods and algorithms from different areas like Information Retrieval (IR), Information Extraction (IE) and Natural Language Processing (NLP) are combined in novel ways. ODQA-systems receive NL-questions as input (and not keywords), analyse huge sets of free texts and return exact answers as output (and not documents). Recently, research in cross-lingual ODQA has been emerging as a new topic, i.e., the development of ODQA-systems that receive NL-queries in one language (e.g., German) and extract exact answers from documents in another language (e.g., English). In this talk I will give a brief overview of the major scientific questions and challenges of cross-lingual ODQA. I will then describe the methods and technologies that have been developed at the LT-lab of the DFKI, viz. robust NL-query analysis in open domains, hybrid methods for the translation of NL-queries and query expansion, and strategies for the multi-dimensional annotation of large sources of unstructured texts.


16th June, 16:00-17:00 - Faculty of Computer Science, FUB, Seminar Room (first floor left)

Ethnography of small groups for the design of a co-located multimodal system to support meetings
Fabio Pianesi, ITC-IRST


In this seminar, we will present a case study of the user-centered design of a personalized mobile guide. The design involved a team comprising graphic designers, engineers and psychologists. A first attitudinal study was performed to assess the impact of some adaptivity dimensions for a mobile guide in a museum setting, namely location awareness, the degree of control over follow-ups, content adaptation with respect to user interests, and content adaptation with respect to the history of interaction. The study involved forty subjects who were exposed to simulations of different situations in a museum setting and then asked to score the simulated systems along a number of dimensions. The scores were then correlated with some personality traits of the subjects, with the aim of assessing which of these traits influence the acceptance or rejection of the different dimensions of adaptivity. Then, a first prototype of a personalized guide was designed and implemented. The first stage of user evaluation revealed that the particular implementation of the degree-of-control-over-follow-ups dimension was misleading. A small user study was therefore run in order to inform the subsequent re-design phase. It used a technique called the Action-retrospection protocol, which consists of an interview with the subject while he or she is using the system. The re-design of the interface led to a completely new paradigm based on the broad notion of affective interaction. We posited that an interaction based on expressing an affective attitude toward the service provided by the system may improve the usability of an interface, in particular when, as in museums, the technology should not hinder the "real" experience. This paradigm required the design and testing of a new widget, called the like-o-meter. During the re-design phase, a number of other small studies using this technique were performed. Finally, a new prototype was implemented, and a larger user study has now been started.