This talk provides an overview of the technology, development, and human aspects of cross-lingual, machine-mediated communication. The world today is emerging as a truly global society. One of the foremost challenges in such a society is the ability to communicate freely and effectively across linguistic and cultural barriers.
Overcoming these barriers is especially important in critical application environments such as health care and relief and peacekeeping missions. For instance, in the public health domain, a number of studies have demonstrated that good-quality human translation and interpretation can improve healthcare access and delivery; this includes helping people understand their contractual obligations and, above all, informed consent. In fact, federal and state laws mandate that healthcare providers and hospitals provide language access.
Unfortunately, human resources for in-person or phone-based interpretation are typically not easily available, tend to be financially prohibitive, or raise privacy issues. Advances in automatic multilingual speech recognition, translation, and synthesis technologies promise transformative possibilities: cost-effective, widely deployable, and always-available solutions for enabling and enhancing communication between people who do not share a common language. The first Speech-to-Speech (S2S) systems to enable such communication are already well under development.
In this talk, I will briefly discuss the design of an S2S system, its component technologies, their limitations, and highlights of research toward overcoming these limitations. I will touch on a variety of relevant aspects, including the linguistic efforts and human factors work needed in the design of the system.
Panayiotis (Panos) Georgiou is a Research Assistant Professor in Electrical Engineering at the University of Southern California. He received his B.A. and M.Eng. degrees with Honors from Cambridge University (Pembroke College), U.K., in 1996 and his M.Sc. and Ph.D. degrees from the University of Southern California in 1998 and 2002, respectively.
His interests span the fields of Human Social and Behavioral Signal Processing. He has published over 60 papers in the fields of statistical signal processing, alpha-stable distributions, speech and multimodal signal processing and interfaces, speech translation, language modeling, immersive sound processing, sound source localization, and speaker identification. He has been an investigator and co-PI on several federally funded projects, notably the DARPA Transtac "SpeechLinks" and the NSF (Large) "An Integrated Approach to Creating Enriched Speech Translation Systems". He is currently serving as guest editor of the Computer Speech and Language journal.
Panos has expertise in a diverse range of relevant topics, from Multimodal Signal Processing to User Modeling. His current emphasis is on speech-to-speech translation and on user sensing and analysis for behavioral signal processing, with a specific focus on observational methods and analytics for psychology.