Noise in everyday life is a significant and growing problem that, according to the World Health Organisation, is responsible for many healthy years of life lost [1]. And it is not just traffic, mobile devices or so-called personal music equipment that is responsible, but speech itself. Anyone unfortunate enough to spend much time at a public transport interchange or airport is assailed by loud, distorted and repetitive announcements. Keeping the volume high is the easy way to ensure that anyone listening gets the message, especially in noisy environments, but is it the smartest solution?
Researchers at the Institute of Computer Science at the Foundation for Research and Technology – Hellas (FORTH) have discovered that by manipulating speech before it is sent to the loudspeakers it is possible to preserve intelligibility at much lower volume levels than before. They propose a novel two-stage approach that redistributes the energy of the speech signal over time and frequency so as to protect the information in speech that matters most in noise. In the first stage, signal energy is redistributed over frequency in a way that follows what human talkers have been observed to do when faced with noise or when attempting to speak clearly, for example to listeners with hearing impairment. In the second stage, the algorithm boosts the energy of the parts of speech that are vulnerable to noise and reduces the energy of the most sonorant parts, which are already well protected from noise. The overall energy of the signal is therefore the same before and after manipulation, yet the manipulated signal is far more intelligible than the original.
A recent global evaluation, whose results will be published in summer, found very significant increases in the number of words identified correctly in manipulated sentences compared with those whose content was untouched.
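The two-stage idea described above can be illustrated with a minimal Python sketch. This is a simplified illustration under stated assumptions, not the FORTH algorithm itself: the function names (`shape_spectrum`, `compress_dynamics`, `match_energy`, `modify_speech`), the boosted frequency band, the boost amount, the smoothing window and the compression exponent are all hypothetical choices made for the example. It assumes a mono, non-silent signal held in a NumPy array.

```python
import numpy as np


def shape_spectrum(x, fs, band=(1000.0, 4000.0), boost_db=6.0):
    """Stage 1 (illustrative): move energy toward a mid-frequency band.

    The band and boost are assumptions, not values from the project.
    """
    spectrum = np.fft.rfft(x)
    freqs = np.fft.rfftfreq(len(x), d=1.0 / fs)
    gain = np.ones_like(freqs)
    in_band = (freqs >= band[0]) & (freqs <= band[1])
    gain[in_band] = 10.0 ** (boost_db / 20.0)  # dB to amplitude ratio
    return np.fft.irfft(spectrum * gain, n=len(x))


def compress_dynamics(x, fs, window_s=0.02, exponent=0.5, floor_ratio=1e-3):
    """Stage 2 (illustrative): boost weak segments, attenuate strong ones."""
    win = max(1, int(window_s * fs))
    kernel = np.ones(win) / win
    # Short-term RMS envelope from a moving average of the squared signal.
    envelope = np.sqrt(np.convolve(x ** 2, kernel, mode="same"))
    # Floor the envelope so near-silent stretches are not amplified without limit.
    envelope = np.maximum(envelope, floor_ratio * envelope.max())
    # With exponent < 1 the gain falls as the envelope rises (compression).
    return x * envelope ** (exponent - 1.0)


def match_energy(y, reference):
    """Rescale y so that its total energy equals that of the reference."""
    return y * np.sqrt(np.sum(reference ** 2) / np.sum(y ** 2))


def modify_speech(x, fs):
    """Apply both stages, then restore the original overall energy."""
    y = shape_spectrum(x, fs)
    y = compress_dynamics(y, fs)
    return match_energy(y, x)
```

Calling `modify_speech(x, fs)` on a signal `x` sampled at `fs` Hz returns a signal of the same length and the same total energy, with energy shifted from loud stretches toward quieter ones. The final rescaling step is what keeps the comparison fair: any gain in intelligibility then comes from where the energy is placed, not from playing the speech louder.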
In some cases the improvement in intelligibility is equivalent to being able to turn the volume down by more than 5 decibels. This research is part of a collaborative project called “The Listening Talker”, funded by the European Union's Future and Emerging Technologies programme and involving scientists in Spain, Greece, Sweden and the UK. The techniques developed in the Listening Talker project work with both natural and synthetic speech, and it is expected that, as well as public address systems, many future devices that produce speech output (e.g. mobile phones, radios, in-car navigation systems) will benefit from the project's findings.
The FORTH-ICS research team that worked on this project under the supervision of Professor Yannis Stylianou comprised Varvara Kandia (MSc), Dr. Elizabeth Godoy, and PhD candidate Maria Koutsogiannaki.
[1] World Health Organisation (2011). “Burden of disease from environmental noise: Quantification of healthy life years lost in Europe”.