How do you measure speech intelligibility?

2019-02-20 14:41

Starting Point

When speech is transmitted via communication devices or public address systems, the primary goal is optimum intelligibility of the reproduced signal. Everyone knows the situation of not being able to understand the other party well, for example at door intercoms or during announcements on public transport. Many influencing variables can play a role here and turn a clear speech signal into an incomprehensible mess.

The development and installation of such a system therefore raises a number of questions:

Which parameters influence speech intelligibility?
How can the intelligibility of a speech signal be measured?
How can the speech intelligibility of a transmission system be optimized?

Influencing Variables

From their own experience, most people already know the most important parameters that lead to a deterioration in speech intelligibility.

Too low, fluctuating or too high signal level
Loud background noises
Reverberation
Distorted signal
Limited frequency spectrum
Masking effects

Since the subjective perception of the listener often plays a decisive role in acoustics, it is important to be able to apply objective and reproducible evaluation criteria. When considering speech intelligibility, this is possible by measuring the Speech Transmission Index (STI), which is described in DIN EN 60268-16. By means of the STI, the speech transmission quality of a transmission channel can be determined and a prediction for speech intelligibility can be derived.
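To illustrate how a measured STI value can be turned into a prediction, the following minimal sketch maps a value to the qualitative categories that are commonly quoted for the STI. The function name `sti_rating` is hypothetical, and the category limits are an assumption based on the commonly cited scale; they should be checked against the edition of the standard that applies to your project.

```python
def sti_rating(sti: float) -> str:
    """Map an STI value (0..1) to a commonly cited qualitative category."""
    if not 0.0 <= sti <= 1.0:
        raise ValueError("STI must lie between 0 and 1")
    # Exclusive upper limits of each category (assumed, verify against the standard).
    limits = [(0.30, "bad"), (0.45, "poor"), (0.60, "fair"), (0.75, "good")]
    for upper, label in limits:
        if sti < upper:
            return label
    return "excellent"


if __name__ == "__main__":
    for value in (0.25, 0.50, 0.80):
        print(value, sti_rating(value))
```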

Description of the Measurement Method

The measurement of the Speech Transmission Index is based on the empirical finding that speech intelligibility is mainly determined by the depth of the intensity fluctuations (modulation) of the speech signal. In real speech, these fluctuations result from the acoustic separation of sentences, words and phonemes. The better this modulation is preserved on the way to the listener, the easier the speech signal is to understand.
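The idea of measuring modulation can be sketched in a few lines. The example below is a simplified illustration and not the procedure of the standard: it uses an idealized intensity envelope instead of a modulated noise carrier, and the function name `modulation_depth` is hypothetical. It estimates the modulation depth m at one modulation frequency and shows how stationary background noise of equal mean intensity (a signal-to-noise ratio of 0 dB) halves it.

```python
import numpy as np


def modulation_depth(intensity: np.ndarray, fs: float, f_mod: float) -> float:
    """Estimate the modulation depth m of an intensity envelope at f_mod.

    For I(t) = I0 * (1 + m * cos(2*pi*f_mod*t)), observed over a whole
    number of modulation periods, the returned value equals m.
    """
    t = np.arange(len(intensity)) / fs
    component = np.sum(intensity * np.exp(-2j * np.pi * f_mod * t))
    return float(2.0 * np.abs(component) / np.sum(intensity))


fs = 1000.0                       # sample rate of the envelope in Hz (assumed)
f_mod = 2.0                       # modulation frequency in Hz
t = np.arange(int(5 * fs)) / fs   # 5 s, i.e. exactly 10 modulation periods

clean = 1.0 + np.cos(2 * np.pi * f_mod * t)   # fully modulated envelope, m = 1
degraded = clean + 1.0                        # plus stationary noise at 0 dB SNR

m_clean = modulation_depth(clean, fs, f_mod)
m_degraded = modulation_depth(degraded, fs, f_mod)
print(f"m (clean) = {m_clean:.2f}, m (with noise) = {m_degraded:.2f}")
```

The noise does not fluctuate at the modulation frequency, so it only raises the mean intensity and thereby reduces the relative depth of the fluctuations.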

The following graphic shows a simplified STI measurement setup. On the left you can see the measurement of the microphone signal path and on the right the measurement of the loudspeaker.

The STI can be measured directly or indirectly. The two methods differ in the type of test signal and in their applicability. The direct measurement uses a speech-like noise that is modulated in 7 octave bands from 125 Hz to 8 kHz, each with 14 modulation frequencies in the range from 0.63 Hz to 12.5 Hz. This results in 7 × 14 = 98 measured values from which the STI can be calculated directly. With the indirect method, the impulse response of the transmission system is measured using a suitable test signal, and the STI is derived from it mathematically. In both variants, the result is a numeric value between 0 and 1 that provides direct information about the quality of speech intelligibility.
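The indirect method can be sketched as follows. This is a simplified illustration under stated assumptions, not an implementation of DIN EN 60268-16: the modulation transfer function is computed from the squared impulse response (Schroeder's relation), the octave-band filtering is assumed to have been done already, and the index is an unweighted mean without the band weighting, masking and threshold corrections of the standard. All function names are hypothetical, and an ideal exponential decay stands in for the seven band-filtered impulse responses.

```python
import numpy as np

OCTAVE_BANDS_HZ = [125, 250, 500, 1000, 2000, 4000, 8000]
MOD_FREQS_HZ = [0.63, 0.8, 1.0, 1.25, 1.6, 2.0, 2.5,
                3.15, 4.0, 5.0, 6.3, 8.0, 10.0, 12.5]


def mtf_from_impulse_response(h, fs, mod_freqs=MOD_FREQS_HZ):
    """m(F) = |sum(h^2 * exp(-j*2*pi*F*t))| / sum(h^2) for each F."""
    energy = np.asarray(h, dtype=float) ** 2
    t = np.arange(len(energy)) / fs
    kernel = np.exp(-2j * np.pi * np.outer(mod_freqs, t))
    return np.abs(kernel @ energy) / np.sum(energy)


def simplified_sti(m_matrix):
    """Unweighted mean transmission index over all matrix entries."""
    m = np.clip(np.asarray(m_matrix, dtype=float), 1e-6, 1.0 - 1e-6)
    apparent_snr_db = np.clip(10.0 * np.log10(m / (1.0 - m)), -15.0, 15.0)
    return float(np.mean((apparent_snr_db + 15.0) / 30.0))


def synthetic_decay(rt60_s, fs, duration_s=3.0):
    """Ideal exponential decay whose energy drops by 60 dB after rt60_s."""
    t = np.arange(int(duration_s * fs)) / fs
    return np.exp(-3.0 * np.log(10.0) * t / rt60_s)


fs = 2000.0  # sample rate of the synthetic responses in Hz (assumed)
band_responses = [synthetic_decay(1.0, fs) for _ in OCTAVE_BANDS_HZ]
m_matrix = np.vstack([mtf_from_impulse_response(h, fs) for h in band_responses])

print(m_matrix.shape)            # 7 bands x 14 modulation frequencies
print(simplified_sti(m_matrix))  # single value between 0 and 1
```

With real measurements, `band_responses` would hold the measured impulse response after filtering into each of the seven octave bands, so the rows of the matrix would differ from band to band.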

In addition, the Speech Transmission Index for Public Address systems (STIPA) was introduced as a shortened procedure that is calculated from only 14 instead of the 98 values of a direct measurement. Its purpose is to shorten the measurement duration, but its restrictions must be taken into account, for example when non-linear distortions occur or when intelligibility is to be estimated for female speakers. The Room Acoustical Speech Transmission Index (RASTI) still exists, but it is obsolete and should no longer be used.

Comparison of Indices

	STI	STIPA	RASTI
Octave bands	7	7	2
Modulation frequencies	14	2	4/5
Combinations	98	14	9
Advantages	high accuracy	high measuring speed	high measuring speed
Disadvantages	slightly increased measuring time	error-prone in case of impulse noises; sensitive to non-linear distortions	obsolete measurement method; error-prone in case of impulse noises; error-prone in case of disturbing noises containing tones; error-prone with compressed signals

As with any qualified measurement, it is important to work with calibrated and sufficiently accurate measuring equipment. The equipment has to be selected so that its own influence on the measurement result is minimized. Low-distortion measuring microphones and loudspeakers with an ideally flat frequency response are a prerequisite for reliable results. The measurement itself is then carried out either on a computer or with an appropriate analyzer.

Experience and Optimization

When optimizing communication systems, speech intelligibility depends strongly on the room and on the position of the listener. The positioning and orientation of loudspeakers and the targeted use of absorbers and diffusers to improve room acoustics are usually the most important considerations. However, even under controlled acoustic conditions in a low-reflection measuring room, measuring the STI can provide useful, device-specific insights. This applies especially to voice communication systems that are not designed for sound reinforcement in large rooms, where many other aspects often play a decisive role. For example, design specifications combined with a wide range of functions and a fixed budget often lead to products whose frequency response and distortion behavior are anything but ideal.

In such cases, STI measurements have shown us that speech intelligibility can be surprisingly good even when the standard parameters are unfavorable. The behavior at different signal-to-noise ratios can also be determined by playing back background noise typical of the intended application. In combination with other measurements, it is also possible to identify weak points or unit-to-unit variation in series production. For example, in the manual assembly process of an intercom, sporadic loudspeaker damage was detected that only had an audible negative effect at high playback levels. Furthermore, the effects of unfavorably parameterized digital signal processing steps or of changes to the device design, materials or components can be reproduced directly, and speech intelligibility can be optimized through targeted parameterization, variation of individual components or design measures.
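How the index reacts to different signal-to-noise ratios can be estimated before any measurement with a simple model. The sketch below is an assumption-laden illustration, not a substitute for a measurement: it treats a single band, models reverberation as an ideal exponential decay and the background noise as stationary, and uses the same simplified, unweighted index as a rough stand-in for the STI. The function name `sti_noise_and_reverb` and the chosen values are hypothetical.

```python
import numpy as np

MOD_FREQS_HZ = np.array([0.63, 0.8, 1.0, 1.25, 1.6, 2.0, 2.5,
                         3.15, 4.0, 5.0, 6.3, 8.0, 10.0, 12.5])


def sti_noise_and_reverb(snr_db: float, rt60_s: float) -> float:
    """Simplified index for stationary noise plus exponential reverberation."""
    # Reverberation: modulation reduction of an ideal exponential decay.
    m_reverb = 1.0 / np.sqrt(1.0 + (2 * np.pi * MOD_FREQS_HZ * rt60_s / 13.8) ** 2)
    # Stationary noise: the same reduction at every modulation frequency.
    m_noise = 1.0 / (1.0 + 10.0 ** (-snr_db / 10.0))
    m = np.clip(m_reverb * m_noise, 1e-6, 1.0 - 1e-6)
    apparent_snr_db = np.clip(10.0 * np.log10(m / (1.0 - m)), -15.0, 15.0)
    return float(np.mean((apparent_snr_db + 15.0) / 30.0))


# Sweep the signal-to-noise ratio for an assumed reverberation time of 0.5 s.
sweep = {snr: round(sti_noise_and_reverb(snr, rt60_s=0.5), 2)
         for snr in (-5, 0, 5, 10, 15, 20)}
for snr, value in sweep.items():
    print(f"SNR {snr:>3} dB -> index {value}")
```

In this model the index rises with the signal-to-noise ratio until the reverberation becomes the limiting factor, which is one reason why raising the playback level alone does not always help.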
