Distance-based speech segregation in near-field virtual audio displays

Brungart, Douglas S
Simpson, Brian D
In tasks that require listeners to monitor two or more simultaneous talkers, substantial performance benefits can be achieved by spatially separating the competing speech messages with a virtual audio display. Although the advantages of spatial separation in azimuth are well documented, little is known about the performance benefits that can be achieved when competing speech signals are presented at different distances in the near field. In this experiment, head-related transfer functions (HRTFs) measured with a KEMAR manikin were used to simulate competing sound sources at distances ranging from 12 cm to 1 m along the interaural axis of the listener. One of the sound sources (the target) was a phrase from the Coordinate Response Measure (CRM) speech corpus, and the other sound source (the masker) was either a competing speech phrase from the CRM speech corpus or a speech-shaped noise signal. When speech-shaped noise was used as the masker, the intelligibility of the target phrase increased substantially only when the spatial separation in distance resulted in an improvement in signal-to-noise ratio (SNR) at one of the two ears. When a competing speech phrase was used as the masker, spatial separation in distance resulted in substantial improvements in the intelligibility of the target phrase even when the overall levels of the signals were normalized to eliminate any SNR advantages in the better ear, suggesting that binaural processing plays an important role in the segregation of competing speech messages in the near field. The results have important implications for the design of audio displays with multiple speech communication channels.
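The better-ear SNR and level-normalization ideas mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' procedure: the function names (`rms`, `snr_db`, `better_ear_snr_db`, `normalize_masker`) are hypothetical, and the sketch assumes that normalization means applying a single gain to both masker channels so that the SNR at the more favorable ear becomes 0 dB, which leaves the interaural level differences intact.

```python
import math

def rms(x):
    """Root-mean-square amplitude of a list of samples."""
    return math.sqrt(sum(v * v for v in x) / len(x))

def snr_db(target, masker):
    """Signal-to-noise ratio in dB at one ear."""
    return 20.0 * math.log10(rms(target) / rms(masker))

def better_ear_snr_db(target_lr, masker_lr):
    """Largest per-ear SNR; inputs are (left, right) tuples of sample lists."""
    return max(snr_db(t, m) for t, m in zip(target_lr, masker_lr))

def normalize_masker(target_lr, masker_lr):
    """Assumed normalization: one gain on both masker channels so that
    the better-ear SNR becomes 0 dB (interaural differences are kept)."""
    gain = 10.0 ** (better_ear_snr_db(target_lr, masker_lr) / 20.0)
    return tuple([gain * v for v in ch] for ch in masker_lr)
```

After this kind of normalization, any remaining intelligibility benefit from separation in distance cannot be attributed to a level advantage at either ear, which is the logic behind the speech-masker result described above.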