Introduction

Human beings perceive sound in three dimensions. Sound localization depends on how the waves from a single source differ from each other as they arrive at the left and right ears. The head, torso, shoulders, and outer ears modify the sound reaching a listener’s ears, and this modification can be described by a complex response function: the Head-Related Transfer Function (HRTF).

HRTFs can be used to generate binaural sound. Theoretically, HRTFs contain all the information about the sound source’s location (its direction and distance from the listener). If properly measured and implemented, HRTFs can generate a “virtual acoustic environment”.

The study of HRTFs is a rapidly growing area with potential applications in virtual environments, auditory displays, the entertainment industry, human-computer interfaces for the visually impaired, aircraft warning systems, and many others.


Problems with HRTFs:

Measuring HRTFs can be expensive. A typical setup requires an anechoic chamber and high-quality audio equipment such as loudspeakers and headphones. To take this technology to the masses, generic HRTFs have been used, but they do not work as well as individualized HRTFs. Once measured, HRTFs are convolved with a sound to give it a direction; depending on the length of these filters, the cost of the computing equipment can rise significantly.
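Filter length drives much of this computational cost: direct convolution of an N-sample signal with an M-tap HRTF impulse response takes on the order of N×M operations, while FFT-based convolution takes on the order of (N+M) log(N+M). As an illustration only (this is not the lab’s code; the function name is hypothetical), a minimal NumPy sketch comparing the two approaches on a 128-tap filter like those discussed here:

```python
import numpy as np

def fft_convolve(x, h):
    """Linear convolution via the FFT.

    Zero-padding both signals to the full output length makes the
    circular convolution of the FFT equal the linear convolution.
    """
    n = len(x) + len(h) - 1
    nfft = 1 << (n - 1).bit_length()       # next power of two >= n
    X = np.fft.rfft(x, nfft)
    H = np.fft.rfft(h, nfft)
    return np.fft.irfft(X * H, nfft)[:n]

rng = np.random.default_rng(0)
signal = rng.standard_normal(4096)         # placeholder audio block
hrir = rng.standard_normal(128)            # placeholder 128-tap impulse response

direct = np.convolve(signal, hrir)         # O(N*M) time-domain convolution
fast = fft_convolve(signal, hrir)          # O((N+M) log(N+M)) FFT convolution
print(np.allclose(direct, fast))           # the two methods agree
```

For long signals processed in real time, the same idea is normally applied block by block (overlap-add), but the cost comparison is the same.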

There is still much to be learned about HRTFs. Even the most carefully taken measurements suffer from the “cones of confusion” and “inside the head” effects, and range cues are poorly understood. It is possible to add a room transfer function to give the effect of distance, but such filters are not flexible; for example, one cannot obtain a “whisper” effect using a room transfer function.

Sound Localization Research at the FIU DSP Lab:

The DSP Lab at FIU has an AuSIM HeadZap system available to measure individual HRTFs. This system measures 128-point impulse responses from sounds generated with Golay codes. We measure 72 pairs of HRTFs for every individual (12 azimuths at 6 elevations), which can be analyzed to find out how the HRTF changes from person to person. Data from 40 individuals are currently being analyzed, and we have found some interesting patterns (see the papers below).
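The Golay-code technique works because a complementary pair of codes has autocorrelations that sum to a perfect impulse, so summing the cross-correlations of the system’s two responses with their respective codes recovers the impulse response. A minimal Python sketch of the idea (with a simulated system in place of real playback and recording; all names and lengths here are illustrative, not the HeadZap’s internals):

```python
import numpy as np

def golay_pair(order):
    """Complementary Golay pair of length 2**order."""
    a, b = np.array([1.0]), np.array([1.0])
    for _ in range(order):
        a, b = np.concatenate([a, b]), np.concatenate([a, -b])
    return a, b

def measure_ir(system, order=10, ir_len=128):
    """Estimate a system's impulse response using a Golay pair.

    autocorr(a) + autocorr(b) = 2N * delta, so summing the
    cross-correlations of each response with its code yields 2N * h.
    """
    a, b = golay_pair(order)
    ra = system(a)                              # response to code a
    rb = system(b)                              # response to code b
    n = len(a)
    est = (np.correlate(ra, a, mode="full") +
           np.correlate(rb, b, mode="full")) / (2 * n)
    return est[n - 1 : n - 1 + ir_len]          # causal part of the estimate

# Simulated "room": a known 128-tap impulse response.
rng = np.random.default_rng(1)
h = rng.standard_normal(128)
estimate = measure_ir(lambda x: np.convolve(x, h), order=10, ir_len=128)
print(np.allclose(estimate, h))
```

In a real measurement the lambda would be replaced by playing each code through the loudspeaker and recording at the ear; the noise-rejection benefit of the long codes is what makes the method practical.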

Using MATLAB, the HRTFs are convolved with a monaural sound to produce binaural signals. To measure localization accuracy, a GUI is used to test about 20 individuals. Further testing may reveal how ear shape affects localization ability; once a pattern is established, the HRTFs can be modified to make them more effective. We are currently pursuing the identification of a pattern of frequency attenuation and enhancement that appears to depend on the protrusion angle of the pinna.
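The spatialization step itself is just a pair of convolutions, one per ear. A minimal Python/NumPy equivalent of the MATLAB processing described above (the HRIRs below are random placeholders, not measured data, and the function name is ours):

```python
import numpy as np

def spatialize(mono, hrir_left, hrir_right):
    """Convolve a monaural signal with a left/right HRIR pair,
    producing a two-channel (binaural) signal."""
    left = np.convolve(mono, hrir_left)
    right = np.convolve(mono, hrir_right)
    return np.stack([left, right], axis=1)     # shape: (samples, 2)

# Toy example: a click spatialized with placeholder 128-tap HRIRs.
rng = np.random.default_rng(2)
mono = np.zeros(256)
mono[0] = 1.0                                  # unit impulse ("click")
hl = rng.standard_normal(128)
hr = rng.standard_normal(128)
out = spatialize(mono, hl, hr)
print(out.shape)                               # (383, 2)
```

Playing the two channels over headphones delivers each filtered signal to the intended ear, which is what creates the directional impression.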

Analysis of individual HRTFs may also reveal how they change with the shape of the head, torso, and pinna. Using this information, one may be able to individualize generic transfer functions. We are looking at building such a model starting from a spherical-head model (the most basic model). We plan to add a pinna-reflection model to this and, finally, a pole/zero model to increase the accuracy of the overall model.
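For the spherical-head model, the best-known closed form is Woodworth’s formula for the interaural time difference, ITD(θ) = (a/c)(θ + sin θ) for azimuths θ up to 90°, where a is the head radius and c the speed of sound. A small sketch (the head radius is an assumed typical value, not a measurement from this work):

```python
import math

HEAD_RADIUS = 0.0875     # m; an assumed typical head radius
SPEED_OF_SOUND = 343.0   # m/s

def woodworth_itd(azimuth_deg):
    """Interaural time difference (seconds) for a rigid spherical head
    (Woodworth's formula), valid for azimuths from 0 to 90 degrees."""
    theta = math.radians(azimuth_deg)
    return (HEAD_RADIUS / SPEED_OF_SOUND) * (theta + math.sin(theta))

for az in (0, 30, 60, 90):
    print(f"{az:2d} deg -> {woodworth_itd(az) * 1e6:6.1f} us")
```

This yields the familiar maximum ITD of roughly 0.65 ms at 90° azimuth; individualizing the model amounts to adjusting parameters like the head radius to fit each listener.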

Papers

1) Evaluation of digital sound spatialization accuracy over commodity audio channels in a personal computer

Omar Grafals, Navarun Gupta, Gualberto Cremades, A. B. Barreto, and Malek Adjouadi.
Proceedings of the 1999 Computing Research Conference, University of Puerto Rico–Mayagüez, Mayagüez, PR, December 4, 1999, pp. 5–8.

2) Decreased 3-D sound spatialization accuracy caused by speech bandwidth limitations over commodity audio components.

Omar Grafals, Navarun Gupta, Gualberto Cremades, A. B. Barreto, and Malek Adjouadi. Biomedical Sciences Instrumentation, Vol. 36, April 2000, pp. 245–250.

3) The effect of pinna protrusion angle on localization of virtual sound in the horizontal plane

Some results from the work reported in (1) and (2) above:


Testing the accuracy of virtual-sound localization within the frontal hemisphere at 0° elevation using generic HRTFs, we found that localization accuracy decreases significantly outside the range from −45° to +45° in azimuth. This main effect was observed for the spatialization of both a speech signal and broadband noise.

Some results from the work reported in (3) above:

The chart shows how a subject’s resolution of front/back confusions improves with the prototype HRTF (large pinna-protrusion angle).

Related Web Links

1- http://sound.media.mit.edu/KEMAR.html
2- http://www.cc.gatech.edu/gvu/multimedia/spatsound/spatsound.html
3- http://www-engr.sjsu.edu/~duda/Duda.Research.html
4- http://www.waisman.wisc.edu/hdrl/index.html
5- http://www.pa.msu.edu/acoustics/

Contact Us

Digital Signal Processing Laboratory
Center for Engineering and Applied Sciences (CEAS-Room 3970N)
10555 West Flagler Street Miami, Florida, 33174, USA

Inquiries: barreto@eng.fiu.edu | Phone: (305) 348-3711