Sound localization is a listener's ability to identify the location or origin of a detected sound in direction and distance. The sound localization mechanisms of the mammalian auditory system have been extensively studied. The auditory system uses several cues for sound source localization, including time difference and level difference (or intensity difference) between the ears, and spectral information. These cues are also used by other animals, such as birds and reptiles, but there may be differences in usage, and there are also localization cues which are absent in the human auditory system, such as the effects of ear movements. Animals with the ability to localize sound have a clear evolutionary advantage.


How sound reaches the brain

Sound is the perceptual result of mechanical vibrations traveling through a medium such as air or water. Through the mechanisms of compression and rarefaction, sound waves travel through the air, bounce off the pinna and concha of the exterior ear, and enter the ear canal. The sound waves vibrate the tympanic membrane (eardrum), causing the three bones of the middle ear to vibrate, which then send the energy through the oval window and into the cochlea, where it is changed into a chemical signal by hair cells in the organ of Corti, which synapse onto spiral ganglion fibers that travel through the cochlear nerve into the brain.


Neural interactions

In vertebrates, interaural time differences are known to be calculated in the superior olivary nucleus of the brainstem. According to Jeffress, this calculation relies on delay lines: neurons in the superior olive which accept innervation from each ear with different connecting axon lengths. Some cells are more directly connected to one ear than the other, thus they are specific for a particular interaural time difference. This theory is equivalent to the mathematical procedure of cross-correlation. However, because Jeffress's theory is unable to account for the precedence effect, in which only the first of multiple identical sounds is used to determine the sounds' location (thus avoiding confusion caused by echoes), it cannot fully explain the response. Furthermore, a number of recent physiological observations made in the midbrain and brainstem of small mammals have cast considerable doubt on the validity of Jeffress's original ideas.

Neurons sensitive to interaural level differences (ILDs) are excited by stimulation of one ear and inhibited by stimulation of the other ear, such that the response magnitude of the cell depends on the relative strengths of the two inputs, which in turn depend on the sound intensities at the ears. In the auditory midbrain nucleus, the inferior colliculus (IC), many ILD-sensitive neurons have response functions that decline steeply from maximum to zero spikes as a function of ILD. However, there are also many neurons with much shallower response functions that do not decline to zero spikes.
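The equivalence between the delay-line model and cross-correlation can be illustrated with a small numerical sketch. It is not a model of the superior olive: the sample rate, the 500 Hz tone burst, the 12-sample delay and the function name `estimate_itd` are all assumptions chosen for illustration. Each candidate lag plays the role of one delay line, and the lag with the largest correlation corresponds to the most strongly responding cell.

```python
import math

def estimate_itd(left, right, sample_rate, max_lag):
    """Return the lag (in seconds) that maximizes the cross-correlation.

    A positive value means the right-ear signal lags the left-ear signal.
    Each tested lag stands in for one hypothetical delay line.
    """
    n = len(left)
    best_lag, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        score = 0.0
        for i in range(n):
            j = i + lag
            if 0 <= j < n:
                score += left[i] * right[j]
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag / sample_rate

# Assumed test signal: a 10 ms, 500 Hz tone burst with a smooth envelope.
sample_rate = 48_000
n = 480
source = [
    math.sin(2 * math.pi * 500 * i / sample_rate) * math.sin(math.pi * i / n) ** 2
    for i in range(n)
]

# Assumed geometry: the right ear receives the same sound 12 samples later.
delay_samples = 12
left = source
right = [0.0] * delay_samples + source[: n - delay_samples]

# max_lag stays below half the tone period (48 samples) to avoid phase ambiguity.
itd = estimate_itd(left, right, sample_rate, max_lag=40)
print(f"estimated ITD: {itd * 1e6:.0f} microseconds")
```

Limiting the search to lags shorter than half a period mirrors the phase ambiguity that restricts ITD evaluation to low frequencies.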


The cone of confusion

Most mammals are adept at resolving the location of a sound source using interaural time differences and interaural level differences. However, no such time or level differences exist for sounds originating along the circumference of circular conical slices, where the cone's axis lies along the line between the two ears. Consequently, sound waves originating at any point along a given circumference at a given slant height will have ambiguous perceptual coordinates. That is to say, the listener will be incapable of determining whether the sound originated from the back, front, top, bottom or anywhere else along the circumference at the base of a cone at any given distance from the ear. The importance of these ambiguities is vanishingly small for sound sources very close to or very far away from the subject, but it is the intermediate distances that are most important in terms of fitness.

These ambiguities can be removed by tilting the head, which can introduce a shift in both the amplitude and phase of sound waves arriving at each ear. This translates the vertical orientation of the interaural axis horizontally, thereby leveraging the mechanism of localization on the horizontal plane. Moreover, even with no alteration in the angle of the interaural axis (i.e. without tilting one's head), the hearing system can capitalize on interference patterns generated by the pinnae, the torso, and even the temporary re-purposing of a hand as an extension of the pinna (e.g., cupping one's hand around the ear). As with other sensory stimuli, perceptual disambiguation is also accomplished through integration of multiple sensory inputs, especially visual cues. Having localized a sound within the circumference of a circle at some perceived distance, visual cues serve to fix the location of the sound. Moreover, prior knowledge of the location of the sound-generating agent will assist in resolving its current location.
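The geometry behind the cone of confusion can be checked with a minimal sketch. It assumes a head-less model in which the ears are two points on the interaural axis; the ear separation, speed of sound and helper names are illustrative assumptions, not values from the text above. Sources placed on the same circle around the interaural axis then produce the same path-length difference.

```python
import math

EAR_SEPARATION = 0.215  # metres, assumed
SPEED_OF_SOUND = 344.0  # metres per second, assumed

def path_itd(source):
    """ITD in seconds for a point source; ears are two points on the x axis."""
    left_ear = (-EAR_SEPARATION / 2, 0.0, 0.0)
    right_ear = (EAR_SEPARATION / 2, 0.0, 0.0)
    difference = math.dist(source, left_ear) - math.dist(source, right_ear)
    return difference / SPEED_OF_SOUND

def point_on_cone(distance, half_angle_deg, roll_deg):
    """Point at `distance` from the head centre, `half_angle_deg` away from the
    interaural (x) axis, rotated by `roll_deg` about that axis.
    With y forward and z up: roll 0 = front, 90 = above, 180 = back, 270 = below."""
    a = math.radians(half_angle_deg)
    r = math.radians(roll_deg)
    return (
        distance * math.cos(a),
        distance * math.sin(a) * math.cos(r),
        distance * math.sin(a) * math.sin(r),
    )

itds = [path_itd(point_on_cone(2.0, 60.0, roll)) for roll in (0, 90, 180, 270)]
for roll, itd in zip((0, 90, 180, 270), itds):
    print(f"roll {roll:3d} deg -> ITD {itd * 1e6:.1f} microseconds")
```

In this simplified model front, above, back and below are indistinguishable from the ITD alone, which is why the additional cues described above are needed.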


Sound localization by the human auditory system

Sound localization is the process of determining the location of a sound source. The brain utilizes subtle differences in intensity, spectral, and timing cues to allow us to localize sound sources (Thompson, Daniel M. Understanding Audio: Getting the Most out of Your Project or Professional Recording Studio. Boston, MA: Berklee, 2005). To give a deeper understanding of the human auditory mechanism, this section briefly discusses the theory of localization by the human ear.


General introduction

Localization can be described in terms of three-dimensional position: the azimuth or horizontal angle, the elevation or vertical angle, and the distance (for static sounds) or velocity (for moving sounds) (Roads, Curtis. The Computer Music Tutorial. Cambridge, MA: MIT, 2007). The azimuth of a sound is signaled by the difference in arrival times between the ears, by the relative amplitude of high-frequency sounds (the shadow effect), and by the asymmetrical spectral reflections from various parts of our bodies, including torso, shoulders, and pinnae. The distance cues are the loss of amplitude, the loss of high frequencies, and the ratio of the direct signal to the reverberated signal.

Depending on where the source is located, our head acts as a barrier that changes the timbre, intensity, and spectral qualities of the sound, helping the brain orient where the sound emanated from. These minute differences between the two ears are known as interaural cues. Lower frequencies, with longer wavelengths, diffract around the head, forcing the brain to rely only on the phase cues from the source.

Helmut Haas discovered that we can discern the sound source, using the earliest arriving wave front, despite additional reflections up to 10 decibels louder than the original wave front. This principle is known as the Haas effect, a specific version of the precedence effect. Haas found that even a 1 millisecond difference in timing between the original sound and the reflected sound increased the spaciousness, allowing the brain to discern the true location of the original sound. The nervous system combines all early reflections into a single perceptual whole, allowing the brain to process multiple different sounds at once (Benade, Arthur H. Fundamentals of Musical Acoustics. New York: Oxford UP, 1976). The nervous system will combine reflections that are within about 35 milliseconds of each other and that have a similar intensity.


Duplex theory

To determine the lateral input direction (left, front, right), the auditory system analyzes interaural differences between the ear signals. In 1907, Lord Rayleigh used tuning forks to generate monophonic excitation and studied lateral sound localization on a human head model without auricles. He was the first to present a sound localization theory based on interaural cue differences, which is known as the duplex theory. Human ears are on different sides of the head, and thus have different coordinates in space. As shown in the duplex theory figure, since the distances between the acoustic source and the two ears are different, there are a time difference and an intensity difference between the sound signals at the two ears. These differences are called the interaural time difference (ITD) and the interaural intensity difference (IID), respectively.


ITD and IID

From the duplex theory figure we can see that for source B1 or source B2 there will be a propagation delay between the two ears, which generates the ITD. At the same time, the human head and ears have a shadowing effect on high-frequency signals, which generates the IID.

* Interaural time difference (ITD): Sound from the right side reaches the right ear earlier than the left ear. The auditory system evaluates interaural time differences from (a) phase delays at low frequencies and (b) group delays at high frequencies.
* Theory and experiments show that the ITD depends on the signal frequency $f$. If the angular position of the acoustic source is $\theta$, the head radius is $r$ and the acoustic velocity is $c$, the ITD is given by (Zhou X. Virtual reality technique. Telecommunications Science, 1996, 12(7): 46-–.):
$$ITD = \begin{cases} 3 \times r \times \sin\theta / c, & \text{if } f \leq 4000\ \text{Hz} \\ 2 \times r \times \sin\theta / c, & \text{if } f > 4000\ \text{Hz} \end{cases}$$
In the closed form above, 0 degrees is taken to be directly ahead of the head, and counter-clockwise is positive.
* Interaural intensity difference (IID) or interaural level difference (ILD): Sound from the right side has a higher level at the right ear than at the left ear, because the head shadows the left ear. These level differences are highly frequency dependent and increase with increasing frequency. Theoretical studies show that the IID depends on the signal frequency $f$ and the angular position of the acoustic source $\theta$. The IID is given by:
$$IID = 1.0 + (f/1000)^{0.8} \times \sin\theta$$
* For frequencies below 1000 Hz, mainly ITDs are evaluated (phase delays); for frequencies above 1500 Hz, mainly IIDs are evaluated. Between 1000 Hz and 1500 Hz there is a transition zone, where both mechanisms play a role.
* Localization accuracy is 1 degree for sources in front of the listener and 15 degrees for sources to the sides. Humans can discern interaural time differences of 10 microseconds or less.
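The two closed-form expressions above can be evaluated with a short sketch. The head radius and speed of sound are assumed typical values that do not come from the text, and the 4000 Hz crossover and the 0.8 exponent are exposed as parameters because they are taken here as assumptions about the cited formulas. The function names are hypothetical.

```python
import math

HEAD_RADIUS = 0.0875    # metres, assumed typical value
SPEED_OF_SOUND = 344.0  # metres per second, assumed

def itd_seconds(theta_deg, freq_hz, r=HEAD_RADIUS, c=SPEED_OF_SOUND,
                crossover_hz=4000.0):
    """Piecewise ITD model: factor 3 at or below the crossover, factor 2 above.
    The crossover frequency is an assumption of this sketch."""
    factor = 3.0 if freq_hz <= crossover_hz else 2.0
    return factor * r * math.sin(math.radians(theta_deg)) / c

def iid(theta_deg, freq_hz, exponent=0.8):
    """IID model 1.0 + (f/1000)^exponent * sin(theta).
    The exponent value is an assumption of this sketch."""
    return 1.0 + (freq_hz / 1000.0) ** exponent * math.sin(math.radians(theta_deg))

for theta in (0, 30, 90):
    print(
        f"theta={theta:2d} deg: "
        f"ITD(500 Hz)={itd_seconds(theta, 500) * 1e6:6.1f} us, "
        f"ITD(8 kHz)={itd_seconds(theta, 8000) * 1e6:6.1f} us, "
        f"IID(500 Hz)={iid(theta, 500):.2f}, IID(8 kHz)={iid(theta, 8000):.2f}"
    )
```

Both quantities vanish (ITD) or reach their baseline (IID of 1.0) for a source directly ahead, and the IID term grows with frequency, in line with the statement that level differences dominate at high frequencies.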


Evaluation for low frequencies

For frequencies below 800 Hz, the dimensions of the head (ear distance 21.5 cm, corresponding to an interaural time delay of 625 µs) are smaller than the half wavelength of the sound waves. So the auditory system can determine phase delays between both ears without confusion. Interaural level differences are very low in this frequency range, especially below about 200 Hz, so a precise evaluation of the input direction is nearly impossible on the basis of level differences alone. As the frequency drops below 80 Hz it becomes difficult or impossible to use either time difference or level difference to determine a sound's lateral source, because the phase difference between the ears becomes too small for a directional evaluation.
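The figures in this paragraph can be reproduced with simple arithmetic. The sketch below assumes a speed of sound of 344 m/s, which is the value for which a 21.5 cm ear distance gives the stated 625 µs; the function names are illustrative.

```python
SPEED_OF_SOUND = 344.0  # metres per second, assumed
EAR_DISTANCE = 0.215    # metres, from the text

# Largest possible interaural delay: sound travelling along the interaural axis.
max_itd = EAR_DISTANCE / SPEED_OF_SOUND

def half_wavelength(freq_hz):
    """Half wavelength in metres at the given frequency."""
    return SPEED_OF_SOUND / freq_hz / 2

def max_phase_difference_deg(freq_hz):
    """Largest interaural phase difference in degrees at the given frequency."""
    return 360.0 * freq_hz * max_itd

print(f"maximum ITD: {max_itd * 1e6:.0f} microseconds")
for f in (80, 200, 800, 1600):
    print(
        f"{f:5d} Hz: half wavelength {half_wavelength(f):.3f} m, "
        f"max phase difference {max_phase_difference_deg(f):.0f} deg"
    )
```

At 800 Hz the half wavelength equals the ear distance, so the phase difference reaches 180 degrees and becomes ambiguous above that frequency, while at 80 Hz the largest phase difference is only 18 degrees.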


Evaluation for high frequencies

For frequencies above 1600 Hz the dimensions of the head are greater than the length of the sound waves. An unambiguous determination of the input direction based on interaural phase alone is not possible at these frequencies. However, the interaural level differences become larger, and these level differences are evaluated by the auditory system. Also, delays between the ears can still be detected via some combination of phase differences and group delays, which are more pronounced at higher frequencies; that is, if there is a sound onset, the delay of this onset between the ears can be used to determine the input direction of the corresponding sound source. This mechanism becomes especially important in reverberant environments. After a sound onset there is a short time frame where the direct sound reaches the ears, but not yet the reflected sound. The auditory system uses this short time frame for evaluating the sound source direction, and keeps this detected direction as long as reflections and reverberation prevent an unambiguous direction estimation. The mechanisms described above cannot be used to differentiate between a sound source ahead of the hearer or behind the hearer; therefore additional cues have to be evaluated.


Pinna filtering effect


Motivations

Duplex theory shows that ITD and IID play significant roles in sound localization, but they can only deal with lateral localization problems. For example, if two acoustic sources are placed symmetrically at the front and back of the right side of the human head, they will generate equal ITDs and IIDs, in what is called the cone model effect. However, human ears can still distinguish between these sources. Moreover, in natural hearing, one ear alone, without any ITD or IID, can distinguish between them with high accuracy. Because of these limitations of duplex theory, researchers proposed the pinna filtering effect theory. The human pinna is concave with complex folds and is asymmetrical both horizontally and vertically. Reflected and direct waves generate a frequency spectrum at the eardrum that depends on the position of the acoustic source. The auditory nerves then localize the source using this frequency spectrum.


Mathematical model

The spectral cues generated by the pinna filtering effect can be represented as a head-related transfer function (HRTF). The corresponding time-domain expressions are called head-related impulse responses (HRIRs). The HRTF is also described as the transfer function from the free field to a specific point in the ear canal. HRTFs are usually treated as LTI systems:

$$H_L = H_L(r,\theta,\varphi,\omega,\alpha) = P_L(r,\theta,\varphi,\omega,\alpha)/P_0(r,\omega)$$
$$H_R = H_R(r,\theta,\varphi,\omega,\alpha) = P_R(r,\theta,\varphi,\omega,\alpha)/P_0(r,\omega)$$

where $L$ and $R$ represent the left ear and right ear respectively, $P_L$ and $P_R$ represent the amplitude of the sound pressure at the entrances to the left and right ear canals, and $P_0$ is the amplitude of the sound pressure at the center of the head coordinate system when the listener is absent. In general, the HRTFs $H_L$ and $H_R$ are functions of the source angular position $\theta$, the elevation angle $\varphi$, the distance $r$ between the source and the center of the head, the angular frequency $\omega$ and the equivalent dimension of the head $\alpha$.
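Because the HRTFs are treated as LTI systems, applying them in the time domain amounts to convolving a source signal with the left and right HRIRs. The following is a minimal sketch of that idea; the two impulse responses are made-up toy values (a pure delay and attenuation for the far ear), not measured HRIRs, and the function names are hypothetical.

```python
def convolve(signal, impulse_response):
    """Direct-form linear convolution of two finite sequences."""
    out = [0.0] * (len(signal) + len(impulse_response) - 1)
    for i, s in enumerate(signal):
        for j, h in enumerate(impulse_response):
            out[i + j] += s * h
    return out

def render_binaural(mono, hrir_left, hrir_right):
    """Filter one mono signal with the HRIR of each ear."""
    return convolve(mono, hrir_left), convolve(mono, hrir_right)

# Toy HRIRs for a source on the right (illustrative values only):
# the right ear receives the sound immediately at full level,
# the left ear receives it 3 samples later at half amplitude.
hrir_right = [1.0, 0.0, 0.0, 0.0]
hrir_left = [0.0, 0.0, 0.0, 0.5]

mono = [1.0, -1.0, 0.5]
left, right = render_binaural(mono, hrir_left, hrir_right)
print("left :", left)
print("right:", right)
```

Measured HRIRs are much longer and also encode the pinna resonances, but they are applied in the same way.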


HRTF database

At present, the main institutes that work on measuring HRTF databases include CIPIC International Lab, MIT Media Lab, the Graduate School in Psychoacoustics at the University of Oldenburg, the Neurophysiology Lab at the University of Wisconsin–Madison and Ames Lab of NASA. Databases of HRIRs from humans with normal and impaired hearing and from animals are publicly available.


Other cues for 3D space localization


Monaural cues

The human outer ear, i.e. the structures of the pinna and the external ear canal, forms direction-selective filters. Depending on the sound input direction in the median plane, different filter resonances become active. These resonances implant direction-specific patterns into the frequency responses of the ears, which can be evaluated by the auditory system for vertical sound localization. Together with other direction-selective reflections at the head, shoulders and torso, they form the outer ear transfer functions. These patterns in the ear's frequency responses are highly individual, depending on the shape and size of the outer ear. If sound is presented through headphones, and has been recorded via another head with different-shaped outer ear surfaces, the directional patterns differ from the listener's own, and problems will appear when trying to evaluate directions in the median plane with these foreign ears. As a consequence, front–back permutations or inside-the-head localization can appear when listening to dummy head recordings, otherwise referred to as binaural recordings.

It has been shown that human subjects can monaurally localize high-frequency sound but not low-frequency sound. Binaural localization, however, was possible with lower frequencies. This is likely due to the pinna being small enough to only interact with sound waves of high frequency. It seems that people can only accurately localize the elevation of sounds that are complex and include frequencies above 7,000 Hz, and a pinna must be present.


Dynamic binaural cues

When the head is stationary, the binaural cues for lateral sound localization (interaural time difference and interaural level difference) do not give information about the location of a sound in the median plane. Identical ITDs and ILDs can be produced by sounds at eye level or at any elevation, as long as the lateral direction is constant. However, if the head is rotated, the ITD and ILD change dynamically, and those changes are different for sounds at different elevations. For example, if an eye-level sound source is straight ahead and the head turns to the left, the sound becomes louder (and arrives sooner) at the right ear than at the left. But if the sound source is directly overhead, there will be no change in the ITD and ILD as the head turns. Intermediate elevations will produce intermediate degrees of change, and if the presentation of binaural cues to the two ears during head movement is reversed, the sound will be heard behind the listener.

Hans Wallach artificially altered a sound's binaural cues during movements of the head. Although the sound was objectively placed at eye level, the dynamic changes to ITD and ILD as the head rotated were those that would be produced if the sound source had been elevated. In this situation, the sound was heard at the synthesized elevation. The fact that the sound sources objectively remained at eye level prevented monaural cues from specifying the elevation, showing that it was the dynamic change in the binaural cues during head movement that allowed the sound to be correctly localized in the vertical dimension. The head movements need not be actively produced; accurate vertical localization occurred in a similar setup when the head rotation was produced passively, by seating the blindfolded subject in a rotating chair. As long as the dynamic changes in binaural cues accompanied a perceived head rotation, the synthesized elevation was perceived.
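The dependence of the dynamic cue on elevation can be sketched with a simple far-field model, in which the ITD is proportional to the projection of the source direction onto the interaural axis. This model, the ear distance, the speed of sound and the function name are assumptions made for illustration; they are not Wallach's procedure.

```python
import math

EAR_DISTANCE = 0.215    # metres, assumed
SPEED_OF_SOUND = 344.0  # metres per second, assumed

def itd_after_head_turn(azimuth_deg, elevation_deg, head_turn_left_deg):
    """Far-field ITD in seconds; positive means the right ear leads.

    Axes: x to the listener's right, y straight ahead, z up.
    Azimuth is measured from straight ahead towards the right.
    """
    az = math.radians(azimuth_deg)
    el = math.radians(elevation_deg)
    yaw = math.radians(head_turn_left_deg)
    source = (math.sin(az) * math.cos(el), math.cos(az) * math.cos(el), math.sin(el))
    # Turning the head to the left swings the right ear forward.
    right_ear_axis = (math.cos(yaw), math.sin(yaw), 0.0)
    projection = sum(s * a for s, a in zip(source, right_ear_axis))
    return EAR_DISTANCE / SPEED_OF_SOUND * projection

for elevation in (0, 60, 90):
    change = itd_after_head_turn(0, elevation, 30) - itd_after_head_turn(0, elevation, 0)
    print(f"elevation {elevation:2d} deg: ITD change for a 30 deg left turn = "
          f"{change * 1e6:6.1f} microseconds")
```

In this model the ITD change for a given head turn scales with the cosine of the elevation: it is largest at eye level, halves at 60 degrees and vanishes for a source directly overhead, which matches the qualitative description above.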


Distance of the sound source

The human auditory system has only limited possibilities to determine the distance of a sound source. In the close-up range there are some indications for distance determination, such as extreme level differences (e.g. when whispering into one ear) or specific pinna (the visible part of the ear) resonances. The auditory system uses these cues to estimate the distance to a sound source:

* Direct/reflection ratio: In enclosed rooms, two types of sound arrive at a listener. The direct sound arrives at the listener's ears without being reflected at a wall. Reflected sound has been reflected at least once at a wall before arriving at the listener. The ratio between direct sound and reflected sound can give an indication of the distance of the sound source.
* Loudness: Distant sound sources have a lower loudness than close ones. This aspect can be evaluated especially for well-known sound sources.
* Sound spectrum: High frequencies are more quickly damped by the air than low frequencies. Therefore, a distant sound source sounds more muffled than a close one, because the high frequencies are attenuated. For sound with a known spectrum (e.g. speech) the distance can be estimated roughly with the help of the perceived sound.
* ITDG: The initial time delay gap describes the time difference between the arrival of the direct wave and the first strong reflection at the listener. Nearby sources create a relatively large ITDG, with the first reflections having a longer path to take, possibly many times longer. When the source is far away, the direct and the reflected sound waves have similar path lengths.
* Movement: Similar to the visual system, there is also the phenomenon of motion parallax in acoustical perception. For a moving listener, nearby sound sources pass faster than distant sound sources.
* Level difference: Very close sound sources cause a different level between the ears.
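The ITDG cue in the list above can be made concrete with a small geometric sketch. It assumes a single reflecting wall, with source and listener both at the same distance from that wall, and uses the mirror-image construction for the reflected path; the distances, the speed of sound and the function name are illustrative assumptions.

```python
import math

SPEED_OF_SOUND = 344.0  # metres per second, assumed

def itdg_seconds(source_distance, wall_distance):
    """Initial time delay gap for one reflecting wall.

    Source and listener are `source_distance` apart and both `wall_distance`
    away from the wall, so the reflected path equals the straight line from
    the listener to the source's mirror image behind the wall.
    """
    direct_path = source_distance
    reflected_path = math.hypot(source_distance, 2 * wall_distance)
    return (reflected_path - direct_path) / SPEED_OF_SOUND

near = itdg_seconds(source_distance=1.0, wall_distance=2.0)
far = itdg_seconds(source_distance=20.0, wall_distance=2.0)
print(f"source at  1 m: ITDG = {near * 1e3:.2f} ms")
print(f"source at 20 m: ITDG = {far * 1e3:.2f} ms")
```

With the wall position fixed, the gap shrinks as the source moves away, because the direct and reflected paths become nearly equal in length.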


Signal processing

Sound processing of the human auditory system is performed in so-called critical bands. The hearing range is segmented into 24 critical bands, each with a width of 1 Bark or 100 Mel. For a directional analysis the signals inside the critical band are analyzed together. The auditory system can extract the sound of a desired sound source out of interfering noise. This allows the listener to concentrate on only one speaker if other speakers are also talking (the cocktail party effect). With the help of the cocktail party effect, sound from interfering directions is perceived as attenuated compared to the sound from the desired direction. The auditory system can increase the signal-to-noise ratio by up to 15 dB, which means that interfering sound is perceived to be attenuated to half (or less) of its actual loudness.
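Two of the quantities in this paragraph can be approximated numerically. The sketch below uses Traunmüller's approximation for converting frequency to the Bark scale and the common rule of thumb that a 10 dB level change roughly halves or doubles perceived loudness; neither formula is given in the text, so both are assumptions of this example, as are the function names.

```python
def hz_to_bark(freq_hz):
    """Approximate critical-band rate in Bark (Traunmueller's formula, assumed here)."""
    return 26.81 * freq_hz / (1960.0 + freq_hz) - 0.53

def perceived_loudness_ratio(attenuation_db):
    """Loudness ratio under the rule of thumb that 10 dB halves loudness (assumed)."""
    return 2.0 ** (-attenuation_db / 10.0)

for f in (100, 1000, 4000, 15500):
    print(f"{f:6d} Hz is about {hz_to_bark(f):5.2f} Bark")

for db in (10, 15):
    print(f"{db} dB attenuation -> about {perceived_loudness_ratio(db):.2f} of the loudness")
```

Under this rule of thumb a 15 dB improvement corresponds to the interfering sound being perceived at roughly a third of its loudness, consistent with "half (or less)".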


Localization in enclosed rooms

In enclosed rooms, not only the direct sound from a sound source arrives at the listener's ears, but also sound which has been reflected at the walls. For sound localization, the auditory system analyzes only the direct sound, which arrives first, and not the reflected sound, which arrives later (law of the first wave front). So sound localization remains possible even in an echoic environment. This echo cancellation occurs in the dorsal nucleus of the lateral lemniscus (DNLL).

To determine the time periods in which the direct sound prevails and which can be used for directional evaluation, the auditory system analyzes loudness changes in different critical bands and also the stability of the perceived direction. If there is a strong attack of the loudness in several critical bands and if the perceived direction is stable, this attack is in all probability caused by the direct sound of a sound source which is newly entering or which is changing its signal characteristics. This short time period is used by the auditory system for directional and loudness analysis of this sound. When reflections arrive a little later, they do not increase the loudness inside the critical bands as strongly, but the directional cues become unstable, because there is a mix of sound from several reflection directions. As a result, no new directional analysis is triggered by the auditory system. The direction first detected from the direct sound is taken as the sound source direction until other strong loudness attacks, combined with stable directional information, indicate that a new directional analysis is possible (see Franssen effect).
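The gating logic described above can be sketched as a small procedure: a new direction is accepted only when a strong loudness attack coincides with directional estimates that agree across bands, and the last accepted direction is held otherwise. This is a deliberately simplified illustration, not a model of the auditory system; it uses a single overall level per frame instead of per-band loudness, and the thresholds, frame values and function name are hypothetical.

```python
def track_direction(frames, attack_db=6.0, max_spread_deg=10.0):
    """Hold the direction found at the last strong, directionally stable onset.

    `frames` is a sequence of (level_db, band_directions) pairs, where
    `band_directions` holds one direction estimate in degrees per band.
    """
    held_direction = None
    previous_level = float("-inf")  # the first frame always counts as an attack
    result = []
    for level_db, band_directions in frames:
        rise = level_db - previous_level
        spread = max(band_directions) - min(band_directions)
        if rise >= attack_db and spread <= max_spread_deg:
            # Strong attack with stable cues: treat it as direct sound.
            held_direction = sum(band_directions) / len(band_directions)
        result.append(held_direction)
        previous_level = level_db
    return result

frames = [
    (60.0, [30.0, 31.0, 29.0]),     # onset of a source at about 30 degrees
    (61.0, [30.0, -40.0, 75.0]),    # reflections arrive: cues become unstable
    (60.0, [10.0, -60.0, 50.0]),    # still reverberant
    (72.0, [-45.0, -44.0, -46.0]),  # strong new onset at about -45 degrees
    (73.0, [-45.0, 20.0, -80.0]),   # its reflections
]
directions = track_direction(frames)
print(directions)
```

The reverberant frames leave the held direction unchanged, and only the second strong onset with consistent cues triggers a new directional analysis.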


Specific techniques with applications


Auditory transmission stereo system

This kind of sound localization technique provides a true virtual stereo system (Zhao R. Study of Auditory Transmission Sound Localization System. University of Science and Technology of China, 2006). It uses "smart" manikins, such as KEMAR, to capture signals, or uses DSP methods to simulate the transmission process from the sources to the ears. After amplification, recording and transmission, the two channels of received signals are reproduced through earphones or speakers. This localization approach uses electroacoustic methods to obtain the spatial information of the original sound field by transferring the listener's auditory apparatus to the original sound field. Its main advantage is that its acoustic images are lively and natural. Also, it needs only two independent transmitted signals to reproduce the acoustic image of a 3D system.


3D para-virtualization stereo system

Representatives of this kind of system are SRS Audio Sandbox, Spatializer Audio Lab and Qsound Qxpander. They use HRTFs (head-related transfer functions) to simulate the acoustic signals received at the ears from different directions, using common two-channel stereo reproduction. They can therefore simulate reflected sound waves and improve the subjective sense of space and envelopment. Since they are para-virtualization stereo systems, their major goal is to simulate stereo sound information. Traditional stereo systems use sensors that are quite different from human ears. Although those sensors can receive acoustic information from different directions, they do not have the same frequency response as the human auditory system. Therefore, when two-channel mode is applied, the human auditory system still cannot perceive a 3D sound field. The 3D para-virtualization stereo system overcomes this disadvantage: it uses HRTF principles to glean acoustic information from the original sound field and then produces a lively 3D sound field through common earphones or speakers.
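As a rough illustration of the HRTF idea, a mono signal can be filtered with a pair of head-related impulse responses (the time-domain form of an HRTF), one per ear, to obtain a two-channel signal. The sketch below is not how any of the products named above work internally; the function names and the toy impulse responses in the usage note are invented for the example, and real impulse responses would be measured, for instance with a manikin.

```python
from typing import List, Sequence, Tuple


def convolve(signal: Sequence[float], kernel: Sequence[float]) -> List[float]:
    """Direct-form linear convolution (length len(signal) + len(kernel) - 1)."""
    if not signal or not kernel:
        return []
    out = [0.0] * (len(signal) + len(kernel) - 1)
    for i, sample in enumerate(signal):
        for j, tap in enumerate(kernel):
            out[i + j] += sample * tap
    return out


def render_binaural(mono: Sequence[float],
                    hrir_left: Sequence[float],
                    hrir_right: Sequence[float]) -> Tuple[List[float], List[float]]:
    """Filter one mono source with a left/right impulse-response pair.

    The impulse responses encode the direction: a source on the left
    typically reaches the right ear later and weaker.
    """
    return convolve(mono, hrir_left), convolve(mono, hrir_right)
```

For example, `render_binaural([1.0, 0.5], [1.0], [0.0, 0.0, 0.6])` uses a made-up impulse-response pair in which the right ear receives the sound two samples later and attenuated, a crude stand-in for a source on the listener's left.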


Multichannel stereo virtual reproduction

Since multichannel stereo systems require many reproduction channels, some researchers have adopted HRTF simulation technologies to reduce the number of reproduction channels. They use only two speakers to simulate the multiple speakers of a multichannel system. This process is called virtual reproduction. Essentially, the approach uses both the interaural difference principle and pinna filtering effect theory. Unfortunately, it cannot perfectly substitute for a traditional multichannel stereo system such as a 5.1 / 7.1 surround sound system, because when the listening zone is relatively large, reproduction simulated through HRTFs may produce inverted acoustic images at symmetric positions.


Animals

Since most animals have two ears, many of the effects of the human auditory system can also be found in other animals. Interaural time differences (interaural phase differences) and interaural level differences therefore play a role in the hearing of many animals. How much these effects contribute to localization depends on head size, ear distance, ear position and ear orientation. Smaller animals such as insects use different techniques because the separation of their ears is too small.


Lateral information (left, ahead, right)

If the ears are located at the sides of the head, lateral localization cues similar to those of the human auditory system can be used: evaluation of interaural time differences (interaural phase differences) for lower frequencies and evaluation of interaural level differences for higher frequencies. The evaluation of interaural phase differences is useful as long as it gives unambiguous results, which is the case as long as the ear distance is smaller than half the wavelength (at most one wavelength) of the sound waves. For animals with a larger head than humans, the evaluation range for interaural phase differences is shifted towards lower frequencies; for animals with a smaller head, it is shifted towards higher frequencies. The lowest frequency that can be localized also depends on the ear distance. Animals with a greater ear distance can localize lower frequencies than humans can, whereas for animals with a smaller ear distance the lowest localizable frequency is higher than for humans. If the ears are located at the sides of the head, interaural level differences appear for higher frequencies and can be evaluated for localization tasks. For animals with ears at the top of the head, the head casts no shadow, so there are much smaller interaural level differences to evaluate. Many of these animals can move their ears, and these ear movements can be used as a lateral localization cue.
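The half-wavelength condition above can be turned into a quick estimate: the ear distance d stays below half a wavelength as long as the frequency is below c / (2 d), where c is the speed of sound. The sketch below assumes c = 343 m/s (air at roughly room temperature) and treats the ear distance as a straight-line separation; both are simplifying assumptions, and the function name is invented for the example.

```python
SPEED_OF_SOUND_AIR_M_S = 343.0  # assumed value for air at about 20 degrees C


def max_unambiguous_frequency(ear_distance_m: float,
                              speed_of_sound_m_s: float = SPEED_OF_SOUND_AIR_M_S) -> float:
    """Highest frequency (Hz) whose half wavelength still exceeds the ear distance.

    Below this frequency, interaural phase differences are unambiguous
    under the simplified half-wavelength criterion.
    """
    if ear_distance_m <= 0:
        raise ValueError("ear distance must be positive")
    return speed_of_sound_m_s / (2.0 * ear_distance_m)


if __name__ == "__main__":
    # Hypothetical ear distances, chosen only to show the trend.
    for distance_m in (0.05, 0.2, 0.5):
        limit_hz = max_unambiguous_frequency(distance_m)
        print(f"ear distance {distance_m:.2f} m -> phase cues usable below {limit_hz:.0f} Hz")
```

Larger ear distances give a lower limit and smaller ones a higher limit, which matches the shift of the evaluation range with head size described in the text.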


Odontocetes

Dolphins (and other odontocetes) rely on echolocation to aid in detecting, identifying, localizing, and capturing prey. Dolphin sonar signals are well suited for localizing multiple, small targets in a three-dimensional aquatic environment by utilizing highly directional (3 dB beamwidth of about 10 deg), broadband (3 dB bandwidth typically of about 40 kHz; peak frequencies between 40 kHz and 120 kHz), short-duration clicks (about 40 μs). Dolphins can localize sounds both passively and actively (echolocation) with a resolution of about 1 deg. Cross-modal matching (between vision and echolocation) suggests dolphins perceive the spatial structure of complex objects interrogated through echolocation, a feat that likely requires spatially resolving individual object features and integrating them into a holistic representation of object shape. Although dolphins are sensitive to small binaural intensity and time differences, mounting evidence suggests dolphins employ position-dependent spectral cues derived from well-developed head-related transfer functions for sound localization in both the horizontal and vertical planes. A very small temporal integration time (264 μs) allows localization of multiple targets at varying distances. Localization adaptations include pronounced asymmetry of the skull, nasal sacs, and specialized lipid structures in the forehead and jaws, as well as acoustically isolated middle and inner ears.


In the median plane (front, above, back, below)

For many mammals there are also pronounced structures in the pinna near the entry of the ear canal. As a consequence, direction-dependent resonances can appear, which could be used as an additional localization cue, similar to the localization in the median plane in the human auditory system. There are additional localization cues which are also used by animals.


Head tilting

For sound localization in the median plane (elevation of the sound), two detectors positioned at different heights can also be used. In animals, however, rough elevation information is gained simply by tilting the head, provided that the sound lasts long enough for the movement to be completed. This explains the innate behavior of cocking the head to one side when trying to localize a sound precisely. Instantaneous localization in more than two dimensions from time-difference or amplitude-difference cues requires more than two detectors.


Localization with coupled ears (flies)

The tiny parasitic fly ''Ormia ochracea'' has become a model organism in sound localization experiments because of its unique ear. The animal is too small for the time difference of sound arriving at the two ears to be calculated in the usual way, yet it can determine the direction of sound sources with exquisite precision. The tympanic membranes of opposite ears are directly connected mechanically, allowing resolution of sub-microsecond time differences and requiring a new neural coding strategy. Ho showed that the coupled-eardrum system in frogs can produce increased interaural vibration disparities when only small arrival time and sound level differences are available to the animal's head. Efforts to build directional microphones based on the coupled-eardrum structure are underway.


Bi-coordinate sound localization (owls)

Most owls are nocturnal or crepuscular birds of prey. Because they hunt at night, they must rely on non-visual senses. Experiments by Roger Payne (Payne, Roger S., 1962. How the Barn Owl Locates Prey by Hearing. ''The Living Bird, First Annual of the Cornell Laboratory of Ornithology'', 151-159) have shown that owls are sensitive to the sounds made by their prey, not the heat or the smell. In fact, the sound cues are both necessary and sufficient for localizing mice from the distant perch where the owls sit. For this to work, the owls must be able to accurately localize both the azimuth and the elevation of the sound source.


History

The term 'binaural' literally signifies 'to hear with two ears', and was introduced in 1859 to signify the practice of listening to the same sound through both ears, or to two discrete sounds, one through each ear. It was not until 1916 that Carl Stumpf (1848–1936), a German philosopher and psychologist, distinguished between dichotic listening, which refers to the stimulation of each ear with a different stimulus, and diotic listening, the simultaneous stimulation of both ears with the same stimulus. Later, it would become apparent that binaural hearing, whether dichotic or diotic, is the means by which sound localization occurs. Scientific consideration of binaural hearing began before the phenomenon was so named, with speculations published in 1792 by William Charles Wells (1757–1817) based on his research into binocular vision. Giovanni Battista Venturi (1746–1822) conducted and described experiments in which people tried to localize a sound using both ears, or with one ear blocked with a finger. This work was not followed up on, and was only recovered after others had worked out how human sound localization works. Lord Rayleigh (1842–1919) would perform these same experiments almost seventy-five years later and reach the same results, without knowing that Venturi had done them first. Charles Wheatstone (1802–1875) did work on optics and color mixing, and also explored hearing. He invented a device he called a "microphone" that involved a metal plate over each ear, each connected to metal rods; he used this device to amplify sound. He also did experiments holding tuning forks to both ears at the same time, or separately, trying to work out how the sense of hearing works, and published them in 1827. Ernst Heinrich Weber (1795–1878), August Seebeck (1805–1849) and William Charles Wells also attempted to compare and contrast what would become known as binaural hearing with the principles of binocular integration generally. Understanding of how the differences in sound signals between the two ears contribute to auditory processing in such a way as to enable sound localization and direction was considerably advanced after the invention of the stethophone by Somerville Scott Alison in 1859, who coined the term 'binaural'. Alison based the stethophone on the stethoscope, which had been invented by René Théophile Hyacinthe Laennec (1781–1826); the stethophone had two separate "pickups", allowing the user to hear and compare sounds derived from two discrete locations.


See also

* Acoustic location
* Animal echolocation
* Binaural fusion
* Coincidence detection in neurobiology
* Human echolocation
* Perceptual-based 3D sound localization
* Psychoacoustics
* Spatial hearing loss


External links


* auditoryneuroscience.com: Collection of multimedia files and flash demonstrations related to spatial hearing
* http://highered.mcgraw-hill.com/sites/0070579431/student_view0/chapter11/glossary.html Online learning center - Hearing and Listening
* HearCom: Hearing in the Communication Society, an EU research project
* Research on "Non-line-of-sight (NLOS) Localisation for Indoor Environments" by CMR at UNSW
* An introduction to acoustic holography
* An introduction to acoustic beamforming