Know Your Ears – The Inner Ear

Having followed the path of sound vibration through the outer ear, ear canal and middle ear in the first part of this series, it is time to look at how vibration is transformed into nerve impulses in the inner ear.

It’s easy to take our hearing for granted and not think any further about what’s involved in turning sound and music into something our brains can actually process. But as we saw last week, even the first stages that sound passes through inside our heads involve some heavy processing, which has a great effect on the way we perceive sound. And at this stage, we are still dealing with a simple one-dimensional signal encoded in physical vibration. So let’s make things complicated now and follow the process of turning vibrations into nerve impulses.

Structure of the Inner Ear

The inner ear is actually two organs in one, serving two very different purposes. One half is the vestibular organ, which is our internal accelerometer and thus responsible for our sense of balance. In the picture above it’s the part to the left, with its three characteristic liquid-filled canals that are involved in measuring movement and orientation.

The interesting part for us right now is the other half, the cochlea. It’s the snail-like structure to the right. It also consists of some liquid-filled canals, which are actually connected to the canals of the vestibular organ. But other than sharing the same liquids, the vestibular organ plays no known role in hearing.

What’s wound up there to form the cochlea is a set of three parallel canals running from the base to the so-called apex, where the structure ends. The innermost canal is the scala vestibuli, which is connected to the oval window, where the vibration from the middle ear is introduced into the cochlea. At the apex, the scala vestibuli is connected to the outermost canal, the scala tympani. Following it back to the base we get to the round window, which serves as a termination.

Between the scala vestibuli and the scala tympani is a third canal, the scala media. This one is not connected to the other two canals and contains a different type of liquid. Due to their chemical composition, the two types of liquid carry slightly different electrical charges. This voltage difference of about 80 mV is what actually “powers” the cochlear nerves.


Here’s a cross-section of the cochlea that shows the scala media and the membranes that separate it from the rest. The important one is the basilar membrane at the bottom. It carries the organ of Corti, which holds the actual sensory cells that produce nerve impulses in response to vibration of the basilar membrane.

The Basilar Membrane as a Filter Bank

The most important part to understand is the role of the basilar membrane. Along the way from base to apex it gets wider, which has an interesting effect on the way it responds to the sound pressure waves in the scala tympani. Each point along the membrane resonates most strongly with a specific frequency. Near the base it is most strongly excited by high frequencies, whereas low frequencies result in the largest vibration amplitudes near the apex. The vibration amplitude for a specific excitation frequency decreases with distance from the characteristic location for this frequency.
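To put some numbers on this place-to-frequency mapping, here is a minimal sketch based on the Greenwood function, an empirical fit that is often used for this purpose but is not part of this article. The constants are the values commonly quoted for the human cochlea, and the function name `greenwood_frequency` is made up for this example, so treat the output as a rough approximation.

```python
def greenwood_frequency(x_from_apex: float) -> float:
    """Approximate characteristic frequency (Hz) at a relative position.

    x_from_apex is the fractional distance along the basilar membrane:
    0.0 at the apex, 1.0 at the base. The constants are the commonly
    quoted human values for the Greenwood fit (an assumption, they do
    not come from this article).
    """
    if not 0.0 <= x_from_apex <= 1.0:
        raise ValueError("position must be between 0 (apex) and 1 (base)")
    A, a, k = 165.4, 2.1, 0.88
    return A * (10 ** (a * x_from_apex) - k)


if __name__ == "__main__":
    for x in (0.0, 0.25, 0.5, 0.75, 1.0):
        print(f"{x:4.2f} from apex: {greenwood_frequency(x):8.0f} Hz")
```

Under these assumptions the apex ends up at roughly 20 Hz and the base at roughly 20 kHz, and the frequency grows approximately exponentially with distance from the apex.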

This image shows the amplitude along the basilar membrane for different excitation frequencies (shown in kHz). Note that the amplitude decrease is much steeper towards the apex than it is towards the base.


In this way, the basilar membrane acts as a filter bank that decomposes sound into its frequency components. The frequency resolution and masking effects observable in hearing tests are directly related to this behavior. The latter in particular is intuitively clear: consider two tones of different frequencies and amplitudes. To be heard, the second tone must be loud enough to excite the basilar membrane beyond the excitation that the first tone already creates at that location. Also note that this masking effect is stronger towards frequencies higher than that of the first tone.
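The masking argument can be sketched in a few lines of Python. This is a toy model, not a psychoacoustic standard: the excitation of the first tone (the masker) is assumed to fall off linearly in dB per octave, and the two slope values are invented purely to reflect the asymmetry described above. The function names are hypothetical as well.

```python
import math

# Illustrative slopes only (assumptions, not measured values): the excitation
# caused by a tone falls off slowly towards higher frequencies (the base)
# and steeply towards lower frequencies (the apex).
SLOPE_ABOVE_DB_PER_OCTAVE = 25.0
SLOPE_BELOW_DB_PER_OCTAVE = 100.0


def excitation_db(masker_hz: float, masker_db: float, probe_hz: float) -> float:
    """Excitation (dB) that the masker causes at the probe tone's location."""
    octaves = math.log2(probe_hz / masker_hz)
    if octaves >= 0:
        return masker_db - SLOPE_ABOVE_DB_PER_OCTAVE * octaves
    return masker_db + SLOPE_BELOW_DB_PER_OCTAVE * octaves


def is_masked(masker_hz: float, masker_db: float,
              probe_hz: float, probe_db: float) -> bool:
    """True if the probe does not exceed the excitation left by the masker."""
    return probe_db <= excitation_db(masker_hz, masker_db, probe_hz)


if __name__ == "__main__":
    # An 80 dB tone at 1 kHz against 50 dB probes an octave above and below.
    print(is_masked(1000, 80, 2000, 50))  # probe above the masker
    print(is_masked(1000, 80, 500, 50))   # probe below the masker
```

With these made-up slopes, the 50 dB probe one octave above the masker is hidden, while the same probe one octave below remains audible.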

The Organ of Corti

Now is the time to actually turn the vibration of the basilar membrane into nerve impulses. This happens in the organ of Corti, which sits right on the basilar membrane. It contains the so-called hair cells which sense any vibrations and fire electric pulses when excited. Here’s a close-up of a section of this organ.


Above the hair cells sits the tectorial membrane (not shown here, but in the other picture above). The whole organ of Corti vibrates together with the basilar membrane, and the tectorial membrane “tickles” the hair cells, which fire nerve impulses in response.

But to make a short story long, not all hair cells are the same. There is an important difference between the outer and inner hair cells. Most of the outer hair cells (the ones in the middle) do not transmit nerve impulses, but receive them and stretch a little bit in response. Thus, they are able to actually induce vibration into the whole system.

The inner hair cells (to the far left), in turn, mostly produce nerve impulses and thus are responsible for what is actually “heard”.

But why introduce artificial vibration into the basilar membrane? The outer hair cells form a kind of positive feedback that has two important consequences. First, small vibration amplitudes are amplified, resulting in an increased sensitivity and dynamic range of hearing. Second, this feedback effect increases the sharpness of the basilar membrane filtering. With a “dead” ear, the tuning curves shown above wouldn’t be as steep and thus the frequency resolution would be much coarser.
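Both consequences of the feedback can be reproduced with a very simple stand-in model. The sketch below treats one spot on the basilar membrane as a driven, damped resonator and assumes that the outer hair cells cancel part of its damping. The resonator model, the damping values and all names are assumptions chosen for illustration, not measurements of a real cochlea.

```python
import math


def resonator_gain(frequency_hz: float, resonance_hz: float, damping: float) -> float:
    """Amplitude gain of a driven second-order resonator (1.0 at very low frequencies)."""
    r = frequency_hz / resonance_hz
    return 1.0 / math.sqrt((1.0 - r * r) ** 2 + (2.0 * damping * r) ** 2)


def selectivity(damping: float, resonance_hz: float = 1000.0) -> float:
    """Gain at resonance relative to the gain one octave below it."""
    on_peak = resonator_gain(resonance_hz, resonance_hz, damping)
    off_peak = resonator_gain(resonance_hz / 2.0, resonance_hz, damping)
    return on_peak / off_peak


PASSIVE_DAMPING = 0.5   # hypothetical "dead" ear without feedback
FEEDBACK = 0.45         # hypothetical share of damping cancelled by outer hair cells
ACTIVE_DAMPING = PASSIVE_DAMPING - FEEDBACK

if __name__ == "__main__":
    for label, damping in (("passive", PASSIVE_DAMPING), ("active", ACTIVE_DAMPING)):
        peak = resonator_gain(1000.0, 1000.0, damping)
        print(f"{label}: peak gain {20 * math.log10(peak):5.1f} dB, "
              f"selectivity {selectivity(damping):4.1f}")
```

In this toy model the feedback raises the gain at the resonance frequency by about 20 dB and, at the same time, makes the peak stand out much more clearly against neighboring frequencies.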

Hair Cell Response

The final part of the puzzle is how the inner hair cells respond to vibration. In short, upon movement of the cells, electrical charges accumulate until a threshold is reached, which leads to the charge being released as an electrical impulse. The larger the amplitude of the movement, the more frequently these impulses are fired. This way, sound intensity is encoded as the number of nerve impulses per unit of time.
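This accumulate-and-fire behavior is easy to play with in code. The following is a minimal sketch of an integrate-and-fire unit, with arbitrary units and a made-up function name; real hair cells and nerve fibers are considerably more complicated.

```python
def count_impulses(amplitude: float, duration_s: float = 1.0,
                   threshold: float = 1.0, dt: float = 1e-4) -> int:
    """Count the impulses a simple integrate-and-fire unit emits.

    amplitude is the movement amplitude in arbitrary units per second;
    threshold and dt are illustrative values, not physiological ones.
    """
    charge = 0.0
    impulses = 0
    for _ in range(round(duration_s / dt)):
        charge += amplitude * dt   # charge accumulates while the cell moves
        if charge >= threshold:    # threshold reached: fire and release the charge
            impulses += 1
            charge = 0.0
    return impulses


if __name__ == "__main__":
    for amplitude in (50.0, 100.0, 200.0):
        print(f"amplitude {amplitude:5.1f}: {count_impulses(amplitude)} impulses per second")
```

Doubling the amplitude roughly doubles the number of impulses per second, which is the rate coding described above.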

There is an important and very interesting thing about when these nerve firings are triggered. At low excitation frequencies, the hair cells have more time to rest between wave cycles, and they preferentially respond to a rising movement of the basilar membrane. As a result, the nerve impulses occur mostly phase-synchronously.

At higher frequencies, the hair cells are more constantly excited and thus react more in a “fire when ready” mode, without respect to phase. The transition between both modes happens at around 2-3 kHz. As a result, precise phase information is only preserved for lower frequencies.
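One way to quantify how phase-synchronous a train of impulses is uses the so-called vector strength, which is 1 when every impulse falls on the same phase of the wave and close to 0 when the phases are spread evenly. The sketch below does not model the “fire when ready” mechanism itself. Instead it assumes, as a stand-in, that every impulse is triggered at a fixed phase but with a random timing error of 0.1 ms; that jitter value and all function names are assumptions made for this example.

```python
import cmath
import math
import random


def vector_strength(spike_times, frequency_hz: float) -> float:
    """1.0 = all impulses at the same phase, near 0 = phases spread evenly."""
    phasors = [cmath.exp(2j * math.pi * frequency_hz * t) for t in spike_times]
    return abs(sum(phasors)) / len(phasors)


def simulate_spikes(frequency_hz: float, jitter_s: float = 1e-4,
                    n_spikes: int = 2000, seed: int = 0):
    """One impulse per cycle at a fixed phase, blurred by Gaussian timing jitter."""
    rng = random.Random(seed)
    period = 1.0 / frequency_hz
    return [n * period + rng.gauss(0.0, jitter_s) for n in range(n_spikes)]


if __name__ == "__main__":
    for f in (250.0, 1000.0, 2500.0, 8000.0):
        vs = vector_strength(simulate_spikes(f), f)
        print(f"{f:6.0f} Hz: vector strength {vs:.2f}")
```

Because the same timing error covers a growing share of each cycle as the frequency rises, the vector strength in this toy model drops from nearly 1 at a few hundred Hz to almost nothing at several kHz.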

Take A Breath

Here ends the turbocharged tour from vibration to nerve impulses. As you’ve seen, there are a lot of phenomena involved in providing our sense of hearing. The most important part is the basilar membrane filter bank, which acts much like a real-time spectrum analyzer. This is actually why the latter can be such a useful tool in the studio.

But another surprising finding is the positive feedback via the outer hair cells. It can be described as a kind of multiband compression that increases the perceivable dynamic range by as much as 40 dB. If this mechanism is broken (e.g. due to excessive sound pressure), the threshold of hearing increases. So this is a good time to remind you of the hearing protection you should wear in the rehearsal room!
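For the studio-minded, the compression analogy can be written down like a compressor curve. This is a rough sketch: the 40 dB maximum gain is the figure mentioned above, while the two knee levels, the detection level and all names are hypothetical values picked only to make the example work.

```python
MAX_GAIN_DB = 40.0    # maximum amplification, the figure quoted in the article
LOW_KNEE_DB = 30.0    # hypothetical: full gain below this input level
HIGH_KNEE_DB = 90.0   # hypothetical: no gain above this input level


def cochlear_gain_db(level_db: float, healthy: bool = True) -> float:
    """Level-dependent gain: large for quiet sounds, none for loud ones."""
    if not healthy:
        return 0.0
    if level_db <= LOW_KNEE_DB:
        return MAX_GAIN_DB
    if level_db >= HIGH_KNEE_DB:
        return 0.0
    fraction = (HIGH_KNEE_DB - level_db) / (HIGH_KNEE_DB - LOW_KNEE_DB)
    return MAX_GAIN_DB * fraction


def internal_level_db(level_db: float, healthy: bool = True) -> float:
    """Input level plus the gain contributed by the outer hair cells."""
    return level_db + cochlear_gain_db(level_db, healthy)


def hearing_threshold_db(healthy: bool = True, detection_level_db: float = 40.0) -> int:
    """Lowest input level (in 1 dB steps) that reaches the detection level."""
    for level in range(0, 121):
        if internal_level_db(level, healthy) >= detection_level_db:
            return level
    raise ValueError("detection level is never reached")


if __name__ == "__main__":
    print("healthy threshold:", hearing_threshold_db(healthy=True), "dB")
    print("damaged threshold:", hearing_threshold_db(healthy=False), "dB")
```

In this model, switching off the gain shifts the threshold of hearing upwards by exactly the 40 dB that the feedback used to provide, while loud sounds are not affected at all.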

I know this was a tough one, and I surely rushed through some aspects to keep the article focused on the phenomena that have the most impact on the way we perceive sound. As always, if anything is unclear, feel free to ask in the comments!