# Quantization Noise and Bit Depth

In this early stage in the life of The Science of Sound, I’d like to cover some of the basics, especially of digital audio, as these are a recurring source of confusion in many discussions. Today we’ll start with the “vertical” dimension of digital audio: quantization noise and bit depth.

Ah, digital audio. Pretty much every controversial debate about audio I have come across in my life somehow revolves around analog vs. digital. However, taking part in such debates while studying the mathematical foundations of digital signal processing was very healthy for me. As the audio community is very creative in finding reasons why digital audio is totally messed up in its very foundation, I had the chance to take up all that skepticism and dig really deep to find out what works and what doesn’t. Let me get one thing straight right from the beginning: digital audio is not totally messed up in its very foundation. But it is awfully counterintuitive and confusing in a lot of ways, and there are a lot of ways it can go wrong. By the way, it’s absolutely no less counterintuitive and confusing than electronics. So let’s clear up at least a little bit of the confusion.

### Fixed Point Quantization

To represent a continuous signal in a computer, we need to chop it up into a finite number of pieces that we can then represent as bits and bytes in memory. As an audio signal is continuous in both the time and the level dimension, we need to chop it up twice. The first step is chopping up the time axis by sampling, which will be covered in a future article. The second step is chopping up the level, which is done by a process called quantization.

The process is actually pretty simple. Say we have an audio signal that reaches at most ±1 Volt. We need to measure the voltage at each sampling instant and assign a number to the level we get. As mentioned, we can only represent a finite number of different levels. Using N bits to represent our levels, we have a pool of $latex 2^N$ numbers to choose from. Our signal has a peak-to-peak voltage of 2 Volts, so if we decide to use 16 Bits we can represent this voltage range at a step resolution of ca. $latex 2 \mathrm{V} / 2^{16} \approx 0.03 \mathrm{mV}$. That’s 65536 steps. With every further bit, the number of steps doubles.
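To make the arithmetic concrete, here is a minimal Python sketch of the step size calculation and of rounding a single voltage to the nearest step. The function names `quantization_step` and `quantize` are made up for this illustration, and the sketch ignores clipping at the edges of the range.

```python
def quantization_step(peak_to_peak_volts, bits):
    """Step size when a voltage range is split into 2**bits levels."""
    return peak_to_peak_volts / (2 ** bits)


def quantize(voltage, step):
    """Round a voltage to the nearest multiple of the step size (no clipping)."""
    return round(voltage / step) * step


step_16 = quantization_step(2.0, 16)          # signal between -1 V and +1 V, 16 bits
print(f"levels at 16 bits: {2 ** 16}")        # 65536
print(f"levels at 17 bits: {2 ** 17}")        # 131072, one more bit doubles the count
print(f"step size: {step_16 * 1000:.4f} mV")  # 0.0305 mV

sample = 0.3
stored = quantize(sample, step_16)
print(f"{sample} V is stored as {stored:.6f} V")
```

The stored value is never further away from the original than half a step, which is exactly the error discussed in the next section.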

### Quantization Noise

Of course, quantizing to discrete steps introduces an error, which at any sample time is the difference between the actual level and the quantized level. You can think of it as an error signal that is added to the original signal; this error signal is called quantization noise. To find out how big the problem is, we need to calculate the level of this noise. I’ll skip the derivation because I want to keep formulas at a minimum on The Science of Sound, but you can compute the signal to noise ratio (SNR), which is the level of the signal relative to the level of the noise, using this simple formula:

$latex \mathrm{SNR} = N \cdot 6.02 \mathrm{dB} + 4.77 \mathrm{dB} - \mathrm{crest}$

Where N is the number of bits and “crest” denotes the crest factor, which is the difference between the peak amplitude level and the average power level of the signal. This assumes a normalized signal where the peak level is 0 dBFS. For a pure sine, the crest factor is 3.01 dB, for music typically around 12 dB. Note that there are a couple of different variants of this formula, mostly assuming a full scale sine signal. I like the above definition more because it also works for practically relevant signals. Anyway, for music quantized at 16 Bits we’d get an SNR of 16 × 6.02 dB + 4.77 dB − 12 dB, which is around 89 dB. This is fairly nice, but a bit tight. Assume we play back the music so that the average level of the recording translates to a sound pressure level of around 83 dB SPL, as is recommended because around that loudness our hearing covers the audible frequency range most evenly (more on that later). The quantization noise then ends up at around 83 dB − 89 dB = −6 dB SPL, which is still 6 dB below the threshold of hearing at 0 dB SPL. Just about enough.
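If you’d like to play with the numbers, the formula is easy to put into a few lines of Python. The sketch below also quantizes a full scale sine and measures the SNR directly as a sanity check. The function names, the test frequency and the number of samples are arbitrary choices for this illustration.

```python
import math


def quantization_snr_db(bits, crest_db):
    """SNR formula from the text: N * 6.02 dB + 4.77 dB - crest factor."""
    return bits * 6.02 + 4.77 - crest_db


def measured_sine_snr_db(bits, num_samples=100_000):
    """Quantize a full scale sine (peak +-1) and measure signal vs. error power."""
    step = 2.0 / (2 ** bits)
    signal_power = 0.0
    noise_power = 0.0
    for n in range(num_samples):
        x = math.sin(2.0 * math.pi * 0.01234567 * n)  # arbitrary test frequency
        q = round(x / step) * step                    # nearest quantization level
        signal_power += x * x
        noise_power += (q - x) ** 2
    return 10.0 * math.log10(signal_power / noise_power)


print(f"music, 16 Bit: {quantization_snr_db(16, 12.0):.2f} dB")   # 89.09 dB
print(f"sine, 16 Bit:  {quantization_snr_db(16, 3.01):.2f} dB")   # 98.08 dB
print(f"sine, 16 Bit, measured: {measured_sine_snr_db(16):.1f} dB")
```

The measured value for the sine should land very close to what the formula predicts.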

### A Quick Overview On Dithering

So far we have only looked at the average level, but we still don’t know how it sounds. And that is a bit of a problem, because the quantization noise is not just noise: it depends on the signal. The result is that the noise power concentrates at frequencies where there is signal and at harmonics thereof, especially with narrow-bandwidth sounds like a sine or a decaying piano tone. If the signal is soft enough, these frequencies can become audible and sound similar to harmonic distortion. But there is a solution to this problem called dithering, which you have probably heard of. It is often stated that dithering means just adding noise to mask the quantization noise, which is not true. Dithering is indeed done by adding random numbers just below the smallest quantization level *before* quantization. The result does look like white noise, but a more accurate description is that the quantization noise power that may be concentrated at some frequencies is *redistributed over the full spectrum*, making the peak level in the noise spectrum smaller.

Dithering is a whole topic in itself, but I should note at this point that there are several ways to do it. They differ in the way the final dither noise spectrum is shaped. The simplest method results in flat, white quantization noise. More advanced methods shape the noise spectrum to move more of the noise energy to frequency regions where the ears are less sensitive, thus further increasing the effective SNR in the most sensitive frequency regions.
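Here is a minimal Python sketch of the simplest, flat variant. It assumes a uniformly distributed random value of at most half a step in either direction, which is only one possible choice (real tools often use other distributions, for example a triangular one). The function name `quantize_to_steps` and the chosen numbers are hypothetical. The sketch takes a very soft signal value of 0.3 steps and quantizes it with and without dither.

```python
import math
import random


def quantize_to_steps(level_in_steps, dither, rng):
    """Quantize a level (given in units of one step) to the nearest whole step.

    With dither=True, a uniform random value between -0.5 and 0.5 steps is
    added BEFORE rounding. The uniform distribution is an assumption here.
    """
    offset = rng.uniform(-0.5, 0.5) if dither else 0.0
    return math.floor(level_in_steps + offset + 0.5)


rng = random.Random(1234)   # fixed seed so that runs are repeatable
level = 0.3                 # a very soft signal value: 0.3 steps
trials = 100_000

plain = quantize_to_steps(level, False, rng)
dithered_mean = sum(
    quantize_to_steps(level, True, rng) for _ in range(trials)
) / trials

print(plain)                    # 0: without dither the value is lost completely
print(round(dithered_mean, 2))  # close to 0.3: preserved on average
```

Without dither, the error is always the same for the same input, so it follows the signal. With dither, each single result is still off by a fraction of a step, but the error now behaves like random noise and the soft signal survives on average.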

We already saw that for standard CD audio with 16 Bits we can achieve a signal to noise ratio that is just good enough to keep quantization noise below the hearing threshold. But to reach that we have to make the music as loud as possible, so we get the best possible SNR. We can do that when mastering for a CD, but working this way would be a pain when recording and mixing, where we typically deal with much more dynamic range and several signals at very different levels. Thus for recording and mixing, we usually deal with 24 Bit audio, which gives us an SNR of up to 24 × 6.02 dB + 4.77 dB − 12 dB, or around 137 dB, for normalized music signals. That’s quite a nice reserve to work with. However, we would still have to watch our gain staging, as reducing such a signal in level and then turning it up at a later stage would raise the quantization noise level.
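The gain staging problem is easy to reproduce. The following Python sketch turns a full scale sine down, stores it at a fixed bit depth and turns it back up afterwards. The function name, the 48 dB attenuation and the test frequency are arbitrary choices for this illustration, and no dither is used.

```python
import math


def snr_after_attenuation_db(bits, attenuation_db, num_samples=50_000):
    """SNR of a full scale sine that is attenuated, quantized, then boosted back."""
    step = 2.0 / (2 ** bits)
    gain = 10.0 ** (-attenuation_db / 20.0)
    signal_power = 0.0
    noise_power = 0.0
    for n in range(num_samples):
        x = math.sin(2.0 * math.pi * 0.01234567 * n)  # arbitrary test frequency
        stored = round(x * gain / step) * step        # quantize the quiet signal
        restored = stored / gain                      # turn it back up later
        signal_power += x * x
        noise_power += (restored - x) ** 2
    return 10.0 * math.log10(signal_power / noise_power)


print(f"16 Bit, no attenuation:     {snr_after_attenuation_db(16, 0.0):.1f} dB")
print(f"16 Bit, 48 dB down and up:  {snr_after_attenuation_db(16, 48.0):.1f} dB")
print(f"24 Bit, 48 dB down and up:  {snr_after_attenuation_db(24, 48.0):.1f} dB")
```

Every dB of attenuation before the quantizer costs about one dB of SNR. With 24 Bits, the signal that was turned down by 48 dB still ends up roughly where the 16 Bit signal was at full level, which is the reserve mentioned above.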

### Enter: Floating Point Quantization

This is the number format that is mostly used today for actual processing. It’s the only practical format on desktop computers and also the format used in some modern DSPs. In most practical applications, 32 Bit float numbers are used. You can think of these numbers as a 24 Bit value (the mantissa) with an *integrated gain trim that can operate in steps of 6.02 dB* (the exponent). The arithmetic circuits inside a processor always make sure that all the bits in the mantissa are used up by adjusting the exponent accordingly. The result is that the signal resolution is always kept the same, which means that the quantization noise level is always relative to the signal level. There is thus no need to maximize the signal level to get the best possible SNR.

Still, the SNR of a 32 Bit float signal is the same as for a 24 Bit fixed point signal. But the dynamic range is absolutely ridiculous: more than 1500 dB! That alleviates the need for obsessive gain staging and reduces a lot of dynamic range problems. You can put several volume knobs in series, moving the signal up and down as you like, and the SNR always stays the same. By the way, this is the only case I can think of right now where dynamic range and SNR are in totally different ballparks. Usually, SNR is determined by some constant noise level and the maximum level before distortion, as is the case with fixed point audio and all analog equipment.
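Both claims can be checked with a few lines of Python. The sketch below rounds values to 32 Bit floats with the standard `struct` module, measures the SNR of a sine at full level and after turning it down by 200 dB and back up, and computes the ratio between the largest and the smallest normalized 32 Bit float. The helper names and the 200 dB figure are arbitrary choices for this illustration.

```python
import math
import struct


def to_float32(value):
    """Round a Python float (64 Bit) to the nearest 32 Bit float value."""
    return struct.unpack("f", struct.pack("f", value))[0]


def float32_snr_db(attenuation_db, num_samples=50_000):
    """SNR of a sine that is attenuated, stored as 32 Bit float, then boosted back."""
    gain = 10.0 ** (-attenuation_db / 20.0)
    signal_power = 0.0
    noise_power = 0.0
    for n in range(num_samples):
        x = math.sin(2.0 * math.pi * 0.01234567 * n)  # arbitrary test frequency
        stored = to_float32(x * gain)                 # rounding to 24 Bit mantissa
        restored = stored / gain
        signal_power += x * x
        noise_power += (restored - x) ** 2
    return 10.0 * math.log10(signal_power / noise_power)


# Ratio between the largest and the smallest normalized 32 Bit float value.
largest = (2.0 - 2.0 ** -23) * 2.0 ** 127
smallest_normal = 2.0 ** -126
dynamic_range_db = 20.0 * math.log10(largest / smallest_normal)

print(f"SNR at full level:          {float32_snr_db(0.0):.1f} dB")
print(f"SNR after 200 dB down/up:   {float32_snr_db(200.0):.1f} dB")
print(f"dynamic range:              {dynamic_range_db:.0f} dB")   # about 1529 dB
```

The two SNR values should stay within a few dB of each other, even though the signal was pushed 200 dB down in between. Doing the same with a fixed point format would leave nothing but silence.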

To wrap up the essence: in modern floating point DAWs we have to worry much less about things like gain staging and quantization errors. It is much harder to mess up the SNR than it was with fixed point systems, and we can get away without increasing the bit depth in many more situations. But on the other hand, this fact can also be a source of too much sloppiness, for both developers and users. It creates the illusion that proper gain staging is a thing of the past, which is not true in many cases. I observe that **too many software tools today are very ignorant of healthy and reasonable operating level norms**. For example, many modern software synthesizers are **far too loud!**

If you’d like to learn a bit more about quantization, you should have a look at the Wikipedia article about it. You can also look at how the SNR formula for sines is derived in this white paper from Analog Devices.

What’s your gain staging strategy? Do you even care? Leave a reply in the comments!