Does Convolution Reverb Work? Part 2
We continue the skeptic’s guide to convolution reverb by looking at impulse response measurement methods and how results are affected if reverb is not linear and time-invariant.
Last week we asked whether real-world reverberation actually satisfies the requirements for proper reproduction using convolution. We found several reasons why it doesn’t necessarily do so. The larger and more complex an acoustic space, the more likely it is that small nonlinear and time-variant effects become dominant in the later parts of the reverb tail.
But before we get to what this means for the applicability of convolution reverb in practice, let’s have a short look at another common way to generate reverberation.
Artificial reverberation is one of the blackest arts in digital audio signal processing. Since the earliest days of the field, armies of researchers and engineers have engaged in the quest for the perfect digital reverb algorithm. The reason is obvious: reverberation in real-world rooms, halls and other spaces is extremely complex and chaotic. After all, we’re talking about three-dimensional reflections of sound scattered in all directions by objects of all shapes and acoustic properties. And all these reflections and sound paths feed each other again and again, but always a little differently.
There’s a boatload of different digital reverb algorithms to study. But they all share one common concept: connecting several delays of different lengths together in various ways. If done right, this produces a long (or maybe even endless) series of echoes. But a large number of different delay times is needed for these echoes to be as chaotic as possible. The human ear is great at detecting even the slightest amount of regularity in these echoes, and perceives it as a tone. By the way: the choice of delay lengths in such a reverb structure is often much more crucial to the overall quality than the choice of structure.
Due to CPU and memory constraints, however, it is usually desirable to cut the number of different delays down to something rather non-ridiculous. But this strongly limits the amount of chaos we can produce.
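To make the idea of connecting several delays a bit more tangible, here is a minimal sketch of a few parallel feedback comb filters, a building block found in many classic designs. The delay lengths, the feedback gain and the function name are illustrative assumptions, not values taken from any real product.

```python
def comb_reverb(x, delays=(149, 211, 263, 293), feedback=0.8, tail=2000):
    """Mix of parallel feedback combs, each computing y[n] = x[n] + g * y[n - D].

    The delay lengths are primes (an assumption for this sketch) so that the
    echoes of different combs do not pile up on the same samples early on.
    """
    n_out = len(x) + tail
    out = [0.0] * n_out
    for d in delays:
        y = [0.0] * n_out
        for n in range(n_out):
            dry = x[n] if n < len(x) else 0.0
            fed_back = feedback * y[n - d] if n >= d else 0.0
            y[n] = dry + fed_back
            out[n] += y[n] / len(delays)  # equal-weight mix of all combs
    return out

# Feed in a single unit impulse and list where the echoes land.
impulse_response = comb_reverb([1.0])
echo_times = [n for n, v in enumerate(impulse_response) if v != 0.0]
```

With only four delays the echo pattern is still very regular, which is exactly the kind of regularity the ear picks up as a tone.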
Thus, since the earliest days of digital reverb (EMT 250 anyone?) the chaos has been enhanced by slowly changing these delay times over time. For most reverb algorithms, and especially for the classics, this modulation is essential.
You won’t find many time-invariant reverb algorithms out there, but that’s actually a good thing. And as we learned, time variance is something that happens in real spaces as well (although not as strongly).
So almost no typical source of reverberation can be clearly considered linear and time-invariant. But before we get to what that means perceptually, let’s have a look at how impulse responses are measured.
Impulse Response Measurement
The simplest method is, obviously, to generate an impulse in the acoustic space and just record the result. With digital reverbs, creating an impulse is trivial. But in rooms, halls, cathedrals?
In the early days of impulse response measurement, impulses were created using starter pistols, electrical discharges or similar tools. Without hesitation we can call the sounds these tools make an “impulse”.
But of course, all these impulses have a sound of their own. As a result, the “sound” of these impulses will become part of the measured impulse response. Even when compensating for this coloration with an EQ, the result is still quite inaccurate.
For a more accurate measurement, the exact excitation signal must be known to remove its influence from the recording (a process called deconvolution). With a loudspeaker it would be possible to play back any excitation signal that we have previously created or recorded. A couple of computations later, we have a perfect measurement. Or have we?
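As a minimal sketch of what those "couple of computations" can look like, the following divides the spectrum of the recording by the spectrum of the known excitation. The toy room response, the excitation values and the helper names are assumptions made up for illustration. The example is also noise-free and uses an excitation whose spectrum has no zeros, which a real measurement cannot take for granted.

```python
import cmath

def convolve(a, b):
    """Plain linear convolution, standing in for 'playing a through room b'."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, av in enumerate(a):
        for j, bv in enumerate(b):
            out[i + j] += av * bv
    return out

def dft(x, inverse=False):
    """Naive O(N^2) discrete Fourier transform, fine for a tiny example."""
    n = len(x)
    sign = 1 if inverse else -1
    out = [sum(x[k] * cmath.exp(sign * 2j * cmath.pi * k * m / n) for k in range(n))
           for m in range(n)]
    return [v / n for v in out] if inverse else out

def deconvolve(recording, excitation):
    """Spectral division H = Y / X; assumes X is nonzero in every bin."""
    Y, X = dft(recording), dft(excitation)
    return [v.real for v in dft([y / x for y, x in zip(Y, X)], inverse=True)]

N = 16
room = [1.0, 0.0, 0.5, 0.0, 0.25]    # hypothetical "true" impulse response
excitation = [1.0, 0.4, -0.3, 0.2]   # hypothetical known test signal
recording = convolve(excitation, room)

# Zero-pad both signals to a common length before dividing the spectra.
pad = lambda s: s + [0.0] * (N - len(s))
recovered = deconvolve(pad(recording), pad(excitation))
```

Here `recovered` matches `room` (followed by zeros) up to rounding error, because nothing but the excitation and the room went into the recording.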
Acoustic spaces and artificial reverbs are never perfectly quiet. There is always some noise that will be recorded along with the measurement. This noise will inevitably contaminate the result. The problem with impulses as excitation signals is that they contain a rather small amount of energy due to their extremely short duration. On the other hand, we have to record the result for several seconds depending on reverberation time.
Especially with long reverb tails, the measurement captures a large amount of noise along with a limited amount of signal energy. Consequently, the signal-to-noise ratio (SNR) gets really bad.
Trying to reduce the noise in the space and measurement systems will only get you so far. Instead, the best gains in SNR can be achieved by increasing the signal energy introduced into the system.
However, just making excitation signals louder is not a good option. For practical reasons, people should still be allowed to be in the space during measurement. Also, the structural integrity of the building should be preserved. And playback and recording equipment has a limited capability of producing and recording high sound pressure levels anyway.
Instead of using ridiculous levels, a far better method is to spread out the signal energy over time, thus increasing the duration of excitation signals. As I said above, if you know the excitation signal, you can use anything that has sufficient content across the audible frequency range.
Typical signals to use are sine sweeps, white or pink noise, or special noise signals called maximum length sequences (MLS). In theory and with enough patience, you can achieve any SNR you want if you just measure long enough. Equivalently, you can repeat the same measurement over and over and average the results. At least, if noise is your only problem.
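To make the maximum length sequence a little less mysterious, here is a minimal sketch that generates a very short one with a linear feedback shift register. The register size, tap positions and function names are assumptions chosen for a tiny example. Real measurements use far longer sequences.

```python
def mls(n_bits=4, taps=(4, 3)):
    """One period of a maximum length sequence, mapped to +1.0 / -1.0."""
    state = [1] * n_bits  # any nonzero start state works
    seq = []
    for _ in range(2 ** n_bits - 1):
        seq.append(1.0 if state[-1] else -1.0)
        feedback = 0
        for t in taps:
            feedback ^= state[t - 1]  # XOR of the tapped register bits
        state = [feedback] + state[:-1]
    return seq

def circular_autocorrelation(seq):
    n = len(seq)
    return [sum(seq[i] * seq[(i + lag) % n] for i in range(n)) for lag in range(n)]

sequence = mls()
acf = circular_autocorrelation(sequence)
```

The circular autocorrelation is a single peak at lag zero and a constant -1 everywhere else, so the sequence behaves almost like an impulse while spreading its energy over the whole period.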
The above relates only to real noise that is the result of a random process. That essentially means that no two snippets of noise are the same. Only in this case can longer or multiple measurements average this disturbance away.
The case is different for any noises that aren’t random. Examples of such unwanted disturbances are mains hum, interference from switching power supplies and noises from air conditioning or similar sources. These won’t go away no matter how long the measurement or how often you do it. The result is resonating components in the reverb tail.
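The difference between the two kinds of disturbance can be sketched with a small simulation. It averages several takes that contain random noise plus a hum component. The hum is assumed to start with the same phase in every take, which is the worst case. All names and numbers are illustrative.

```python
import math
import random

def rms(x):
    return math.sqrt(sum(v * v for v in x) / len(x))

def average_takes(n_takes, n_samples=2000, hum_amplitude=0.1, seed=1):
    """Average n_takes takes; return the RMS of the leftover noise and hum."""
    rng = random.Random(seed)
    noise_avg = [0.0] * n_samples
    hum_avg = [0.0] * n_samples
    for _ in range(n_takes):
        for n in range(n_samples):
            # Random noise: a fresh, independent value in every take.
            noise_avg[n] += rng.gauss(0.0, 1.0) / n_takes
            # Hum: identical in every take (phase-locked, an assumption).
            hum_avg[n] += hum_amplitude * math.sin(2 * math.pi * n / 40) / n_takes
    return rms(noise_avg), rms(hum_avg)

noise_1, hum_1 = average_takes(1)
noise_16, hum_16 = average_takes(16)
```

In this sketch the random noise drops to roughly a quarter of its level after 16 takes (about 12 dB), while the hum level does not change at all.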
Another problem that arises from long measurements is their sensitivity to short-term noises like footsteps, coughing or the like. It depends on the kind of measurement signal how these transient events affect the outcome. A transient in a sweep measurement for example triggers a sweep-like artifact after deconvolution. A starter pistol recording will just contain additional echoes.
To decrease the effect of such random short-time disturbances, it’s possible to use many shorter measurements rather than one long measurement. As long as nobody manages to clap their hands at the exact same time during every single measurement, these artifacts will be reduced by averaging.
What About The Time-Variance?
After this little detour into impulse response measurement, we can have a look again at real reverbs. Consider first the starter pistol method. The recorded reverb can be directly used as an impulse response and loaded into a convolution reverb. If you play a short percussive sound through it, it will still sound very much like in the real room. All the chaotic reflections are part of the recorded response. The same even goes for a modulating digital reverb excited with an impulse.
The picture changes if you send steady tones like a sine through it. With an LTI convolution reverb, after some time (the length of the impulse response), the sine tone will reach a steady state and you won’t hear any reverberation anymore (I invite you to try it!).
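If you would rather check this numerically than by ear, here is a minimal sketch. It convolves a sine with a made-up impulse response and compares the output with a plain sine whose level and phase are set by the response at that one frequency. The impulse response and the tone frequency are arbitrary assumptions.

```python
import cmath
import math

def convolve(x, h):
    y = [0.0] * (len(x) + len(h) - 1)
    for n, xv in enumerate(x):
        for k, hv in enumerate(h):
            y[n + k] += xv * hv
    return y

omega = 2 * math.pi / 25  # one cycle every 25 samples
x = [math.sin(omega * n) for n in range(400)]

# Hypothetical impulse response: decaying echoes every 5 samples, 64 samples long.
h = [(0.9 ** k) * (-1) ** k if k % 5 == 0 else 0.0 for k in range(64)]
y = convolve(x, h)

# Frequency response of h at omega sets the steady-state level and phase.
H = sum(hv * cmath.exp(-1j * omega * k) for k, hv in enumerate(h))
steady = [abs(H) * math.sin(omega * n + cmath.phase(H)) for n in range(len(x))]

# Once the whole impulse response is "filled", output and plain sine coincide.
max_error = max(abs(y[n] - steady[n]) for n in range(len(h) - 1, len(x)))
```

From sample 63 onwards (the length of the impulse response minus one) the output is just the input sine again, scaled and phase-shifted. Nothing that sounds like a reverb tail is left until the tone stops.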
This doesn’t happen with time-variant reverb. The slight changes of the impulse response over time lead to a decorrelation of reverb and input signal. This is actually a very pleasant property, as it detaches the reverb from the signal and can make it more audible.
With modulated digital reverbs, there’s an additional audible effect. Changing delay times always leads to detuning through the Doppler effect. If you listen closely, you’ll hear this detuning in some digital reverbs when exciting them with steady tones. This can add a very pleasant and chorus-like quality to the reverb. This behavior is impossible to recreate with convolution reverb.
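To get a feeling for the size of this detuning: a tone read through a delay that changes over time has its frequency scaled by one minus the rate of change of the delay. The sketch below assumes a sinusoidally modulated delay. The depth and rate values are invented for illustration and not taken from any specific unit.

```python
import math

def peak_detune_cents(mod_depth_s, mod_rate_hz):
    """Largest upward pitch shift, in cents, for a delay of the form
    tau(t) = tau0 + depth * sin(2 * pi * rate * t).

    The frequency ratio is 1 - d(tau)/dt, and the steepest slope of the
    sinusoid is 2 * pi * rate * depth.
    """
    max_slope = 2 * math.pi * mod_rate_hz * mod_depth_s
    return 1200 * math.log2(1 + max_slope)

# Hypothetical modulation: +/- 0.5 ms delay swing at 1 Hz.
cents = peak_detune_cents(0.0005, 1.0)
```

With these assumed values the tone wobbles by roughly five cents in each direction, which is small but enough to produce a chorus-like shimmer.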
When using sweeps or noise sequences for impulse response measurement, it gets even worse. Not only do we lose the modulation, the frequency content of the reverb tail is also altered. Why is that?
A simple way to think about it is the aspect of decorrelation. This means that there isn’t a clear linear and time-invariant relationship between input signal and response. From a signal processing perspective, everything that we can’t trace back to the input signal is actually similar to noise. Sweep measurements and the like reduce noise by averaging, and they do so with decorrelated components of the response as well. And as the decorrelation is strongest at high frequencies, the measured impulse response will have less high-frequency content and sound duller than the original.
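A toy model shows why averaging hits high frequencies hardest. Assume a late reflection arrives with a tiny random timing offset in each take, and average the resulting sinusoidal component over many takes. The Gaussian jitter, its size and the function names are assumptions for illustration only, not a description of how any particular room or reverb behaves.

```python
import math
import random

def averaged_gain(freq_hz, jitter_s, n_takes=4000, seed=1):
    """Magnitude left after averaging unit phasors with per-take timing jitter."""
    rng = random.Random(seed)
    re = im = 0.0
    for _ in range(n_takes):
        phase = 2 * math.pi * freq_hz * rng.gauss(0.0, jitter_s)
        re += math.cos(phase) / n_takes
        im += math.sin(phase) / n_takes
    return math.hypot(re, im)

def expected_gain(freq_hz, jitter_s):
    """Closed form for Gaussian jitter: exp(-(2 * pi * f * sigma)^2 / 2)."""
    return math.exp(-0.5 * (2 * math.pi * freq_hz * jitter_s) ** 2)

jitter = 20e-6  # hypothetical 20 microseconds of timing jitter
gain_1k = averaged_gain(1000.0, jitter)
gain_10k = averaged_gain(10000.0, jitter)
```

Under these assumptions the 1 kHz component survives almost untouched, while the 10 kHz component loses more than half its level. The same timing error is a much larger phase error at high frequencies.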
Wow, that was a tour de force again I guess. We scratched several surfaces here again, and I could go on writing (and learning) about reverb for ages. As always with these kinds of topics: don’t panic! Convolution reverb is still an extremely convincing means to add realistic reverberation to recordings. In fact, it’s the best available way to recreate real acoustic spaces.
But now you know the possible caveats, like a possible lack of high frequency content. Luckily, nobody issued a law yet that forbids cheating those high frequencies back in using an EQ.
What you definitely won’t get with convolution is an accurate replica of expensive, rare and possibly vintage digital reverb boxes, as most of them use modulation. Apart from the high frequency loss, the decorrelated and chorus-like qualities of these reverbs will definitely get lost. And that’s what actually makes these interesting.
What’s your favorite reverb and why? Share your experience in the comments!