One Word To Confuse Them All

In the world of audio, nothing seems to trigger more reflexes of fear than “phase issues”. The reason is probably that it describes a rather scientific concept which is sometimes hard to relate to the reality of music signals. Also, there are three very different phenomena that are often confused by using the same word for all of them.

Talking about phase sometimes seems to be like talking about quantum mechanics. It’s hard to understand what it actually is, but there is some popular semi-knowledge about it and if you can skillfully juggle some keywords, you’ll sound like the Sheldon Cooper of sound. At your next 19-inch cocktail party, try out a phrase like “I don’t like what this EQ does to the phase”. It almost never fails to impress.

But enough of the introductory joking. Let’s dive in and see what the term actually means and what kinds of “phase issues” exist. That will only take us halfway to enlightenment, though. Today, we need to pave the way for next week, when we’ll finally get to how phase issues actually sound.

Let’s start by getting the necessary engineering vocabulary out of the way, then move on to a more practical view.

The Boring Engineering Approach

I want to keep the engineering definition very short, because it usually confuses more than it does any good when related to sound. However, we might need it to understand the more intuitive ways to look at phase, which I will propose later on.

In the world of signal and system theory, we usually look at signals as a sum of sinusoid components. This is very handy to relate time and frequency domains. However, there are a couple of mathematical peculiarities involved, especially the fact that a single sine component extends on the time scale from the infinite past to the infinite future. Understanding what that means for practical signals in reality is one of the bigger hurdles an electronics engineer has to clear. Failing to do so is the seed for confusion and bullshittery.

The sine waves that a signal is composed of differ in frequency, magnitude and phase. Frequency and magnitude should be self-explanatory, and I assume you know that the phase part determines a time shift of the sine in relation to its period. If not, have a look at this hypnotic picture, where the phase is shifted by different amounts:

[Image: SinePhase]
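If you prefer numbers over pictures, here is a minimal sketch (assuming NumPy; the frequency, sample rate and phase value are arbitrary choices of mine) showing that a phase shift of a sine is the same thing as a time shift measured as a fraction of its period:

```python
import numpy as np

f = 100.0                    # sine frequency in Hz (arbitrary choice)
fs = 48000.0                 # sample rate in Hz (arbitrary choice)
t = np.arange(480) / fs      # 10 ms of time axis
phi = np.pi / 2              # phase shift of 90 degrees

# Sine shifted by adding a phase offset.
shifted_by_phase = np.sin(2 * np.pi * f * t + phi)

# The same shift expressed in time: phi / (2*pi) periods of length 1/f.
time_shift = phi / (2 * np.pi * f)   # a quarter period, i.e. 2.5 ms at 100 Hz
shifted_by_time = np.sin(2 * np.pi * f * (t + time_shift))

print(np.allclose(shifted_by_phase, shifted_by_time))
```

Note that the same 90 degrees would correspond to a different time shift at any other frequency, which is why phase alone says little about timing until you know the frequency.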

In practice, this definition doesn’t matter very much. The truth is: this way of looking at signals and systems makes the math handy if you want to calculate or measure how a system behaves. But in the context of actual music it’s very very useless.

A More Intuitive Approach

In reality, we don’t deal with infinitely long sine waves, but with totally arbitrary signals that constantly change. I think a good approach is to look at it from the way we actually perceive sound, which is by analyzing how much power is present in certain frequency bands repeatedly for consecutive short time frames. The amount of energy present at each frequency is determined by the magnitudes of the sine components. The phase on the other hand, determines how this energy is distributed over time. If you’re totally new to this, this distinction might need another couple of posts to really sink in. Read it again aloud: Phase determines how energy is distributed over time.
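To make that phrase a bit more tangible, here is a small sketch (assuming NumPy; the frame length and the random phases are just illustrative choices). It takes a click, keeps the magnitude of every frequency component and scrambles only the phases. The energy per frequency and the total energy stay the same, but the energy is now smeared over the whole frame instead of sitting in one sample:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1024

# A click: all the energy sits in a single sample.
click = np.zeros(n)
click[0] = 1.0

# Keep the magnitude spectrum, replace the phase by random values.
spectrum = np.fft.rfft(click)
random_phase = rng.uniform(-np.pi, np.pi, spectrum.size)
random_phase[0] = 0.0     # DC must stay real for a real-valued signal
random_phase[-1] = 0.0    # same for the Nyquist bin
smeared = np.fft.irfft(np.abs(spectrum) * np.exp(1j * random_phase), n)

same_magnitude = np.allclose(np.abs(np.fft.rfft(smeared)), np.abs(spectrum))
same_energy = np.isclose(np.sum(smeared**2), np.sum(click**2))
peak_click = np.max(np.abs(click))
peak_smeared = np.max(np.abs(smeared))

print(same_magnitude, same_energy)   # identical energy per frequency and in total
print(peak_click, peak_smeared)      # but a very different distribution over time
```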

Additionally, if we not only talk about signals, but systems that process these signals (like an equalizer or any other device), we can also describe much of their behavior by looking at the magnitude response – which alters the energy at different frequencies – and the phase response – which again alters the distribution of energy over time (I really like that phrase). In the following, we’ll mostly talk about the phase response of systems, not the absolute phase information in a signal.

Three Types of Phase Effects

There are in general three different ways that a device or process can alter the phase information of a signal. Let’s have a look at what they are.

Constant Phase Response

In practice, we usually only encounter a constant phase shift of 180 degrees (or 0 degrees, which means nothing happens at all). The translation to a normal person’s words would be an inversion of the signal or a reversed polarity. This is what the phase switch in many mixing desks or preamps does.
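As a quick sanity check, here is a sketch (assuming NumPy and an arbitrary noise signal) that rotates the phase of every frequency component by 180 degrees and compares the result with a plain sign flip:

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.standard_normal(256)      # arbitrary test signal

# Shift the phase of every frequency component by 180 degrees (pi radians).
X = np.fft.fft(x)
y = np.fft.ifft(X * np.exp(1j * np.pi)).real

# That is exactly the same as flipping the polarity.
print(np.allclose(y, -x))
```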

In signal theory, we also sometimes deal with a constant 90-degree phase response, which is known as a Hilbert transform. You don’t need to know that, but remember the word for the 19-inch cocktail party. Smartypants can be very sexy.
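For the curious, a rough sketch of that 90-degree shift (assuming NumPy, and using the common sign convention of minus 90 degrees for positive frequencies and plus 90 degrees for negative ones; the test tone is an arbitrary choice that fits the FFT frame exactly). It turns a cosine into a sine:

```python
import numpy as np

n = 1000
k = 10                           # whole number of cycles in the frame
t = np.arange(n) / n
x = np.cos(2 * np.pi * k * t)

# Constant 90-degree phase response: -90 degrees for positive frequencies,
# +90 degrees for negative frequencies, nothing at DC.
freqs = np.fft.fftfreq(n)
H = -1j * np.sign(freqs)
y = np.fft.ifft(np.fft.fft(x) * H).real

print(np.allclose(y, np.sin(2 * np.pi * k * t)))
```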

Linear Phase Response

I sometimes wonder why this term is used so much. In fact, a linear phase response is nothing more than a delay. You usually come across it in digital equalizers, where a linear phase equalizer is one that doesn’t alter the phase relationships between different frequency ranges. Although that sounds like a good idea, there are very few cases where I would really recommend this. Anyway, this deserves a whole article of its own. Sorry to disappoint you here, but for the moment, have a look at this excellent video from FabFilter explaining the issue (and showing some interesting advanced tricks).
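The “nothing more than a delay” part can be checked in a few lines. This is a minimal sketch (assuming NumPy, an arbitrary noise signal and an arbitrary delay of 7 samples; because it works on one FFT frame, the delay wraps around circularly):

```python
import numpy as np

rng = np.random.default_rng(2)
n = 512
delay = 7                                  # delay in samples (arbitrary choice)
x = rng.standard_normal(n)

# Flat magnitude, phase falling linearly with frequency: phase = -omega * delay.
omega = 2 * np.pi * np.fft.fftfreq(n)      # frequency in radians per sample
H = np.exp(-1j * omega * delay)
y = np.fft.ifft(np.fft.fft(x) * H).real

# The output is the input, shifted by `delay` samples (circularly).
print(np.allclose(y, np.roll(x, delay)))
```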

Non-linear Phase Response

Ah, finally something makes more sense. This is in fact the only thing that the term phase response should be used for, in my opinion. As we just saw, there are much more descriptive words for the other two, like polarity and delay.

Now what does it mean if you have a non-linear phase response? Well, that’s probably hard to grasp from a phase point of view. A much better way of looking at it is via the term “group delay” (which is the negative derivative of the phase response with respect to frequency, 19-inch cocktail party alert!). Although this is probably not strictly exact in a theoretical sense, it is fair to say that the group delay tells us how much different frequency ranges are delayed. Yes, that’s the key takeaway here: for any filter with non-linear phase response, different frequency ranges reach the output with different delays.

We know from school that the derivative of a straight line is a constant. So stepping back, we see that a linear phase response results in a constant group delay. All frequency components are delayed by the same time.
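Here is a small numerical sketch of both cases (assuming NumPy; the 5-sample delay and the first-order allpass with coefficient 0.5 are illustrative choices of mine, and group_delay is a hypothetical helper, not a library function). It computes the group delay as the negative slope of the unwrapped phase:

```python
import numpy as np

# Frequency axis in radians per sample, staying clear of the exact endpoints.
w = np.linspace(0.01, np.pi - 0.01, 2000)
z = np.exp(1j * w)

def group_delay(h, w):
    """Negative derivative of the unwrapped phase, in samples."""
    phase = np.unwrap(np.angle(h))
    return -np.gradient(phase, w)

# Linear phase: a pure delay of 5 samples.
delay_filter = z ** -5.0

# Non-linear phase: a first-order allpass (flat magnitude, curved phase).
a = 0.5
allpass = (a + z**-1) / (1 + a * z**-1)

gd_delay = group_delay(delay_filter, w)
gd_allpass = group_delay(allpass, w)

print(gd_delay.min(), gd_delay.max())   # the same delay at every frequency
print(gd_allpass[0], gd_allpass[-1])    # low and high frequencies delayed differently
```

Both filters leave the magnitude untouched. The only difference is that the allpass delays its high frequencies by several times as much as its low ones, while the pure delay treats all of them alike.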

But all of this stays just some boring engineering blah as long as we don’t know how this affects what we really hear. The thing is, we don’t have a direct perception of phase as such. Thus, it usually takes a bit more to actually make phase information audible. The fun continues next week with a compilation of all the fun ways to mess around with phase, also including some not-so-fun and even some totally-superficial ways.

What’s your favorite alternative vocabulary for phase issues? Do you even have any?

Jon Boley - March 2, 2016

It’s difficult for me to wrap my mind around group delay. If the slope of the phase in a narrow band is positive (e.g., near the edge of a very sharp filter), the group delay is actually negative! So interpreting group delay as time delay can be misleading since we aren’t actually delaying into the past. (Time travel is confusing.) The approximation of group delay as a time delay is really only valid when the phase is approximately linear, and really only applies to the envelope of the signal (however you choose to define that).

I prefer to think in terms of phase delay, which is simply the time shift at any particular frequency. (And I have come to terms with the fact that group delay is a mathematical term that has limited use in the real world.)

Of course, for any real signal, you won’t have a pure tone in quiet, so what you see will be due to a mixture (both constructive & destructive) of different phase delays.
And of course, as you mentioned, Fourier analysis also adds its own complexities…

Christian Luther - March 2, 2016

Yes, it’s difficult. 😉

The problem is that it’s not really possible to look at isolated portions of a signal and think of them as delayed more or less. But I think it’s fair to do this simplification when we talk about hearing a monaural signal, as what we actually decode as information is a set of slowly changing envelopes around different frequencies (slowly compared to the “center” frequencies of the auditory filter bank).

But that’s not all. In practice, the envelope is not only delayed, but also stretched in time. If you look at consecutive FFT frames from a very long allpass response, you’ll see (and hear) an initial frequency dip at the allpass center frequency in the first frame, followed by a decaying peak at the same frequency after the rest of the impulse is gone. It’s best demonstrated with the Schroeder allpass comb filter used in many reverb algorithms. I plan to demonstrate that next week.

As for negative group delay: yep, that puzzles me, too. I have a rough idea about how that might work though. I think it would make the envelope shorter, shifting its “center of gravity” a bit back in time. Nevertheless, the start or the cause of the envelope wouldn’t time travel. Of course that would only work with slow enough envelopes (= narrow spectrum within the range of negative group delay).