Declaring an end to the loudness wars

Imagine a scenario where you're outdoors, in front of your home and can hear music coming from what sounds like several blocks away.  You know right away that you're listening to a band playing live.  Now imagine your neighbor plays piano or saxophone.  What is it that makes it so easy to tell whether they're playing their instrument or listening to a favorite record on their stereo system?  Your neighbor may have the best stereo system you've ever heard but when they're listening and their windows are open, you know it is the stereo and not live music.  What is it about live music that instantly informs us it is live, even from a distance?

Of course the gap between a live performance and one reproduced by even the best playback systems is still a wide one.  However, we have playback systems today that can approach this ideal to an extent unforeseen by those who made some of the great recordings of decades past.  The best modern systems are capable of revealing recorded detail that was unavailable when some of the great recordings were made.  When playing these records, a good system will display many previously unheard cues captured by the engineers and within these cues are many hints of "Life".

This is especially true for those records where a deliberate attempt was made by the producers and engineers to record a document of a performance.  (A recorded document seeks to maintain instrumental balances, timbres, acoustic ambience, spatial and dynamic characteristics as a listener present at the recording event would have heard them.)  This is also true for those records where there may have been no actual performance to document, where the studio itself was used as an instrument to create the final sound.  Many jazz and pop recordings sound more alive, more real today than they did on the best systems of the past.

Particularly in recordings of popular music, as the role of the engineer has changed from documentarian to co-creator of the final performance, the listening audience has slowly become accustomed to recorded sound that is an end in itself.  The recording is the performance.  The result is a kind of recorded magic that could occur no other way and there is a vast legacy of examples, from instruments being played backwards to sounds moving around the stage at super speeds to cymbals that whoosh instead of crash, etc.

But somewhere along the line, in many cases something got lost that didn't have to get lost.  All of the creative processes allowed by the studio have very often supplanted much of the Life rather than adding something new to it.  Today, many records sound like they are being heard over the radio, even when auditioned on the finest audio systems.  There is a canned quality, an absence of the very Life that makes being in the presence of musicians so exciting.  There can be no argument when this is an effect desired and pursued by the creative team.  It only becomes an issue when many start to believe it "has to be" this way.  This occurs when musicians new to record making and novice recordists just learning their craft are not exposed to alternatives.

This brings us back to the question posed at the beginning of this article:  What is it about live music that instantly informs us it is live, even from a distance?  The answer can largely be summed up in two words:  crest factor.  Crest factor is the term used to describe the difference in volume between the average level of the sound and the level of the peaks within the sound.  Sound levels are measured in decibels or "dB".  In a live musical performance, the peaks can often be 20 decibels louder than the average level of the music, sometimes more.  To get an idea of the magnitude of this difference, imagine having a normal conversation with someone a few feet away from you and during the conversation, you clap your hands sharply together.  The peaks created by the hand claps can be 20 or more decibels above the average level of your conversation.
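
To make the idea concrete, crest factor can be computed from a block of audio samples as the ratio of the peak level to the average level, expressed in decibels.  What follows is a minimal Python sketch, not a measurement standard:  the function name crest_factor_db is hypothetical, and using RMS as the "average level" is an assumption made for illustration.

```python
import math

def crest_factor_db(samples):
    """Return the peak-to-RMS ratio of a sequence of samples, in decibels."""
    peak = max(abs(s) for s in samples)
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    if rms == 0:
        raise ValueError("crest factor is undefined for silence")
    return 20 * math.log10(peak / rms)

# One full cycle of a sine wave: a steady tone with a small crest factor (about 3 dB).
sine = [math.sin(2 * math.pi * n / 1000) for n in range(1000)]

# A quiet, steady "conversation" with a single sharp "hand clap" in the middle
# (synthetic stand-in data, assumed for illustration only).
speech_and_clap = [0.05 * math.sin(2 * math.pi * n / 50) for n in range(1000)]
speech_and_clap[500] = 1.0

print(round(crest_factor_db(sine), 1))
print(round(crest_factor_db(speech_and_clap), 1))
```

The second signal, with its single sharp transient riding on a quiet background, shows a crest factor well above 20dB, much like the conversation and hand clap described above.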

It is the peaks that contain much of the "jump", the "presence" or "attack" of instrumental sounds.  Peaks are also used by composers to emphasize certain sections of a musical composition.  Without the peaks, Haydn's "Surprise Symphony" wouldn't be much of a surprise.  Without the peaks, Elvin Jones' drums wouldn't carry the same emotional impact.


What's Loud Got To Do With It?

The world of recorded music is currently in the midst of so-called "loudness wars".  While this is primarily occurring in the pop music world, its effects have crossed over into other types of music as well.

In an earlier article called "What is mastering?" I mentioned that many mastering engineers today "compete" on the basis of how loud they make records.  A good number of record producers and A&R folks still seem to think we buy records because they are loud and not because we like the music.  (Go figure.)

The history of the loudness wars can be traced back to the 1970s when vinyl mastering engineers started elevating the levels at the start of each side.  This added to the initial impact of the sound as the record started to play.  With vinyl, the amount of playback time available on one side of a record is directly related to how loud the record is cut.  The louder the signal, the shorter the side.  Since cutting the entire side at the elevated level would result in the available space running out before the music ended, the levels were cheated back down to "normal" after the first 30 seconds or so had elapsed.

The advent of the Compact Disc meant recording time was no longer related to recorded levels, so engineers could turn it up and leave it that way for the duration of the disc.  Digital however brought its own limits to how loud the signal could be.  Unlike analog tape and disks, which reached their overload (and hence distortion) point gradually as the level increased, digital has a maximum that can't be exceeded without resulting in gross distortion.

Audio signals converted to digital are stored as ones and zeros.  The lowest level that can be represented in binary form (the "code" of digital storage) would be all zeros.  In the 16 bit CD format, this would be 0000000000000000 (16 zeros) and would signify complete silence.  A sound at an intermediate level would be represented by a digital "word" consisting of some combination of ones and zeros, depending on exactly how loud that sound is.  An example of such a digital word might be 0100100110110111.  The highest level would be 0111111111111111 (a zero followed by 15 ones).  This represents "full scale", also called 0dBFS (zero decibels, full scale).  That's it.  There aren't any twos in binary code so this is the loudest you can go.  (For technical reasons which are beyond the scope of this article, the loudest value is not a series of 16 ones.  Those who wish to delve further into this should look up "two's complement" as it relates to CD encoding.)
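
For readers who want to see these words for themselves, here is a small Python sketch that prints 16 bit sample values as binary words and converts a sample to a level in dBFS.  The helper names to_word and dbfs are hypothetical, and measuring dBFS relative to the largest positive value (32767) is an assumption chosen to match the description above.

```python
import math

FULL_SCALE = 2**15 - 1  # 32767, the largest positive 16 bit sample value

def to_word(sample):
    """Render a signed 16 bit sample as its 16 character two's complement word."""
    if not -2**15 <= sample <= FULL_SCALE:
        raise ValueError("sample does not fit in 16 bits")
    return format(sample & 0xFFFF, "016b")

def dbfs(sample):
    """Level of a single sample relative to positive full scale, in dBFS."""
    if sample == 0:
        return float("-inf")  # digital silence
    return 20 * math.log10(abs(sample) / FULL_SCALE)

print(to_word(0))           # silence: sixteen zeros
print(to_word(FULL_SCALE))  # full scale: a zero followed by fifteen ones
print(to_word(-1))          # sixteen ones is a tiny negative value, not the loudest
print(round(dbfs(int("0100100110110111", 2)), 2))  # the intermediate word above
```

The third line shows why a series of 16 ones is not the loudest value:  in two's complement the leading bit carries the sign, so that word represents -1, a sample just below silence.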

If we were to take the conversation and hand claps we talked about earlier and wanted to record them digitally without suffering any distortion, the hand claps (i.e. the peaks) might end up at zero on the digital meter (0dBFS) and the conversation, being say 20 decibels lower in level than the peaks, might end up at -20 on the digital meter.  Our average level, the conversation, would be -20, with our peaks, the hand claps, at 0.

How does one remain "competitive" and make louder records once their recording is already hitting the digital ceiling?  Our competitive engineer might want to make their record average at a level higher than -20.  Let's say they wanted to raise the level by 3dB, a very easily audible loudness increase.  If they merely raised the overall level by 3dB so the signal now averaged at -17, the peaks which are 20dB louder would now be 3dB over the 0dB distortion free maximum.  Since there is no way to digitally represent a signal that exceeds 0dB, the peaks would be "clipped", meaning the natural shape of the sound wave's peak would be cut off at the top and instead of looking like a mountain, would look more like one of the flat topped mesas in southern Utah.  Clipping in a music signal sounds quite harsh and "distorted", so our engineer has to find another way.  This is where the compression comes in.
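
The flat topped mesa is easy to reproduce.  The sketch below, in Python, raises a signal that already peaks at full scale by 3dB and hard clips anything beyond the ceiling.  It assumes samples are floating point values where 1.0 stands for 0dBFS; the function name apply_gain_db is hypothetical.

```python
import math

def apply_gain_db(samples, gain_db):
    """Scale samples by gain_db decibels, hard clipping at full scale (+/-1.0)."""
    gain = 10 ** (gain_db / 20)
    return [max(-1.0, min(1.0, s * gain)) for s in samples]

# One cycle of a sine wave that already peaks at full scale (0dBFS).
wave = [math.sin(2 * math.pi * n / 100) for n in range(100)]
louder = apply_gain_db(wave, 3.0)

# Count the samples now pinned to the ceiling: the flattened mountain tops.
flattened = sum(1 for s in louder if abs(s) == 1.0)
print(max(louder), flattened)
```

The peak value does not rise at all.  Instead, a large share of the samples that used to trace the rounded top of the wave now sit at exactly the same maximum value.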

To achieve their goal of raising the level on the disc by 3dB, the engineer in our example needs to compress the dynamic swings of the signal so the peaks aren't more than 17dB louder than the average.  That way, they'll end up at 0 on the digital meter and there won't be any clipping.  By using the tools available for dynamic compression, our engineer has made a CD that is 3dB louder than they could make if they left the original dynamics intact.  But loudness wars being what they are, soon everyone was compressing their signal by 3dB so they could raise the average loudness by that much.  To keep that competitive edge, our engineer might compress the peaks on their next project by 6dB, allowing a very large increase in level.  Now the peaks are only 14dB louder than the average signal.  Our original average level of -20 can be raised to -14 with no clipping of the peaks.  And when everyone else starts compressing the peaks by 6dB, how else can our engineer stay in business but to push the levels still higher?
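
The arithmetic in this example is simple enough to write down.  The following Python sketch assumes peaks must stay at or below 0dBFS and that the crest factor is the only thing limiting the average level; both function names are hypothetical.

```python
def max_average_dbfs(crest_factor_db):
    """Highest average level (dBFS) that keeps the peaks at or below 0dBFS."""
    return 0.0 - crest_factor_db

def peak_reduction_needed_db(crest_factor_db, target_average_dbfs):
    """Decibels of peak reduction needed to reach the target average without clipping."""
    return max(0.0, target_average_dbfs - max_average_dbfs(crest_factor_db))

# Music with 20dB peaks, mastered to three different average levels.
for target in (-20, -17, -14):
    print(target, peak_reduction_needed_db(20, target))
```

Every decibel added to the average level beyond what the natural crest factor allows has to be taken out of the peaks.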

The current "standard" for pop music has average levels closer to -10, maybe higher by the time you read this.  Those original 20dB peaks are reduced to 10dB peaks, one tenth their original power!...or less.  We're starting to see some records with dynamic ranges on the order of 6dB!
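
Because decibels are logarithmic, it helps to see what a 10dB loss means as a plain ratio.  This short Python sketch converts a decibel change into power and amplitude ratios using the standard definitions; the function names are hypothetical.

```python
def db_to_power_ratio(db):
    """Convert a decibel change to a ratio of power."""
    return 10 ** (db / 10)

def db_to_amplitude_ratio(db):
    """Convert a decibel change to a ratio of amplitude (voltage or sample value)."""
    return 10 ** (db / 20)

for change in (-3, -6, -10):
    print(change,
          round(db_to_power_ratio(change), 3),
          round(db_to_amplitude_ratio(change), 3))
```

A 10dB reduction leaves one tenth of the power, or a little under one third of the amplitude.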

Remember, when the Compact Disc was introduced, one of the promises was the potential for 96dB of dynamic range.  (The newer, high resolution 24 bit formats have a potential for 144dB of dynamic range.)  From the softest perceivable sound up to the threshold of pain, human hearing can encompass a dynamic range of 130dB.
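
The 96dB and 144dB figures follow from the word length:  each added bit doubles the number of available levels, which is worth about 6dB.  Here is a minimal Python sketch of that calculation, assuming plain linear PCM with no dither or noise shaping; the function name is hypothetical.

```python
import math

def dynamic_range_db(bits):
    """Theoretical dynamic range of linear PCM with the given word length."""
    return 20 * math.log10(2 ** bits)

for bits in (16, 24):
    print(bits, round(dynamic_range_db(bits), 1))
```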

What do these records sound like?  Well, they're loud.  Everyone notices that and most folks will reach for their volume control to turn them down.  These records also have a "stressed" quality about them that makes long term listening a fatiguing, if not a painful proposition.  To make matters worse, some engineers are taking it a step further and allowing some clipping on the final result, just to squeeze (and "squeeze" is exactly the right word here) a bit more loudness out of the record.


(Loudness) War Is Over (If You Want It)

The roots of the loudness wars most likely took hold when someone realized that a very small increase in level is perceived by most listeners as sounding "better".  And if a little is good, the thinking must have been, more will be better still.  Raising the recorded level misses the benefits of achieving the same loudness increase by turning up the volume control in playback.  (More on this in a moment when we talk about volume controls.)

Many of the folks who make loudness their goal hear the compressed version and find it "better" than the uncompressed version.  If they were to take the compressed version and carefully adjust the playback level to precisely match that of the uncompressed version, they might find that what impressed them was the increase in volume (i.e. quantity) but not necessarily an increase in quality.  In fact, the quality in a compressed recording tends to move in the opposite direction.
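
A level matched comparison of this kind can be approximated in software.  The Python sketch below scales one version so its average level equals that of the other before any listening is done.  It uses RMS as a crude stand-in for perceived loudness, which is an assumption; the function names and the synthetic test signals are hypothetical as well.

```python
import math

def rms(samples):
    """Root mean square (average) level of a sequence of samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))

def level_match(reference, candidate):
    """Scale candidate so its RMS level equals that of reference.

    Returns the scaled samples and the gain applied, in decibels.
    """
    ref_level, cand_level = rms(reference), rms(candidate)
    if ref_level == 0 or cand_level == 0:
        raise ValueError("cannot level match against silence")
    gain = ref_level / cand_level
    return [s * gain for s in candidate], 20 * math.log10(gain)

# Synthetic stand-ins: an "uncompressed" signal and a master about 6dB hotter.
uncompressed = [0.1 * math.sin(2 * math.pi * n / 100) for n in range(1000)]
louder_master = [s * 2 for s in uncompressed]

matched, gain_db = level_match(uncompressed, louder_master)
print(round(gain_db, 2))
```

Once the two versions play back at the same average level, any remaining preference is more likely to reflect quality than quantity.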

To paraphrase one of my musical heroes, the loudness war is over (if you want it).  And there are some good sonic and musical reasons to end it now.

From a sonic standpoint, we can start with the volume control and its effect on how a playback system sounds.  Electronically, a volume control is a type of resistor, placed in the signal path to allow us to (you guessed it) control the playback volume.  If we were to bypass the volume control, would the sound disappear?  Think of a water pipe that ends in a faucet you can use to control the amount or volume of water the pipe delivers to your kitchen sink.  If you were to remove the faucet, the flow of water, far from stopping, would come out of the pipe at full force.  Similarly, without a volume control, the playback level of your system would be full up and endanger your hearing as well as your loudspeakers.  Volume controls, like water faucets, are used to turn things down, not up.  This means in effect that there is more of the volume control in the circuit when the volume is turned down than there is when it is turned up.  Louder records make you adjust your volume to a lower setting than not so loud records to achieve the same in-room loudness.  Said another way, when the recorded level is not pushed, you turn up your volume control a bit more than you would for a typical compressed recording.

Anyone who has auditioned different volume controls will know these are devices that can have profound effects on the sound quality a system can deliver.  This means if we make two identical recordings that differ only in level, the lower one will result in better playback quality because by turning up the playback volume there will be less of the volume control in the circuit.  To play the louder recording at the same apparent level, we'd have to turn the volume control down, putting more of it in the signal path.  While this might be insignificant at differences of just a few decibels, when the differences approach 10dB or more, there are audible consequences.

There are other sonic reasons not to push recorded levels.  Chief among these is the fact that most circuits, both analog and digital (the latter, contrary to popular wisdom), provide audibly better performance when the levels are not pushed to the top.

Then there are the many musical reasons to end the loudness wars now.  Referring again to "What is mastering?", I mentioned there that in my experience, all the best sounding recordings I have heard have in common the fact that they are not loud.  Having loudness as a goal necessitates the sacrifice of dynamics.

While I can understand the use of compression as an effect on individual tracks of a multitrack (e.g. to get Ringo's cymbal sound), I don't at all like its use on whole mixes where it is generally used to achieve more loudness (some say "punch" but how do you increase punch by taking away dynamics, where the punch lives?).

Some say compression helps make the quieter parts of a record heard more easily over road noise when listening in the car.  Or that it makes for easier late night listening without disturbing the neighbors.  Music lovers have to ask why their records should be tailored for the lowest common denominator listening situations.  Why not have a "compress" button on the player, either in the car or at home and leave the record itself whole for those occasions when we want all of the music?

The loudness wars leave music as a casualty.  When we avoid compression used for the sake of loudness we gain innumerable musical rewards:  The dynamics of individual instruments help provide the rhythmic propulsion of the music, whether it is a violin concerto or a reggae beat.  The sense of relaxation that ensues is in high contrast to the stress response engendered by heavy compression, allowing for much deeper involvement in the music, greater ease in hearing all the musical parts and longer listening sessions.  The sense of recorded space, the acoustic of the recording venue, whether natural or studio generated, is much more in evidence, helping to bring the listener closer to the musical event.  Bass instruments maintain more of their pitch definition as well, not suffering the softening and defocusing they do when loudness takes precedence.  Electric guitars still have the "bite" they do when you're in their presence but which never makes it onto most records in quite the same way.  Horn sections in both jazz and symphonic music get to keep the amazing power they have in real life.  Instruments that play in the higher registers keep the harmonic sweetness they have in real life, without the hardening that accompanies compression.  The very air around the musicians is preserved and the breath of Life makes it all the way to the record.

There will always be those who actually like the sound that results from making loudness a priority in their record.  All well and good if this is their goal.  It should be understood however, that there is merit in making records that sound like real music as well.

Maybe it's time to get out those old "Play It Loud" labels that used to be on some of the older recordings from the days before the loudness wars.  Lip service will not end the loudness wars, action will.  It will take boldness on the part of musicians, producers and engineers.  It will take those willing to lead the trend instead of following it.  Some will take the time to actually make carefully level matched comparisons of their projects mastered for loudness vs. mastered for music.  Some will have the music and not the level meters determine the final recorded level.  They'll start to find their records actually "jump out" more when played on the radio, through the broadcasters' compressors, than the records that are pushed past the top.  They'll find themselves having to turn up their volume controls compared to the setting for the pushed records.  They'll also find their projects showing more punch, more kick, more space, more bass, more air, more ease, more music, more Life than those other records.

The end of the loudness wars will mark the dawn of the next golden age for music recording.