Have you ever heard people complaining about the diminished quality of streamed music as opposed to, say, CD, and wondered what they were talking about? Surely a piece of music is a piece of music and that’s that, right? Well, it’s actually a much more complicated matter than that.
When it comes down to it, most of us don’t really know all that much about the sound file that we’re listening to, but that ends here. After reading this ultimate guide to audio bitrates and formats, you’ll have a comprehensive understanding of how audio works in the modern world.
What’s more, your ears and mind will be more in tune with the true nature of your favorite music. Sound good? Awesome; let’s get started!
What the Heck is a Bitrate Anyway?
Put simply, the bitrate of a recorded sound refers to the amount of data used to represent each second of audio in the file it's playing from. It's usually measured in kilobits per second (kbps): every second of recorded sound is made up of a certain number of kilobits of data.
For example, if you're listening to, say, "Let's Go Crazy" by Prince, and it has a bitrate of 140kbps, each second of the song is made up of 140 kilobits of data. With a duration of 4 minutes and 40 seconds, the total volume of Let's Go Crazy would be 39,200 kilobits.
What does the number of kilobits per second in a song file mean in real-world terms? Well, it’s actually really simple. The more kilobits a second a song file contains, the better the quality of the audio output.
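If you want to play with the arithmetic yourself, here's a minimal Python sketch of the calculation above (the 140kbps figure is purely illustrative):

```python
def file_size_kilobits(bitrate_kbps: float, duration_seconds: float) -> float:
    """Total audio data: bitrate (kilobits per second) times duration (seconds)."""
    return bitrate_kbps * duration_seconds

# The "Let's Go Crazy" example from above: 4 minutes 40 seconds at 140kbps
duration = 4 * 60 + 40  # 280 seconds
print(file_size_kilobits(140, duration))  # 39200.0 kilobits
```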
The important thing to remember is that kilobits are units of audio data. Each bit of data is essentially a very small part of the song, an intricate detail.
So, the more details a piece of audio is made up of, the more robust and nuanced the sound will be. The bass frequencies will be rich and full-bodied, the mids nice and punchy, and the treble crisp and clear.
I find it’s easier to understand bitrates by thinking of them as pixels on a computer screen. The more pixels you have, the more detailed the picture is.
If you remove pixels, you end up with a Super Mario World sort of image. Take away even more, and we’re back to Frogger and Space Invaders.
Now we’ve covered the basics, let’s discuss how audio bitrates correspond with popular music file mediums.
CDs and Audio Bitrates
All CDs have a bitrate of 1411kbps, which is often expressed as 16 bit/44.1kHz or simply as "16 bit". The reason CDs all have the same quality audio is that the format was standardized by Philips and Sony in 1980.
CDs took the world by storm, leaving our wonderfully warbly tape collections gathering dust in dark corners and stuffy attics.
CDs’ worldwide takeover was primarily fueled by the fact they were a more efficient way of storing music, you could skip directly to certain songs, and imperfections didn’t create noisy sonic artifacts.
As we now know, the meteoric rise of the CD would soon come to an end, the reason being… internet-based, streamed audio.
WAV Audio Files – What Are They and What is Their Bitrate?
WAV, developed by Microsoft and IBM in 1991, is a container format that holds the same kind of uncompressed PCM audio found on CDs. Apple's AIFF format, which actually arrived a few years earlier in 1988, serves the same purpose on Mac systems.
WAV and AIFF files have a simple architecture, yet they’re uncompressed, meaning they’re a full-fat file — so to speak. They contain a lot of data and therefore produce high quality audio.
As they carry the same PCM audio found on CDs, most WAV files share that 1411kbps bitrate at 16 bit; however, there are some variations to speak of. The bitrate of a WAV file is calculated by multiplying the sample rate by the bit depth and the number of channels (mono, stereo, etc.).
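That formula is easy to sketch in Python; the function name is my own, but the arithmetic is the standard uncompressed PCM calculation:

```python
def wav_bitrate_kbps(sample_rate_hz: int, bit_depth: int, channels: int) -> float:
    """Uncompressed PCM bitrate: sample rate x bit depth x channel count."""
    return sample_rate_hz * bit_depth * channels / 1000  # kilobits per second

# The CD standard: 44.1kHz, 16 bit, stereo
print(wav_bitrate_kbps(44_100, 16, 2))  # 1411.2 kbps
```

Note that the exact figure is 1411.2kbps, which is usually rounded down to 1411kbps.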
MP3 Audio Files – What Are They and What is Their Bitrate?
The fundamental difference between the MP3 and WAV file formats is compression. Whereas WAV is a large uncompressed file, MP3 files are compressed, which means they’re much smaller.
Being that sound quality reduces in tandem with file size, you wouldn’t think MP3s would have caught on, but at the time, diminished audio quality for smaller file sizes was a trade well worth making.
Released in 1993, MP3s were born into a world dealing with infuriatingly slow internet speeds by today’s metrics. The smaller file size meant they could be shared efficiently.
In regard to the bitrate of MP3s, much like WAV, it's a variable figure, but it has a ceiling of 320kbps at 16 bit. The MP3 compression codec works by removing "insignificant" frequencies from the file without harming its fundamental contents. MP3s can be compressed down to rates as low as 32kbps, though anything much below 96kbps sounds noticeably degraded.
16 Bit vs 24 Bit – What’s All the Fuss About?
You’ll have noticed that “16 bit” has come up a lot so far, and that’s because the file types and mediums I’ve covered have all had an audio resolution of 16 bits, but you may also have heard of 24 bit music.
The audio resolution rating expresses how many discrete values the audio file can contain when quantizing an analog signal, and it’s a more the merrier situation, which is why 24 bit audio is often thought of as superior.
For example, 16 bit audio can represent 65,536 discrete quantization values, whereas 24 bit audio can represent 16,777,216 discrete quantization values. This means 24 bit audio contains more information about the original sound.
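The number of discrete values is simply two raised to the power of the bit depth, which you can verify in a line of Python:

```python
def quantization_levels(bit_depth: int) -> int:
    """Number of discrete amplitude values a given bit depth can encode."""
    return 2 ** bit_depth

print(quantization_levels(16))  # 65536
print(quantization_levels(24))  # 16777216
```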
Another important thing to understand about this bit-war is that the quantization process adds extraneous noise to the audio signal, aptly known as “quantization noise”.
You'd be forgiven for thinking that the more points of quantization there are in a signal, the louder and more present the build-up of unwanted noise will be, but it's actually the opposite.
The more quantization values recorded audio is made up of, the better the signal-to-noise ratio is.
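There's a well-known rule of thumb here: each extra bit of depth buys roughly 6dB of signal-to-noise ratio. A quick sketch of that formula:

```python
def theoretical_snr_db(bit_depth: int) -> float:
    """Approximate maximum SNR of N-bit quantization.
    Standard rule of thumb: ~6.02 dB per bit, plus a 1.76 dB constant."""
    return 6.02 * bit_depth + 1.76

print(round(theoretical_snr_db(16), 1))  # 98.1 dB
print(round(theoretical_snr_db(24), 1))  # 146.2 dB
```

This is why 24 bit audio has a dramatically lower noise floor than 16 bit, at least on paper.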
This is why old-timey radio recordings sound so staticky and indistinct, as if the speaker were talking through a polystyrene or tinfoil mask. If you were to record yourself speaking in, say, 4 bit, you'd sound like that too.
So, why aren’t all modern audio files 24 bit, right? Well, it makes for much larger file sizes for one thing, and how much better 24 bit audio actually is to the human ear is a hotly debated topic amongst audiophiles and studio engineers.
Most people listen to compressed music files these days, which reduces the detail of a song anyway, so many take a "what's the point?" attitude to 24 bit audio.
Now let's discuss what makes up the bitrate: sample rate and bit depth.
The sample rate of audio refers to how many individual snapshots, or samples, of the signal are captured each second. The more samples there are, the more detailed the sound.
You can think of it like the frames per second of a video camera.
If you have one that records 10 frames a second compared to one that films 24 frames per second, the footage taken by the second one is going to be much smoother, and that’s because there are fewer gaps in the recorded data. This is precisely the same concept when it comes to sample rate.
To understand how sample rates work, we need to talk about the Nyquist–Shannon sampling theorem.
The theorem states that to capture a signal accurately, you need to sample it at a rate at least double its highest frequency; do that, and the original signal can be faithfully reconstructed from the samples.
Being that the human ear tops out at roughly 20kHz, depending on the person, sampling at double that rate means very few or no audible frequencies will be lost when the sound is played back at a later date in a different location.
This is why CDs were set to 44.1kHz. It’s twice what we can hear, plus a little extra just in case.
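The arithmetic is simple enough to check:

```python
def minimum_sample_rate(max_frequency_hz: float) -> float:
    """Nyquist-Shannon: sample at least twice the highest frequency
    you want to capture."""
    return 2 * max_frequency_hz

# Human hearing tops out at roughly 20kHz
print(minimum_sample_rate(20_000))  # 40000.0 Hz; CD's 44.1kHz adds a safety margin
```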
However, people still push the boundaries in terms of sample rates when recording and producing music, often doubling the CD standard. Hi-resolution audio can even be recorded using sample rates as high as 192kHz.
The question is…is this really necessary? By operating so far beyond the realms of human sensory perception, isn’t it pointless?
We can use our pixel analogy from earlier to think about these sky-high sample rates. Computer screen resolutions are constantly improving, packing in ever larger numbers of ever smaller pixels for greater image acuity, but at a certain point, the human eye stops noticing the upgrades.
That said, there are certain instances when utilizing a high sample rate makes a lot of sense, mostly when recording audio in a studio. Take an analog to digital conversion, for example.
This could be you saying “bitrates are cool” into a microphone. Your voice is the analog signal, and the microphone transposes it into an electrical signal which is digitized on a computer.
This analog to digital conversion has a low pass filter baked in. Its job is to eliminate all frequencies beyond the Nyquist frequency (half the sample rate).
So, if we're recording you saying "bitrates are cool" at the CD standard of 16 bit/44.1kHz, any frequencies below 22.05kHz (half the sample rate) will be captured in the audio file. Anything above that point will be weeded out.
Now let's imagine we're taking a second recording of your dulcet tones exclaiming that "bitrates are cool", but we set the sample rate to 192kHz. The low pass filter now shifts its attention much further away from our hearing range.
As it’s working at the halfway point, which would be 96kHz, it leaves absolutely everything in our range untouched, which many claim leads to greater audio clarity. Technically, they’re correct, but most humans won’t be able to pick up on it.
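Halving the sample rate gives you the filter's cutoff point in each case:

```python
def nyquist_frequency(sample_rate_hz: float) -> float:
    """The highest frequency a given sample rate can represent:
    half the sample rate."""
    return sample_rate_hz / 2

print(nyquist_frequency(44_100))   # 22050.0 Hz, just above the audible range
print(nyquist_frequency(192_000))  # 96000.0 Hz, far beyond human hearing
```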
However, you don't have to pair 24 bit audio with a ridiculously high sample rate; bit depth and sample rate are independent settings. One of the most popular recording configurations is 48kHz at 24 bit. This provides a slightly larger buffer zone than the CD standard while also delivering the enhanced accuracy of 24 bit audio.
Bit depth is the other contributing factor to the overall bitrate of a piece of audio. Remember when I mentioned that sample rate is like the frames per second of a video camera? Well, the bit depth of a piece of audio is analogous to the resolution of a camera.
Bit depth is a measure of how precisely each sample captures the original source. If we think about a computer screen again (last time, I promise), watching a movie in 1080p isn't going to be as good as watching the same movie in 4K, as 4K has far more pixels.
What Bitrate Should You Use?
If you're trying to create an extremely hi-fidelity representation of the original sound source, bigger is better. Choose a high sample rate and supersize the bit depth. You'll need to squeeze as many kilobits per second into the file as possible.
There are, however, downsides to stuffing all this data into a single file. To determine which bitrate to settle on, you'll need to assess how much spare storage you have.
If your storage real estate is somewhat dwindling, you may want to keep the bitrate at a basic MP3 level, which would be 128kbps, amounting to roughly 1 MB per minute of the track.
Should you have a little more wiggle room within your storage, you may consider choosing a high quality MP3 rate of 320kbps, which equates to about 2.4 MB per minute.
But if storage space is no object, why not aim for the CD standard, which will cost you something to the tune of 10.6 MB per minute?
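If you want to estimate the storage cost of any bitrate, the conversion from kilobits per second to megabytes per minute is straightforward (8 bits per byte, 1,000 kilobits per megabit):

```python
def mb_per_minute(bitrate_kbps: float) -> float:
    """Convert a bitrate in kilobits per second to megabytes per minute."""
    return bitrate_kbps * 60 / 8 / 1000

print(round(mb_per_minute(128), 2))   # 0.96 MB -- basic MP3
print(round(mb_per_minute(320), 2))   # 2.4 MB -- high quality MP3
print(round(mb_per_minute(1411), 2))  # 10.58 MB -- CD standard
```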
You should also consider where the track is going to eventually end up when choosing a bitrate. Most streaming services need to compress files in order to ensure they play seamlessly with minimal latency.
As it must stream video as well as audio, YouTube is especially unkind to music. To ensure fluidity and kill off those pesky buffering dots, YouTube streams AAC audio files at a 128kbps rate.
A good rule of thumb is to try and make an audio file as detailed as possible because even though streaming platforms may compress it to smithereens, it will sound better than one that was overly compressed to begin with.
Even if you yourself need to reduce file size after you’ve created it, you can do so. Just remember that once you shrink your audio file down, in most cases, you cannot revert it to its full-fat size.
The lower the bitrate of an audio file is, the smaller the file ends up being, but the less sonic detail it will contain when listened to, no matter how good a sound system is. But how do the popular audio file formats compare directly on this size to quality ratio?
- MP3 (Basic) – 128kbps – ~1 MB per minute
- MP3 (Moderate) – 256kbps – ~2 MB per minute
- MP3 (High Quality) – 320kbps – ~2.5 MB per minute
- WAV (16 bit/44.1kHz) – 1411kbps – ~10.5 MB per minute
- FLAC (24 bit/96kHz) – up to 4608kbps – ~34.5 MB per minute
You may be wondering what exactly is stripped from these smaller files, and you can assess this by checking the audio cutoff of a file. For instance, a 320kbps MP3 with a 20kHz cutoff only loses detail just below 20kHz, whereas a 128kbps MP3 with a 15kHz cutoff loses sonic detail from roughly 15kHz upward.
Using Your Ears
The only real test of an audio file that matters is simply listening to it, but you’ll need to use some quality studio monitors or headphones to really hear the subtle elements of a song’s composition.
I did a ton of research when I was shopping around for studio monitors and I found that the Yamaha HS7 was the best value for money choice on the market, so I bought a pair, and they’ve been fantastic up to now.
Hi-Resolution Audio Explained
You’ll have heard this term being thrown around mainly in audiophile communities, but as recording technology advances, it’s sure to find common usage among general listeners too. Hell, even streaming services like Spotify have already adopted certain facets of hi-res audio.
So, what exactly is it? Is it just a buzzword, or is it the future of audio files?
In a nutshell, hi-resolution music is any audio file that exceeds the standards set by CDs. These files are considered to be the ultimate in hi-fidelity recordings, staying as true to the original sounds captured in the studio as possible.
Despite hi-resolution audio as a term only really breaking ground over the last few years, we’ve actually had the technology to record in hi-res since 1995. The problem was that the rest of the market hadn’t quite caught up.
We still had slow internet speeds and poor image quality on our computers. As the rest of the market has caught up over the last decade, there is now demand for audio to match.
Also known as lossless or HD audio files, hi-res song files have a much larger sample rate and bit depth, resulting in crisp, detailed music with minimal artifacts within the audible range. But can people actually tell the difference?
Studies show that there are very few people that can separate a CD standard audio file from an HD audio file in blind tests. For this reason, I wouldn’t worry too much about always using hi-res files, but, on the other hand, it’s comforting to know you’re hearing exactly what the composers of the song wanted you to hear.
Audio Formats – The Definitive List
As many audio file formats as there are, they can all be sorted into two broad categories: compressed and uncompressed.
Uncompressed Audio Files
Uncompressed audio files have all the trimmings. Large files practically brimming with sonic details, they take analog signals, digitize them, and that’s that. No further modifications take place. Below I’ve listed the most common uncompressed audio files.
- PCM – PCM stands for Pulse-Code Modulation. It's the standard encoding used for music on CDs and DVDs.
- WAV – If you're around 25 or above, you'll be familiar with WAV files. They're a digital container that wraps raw PCM audio so computers can read it. The bitrate of a WAV file depends on the PCM data inside it.
- AIFF – AIFF files are simply Apple's rendition of the WAV file. They wrap PCM audio in the same way, and much like WAVs, their bitrate is defined by the source data.
Compressed Audio Files
Compressed audio files can be split into two subcategories: lossless and lossy. Each has benefits and drawbacks depending on the intended application.
Lossless Audio Files
These file types are all about compressing data without stripping parts.
- FLAC – Free of licensing stipulations, FLAC files (Free Lossless Audio Codec) are thought of as the best way to compress audio files. They take the raw file and squeeze it down to 60% of the original size. FLAC can also accommodate small amounts of metadata such as album covers.
- ALAC – ALAC is Apple’s interpretation of FLAC files. They’re not quite as popular, but being that FLAC files aren’t compatible with iOS systems, they’re a necessary addition to the audio file catalog. You can convert FLAC files into ALAC files and vice versa via free online tools.
- WMA Lossless – Windows Media Audio Lossless files are Microsoft’s foray into the lossless compression game. Designed for archival operations, WMA uses a slightly different compression technique which allows for the file to be decompressed without any data loss or degradation. Now supported by a number of consumer gadgets and gizmos, WMA files are a great way to keep 24 bit audio manageable.
Lossy Audio Files
Lossy files do the same thing as lossless files, but they also strip away some original data in order to compress the file to an even greater degree. Here are the most common examples:
- MP3 – As you know, MP3s revolutionized the music industry and listening habits. The reason they're so small is that they discard any data beyond the range of human hearing, and once that's gone, they strip out some hard-to-hear audio too. With the data as light as possible, MP3 consolidates what's left into a tiny file.
Trading off some audio quality for a reduced file size made MP3s easy to send across the internet. In fact, you can blame a large majority of audio piracy on MP3’s digital portability.
While still a widely used format, rising internet speeds, and larger storage capacities mean it’s steadily declining in popularity.
- Ogg Vorbis – It may sound like some sort of alien warlord, but Ogg Vorbis is just another open-source file format. It’s a perfectly capable compression tool, but has never found the same traction as competitors. The Ogg Vorbis format is composed of two discrete sections. Vorbis refers to the compression mechanism, while Ogg is the digital container.
- AAC – Capable of delivering enhanced audio quality at the same bitrates, you can consider AAC files (Advanced Audio Coding) the natural successor to MP3’s internet throne. How does it do it? Well, for one, it can store 96kHz sample rate audio as opposed to MP3’s max rate of 48kHz. AAC can also support 48 channels, while MP3 peaks at 2. It’s for these reasons that AAC has been adopted by huge corporations such as Nintendo, YouTube, and Apple.
Honey, I Shrunk the Audio File – How Does Audio Compression Work?
It’s all well and good us yammering on about compressed audio files, but we should also discuss how it actually works.
Picture the last time you packed for holiday and your suitcase wouldn’t zip up. You sit on it, jump on it, squeeze it using hydraulic tools, and eventually, it closes. Compressing lossless audio files follows the same principles, albeit with less swearing involved.
In this analogy, lossy compression is the equivalent of accepting that the suitcase isn’t big enough for the number of things we want to take on holiday.
We then remove a pair of sandals, that obnoxiously loud shirt our partner bought us for our birthday, and a pair of shorts. Now the suitcase closes in a flash, and it’s easier to carry too.
Compression is done to keep file size to a minimum, freeing up storage space and rendering them easy to transport via download, upload, email…you name it. This shrinking technique was especially handy in the early days of the internet, as a CD file could take hours, even days to download.
You may be wondering how a lossy compression mechanism knows what data to omit and what to keep.
The answer is perceptual noise shaping. PNS is essentially an algorithm derived from psychoacoustics, the study of human hearing.
While some humans can hear frequencies up to around 20kHz, the practical upper limit for most of us sits around the 16kHz mark, with our most sensitive range falling between 1-5kHz.
In fact, even those of us that can hear exceptionally high and low frequencies only experience them on the very periphery of perception. They may even occur to the listener as feelings rather than noises.
Compression tools are fed this information, and consequently, know exactly where to start when it’s time to cut some dead weight, but the audition threshold isn’t the only data they work with.
Temporal Masking is the next port of call. It refers to how the perception of one sound can be affected by the presence of another close to it in time. For instance, if two sounds ring out in quick succession, we tend to register whichever is loudest, disregarding the less dominant noise.
As it’s far less likely our ears will perceive these buried sounds, a compression mechanism brandishes its digital scissors and snips away at them to reduce file size.
The last piece of the compression puzzle is known as Simultaneous Masking. Our hearing divides the frequency spectrum into bands, and Simultaneous Masking occurs when a loud spike in one band bleeds into its neighbors, drowning out the softer presence of nearby frequencies.
Compression mechanisms locate these areas of audio bleed and clean things up.
Compression can shrink an audio file to as little as a tenth of its original size, but it doesn't come for free, especially when the source file had a low bitrate to begin with. These algorithms tend to be more aggressive when working with lower quality audio around the 128kbps zone, as unnecessary details are already absent. The problem we face is that a lot of streaming services still use 128kbps files.
It can lead to low frequencies sounding flat and thin – perhaps even flappy – and higher frequencies losing clarity and the crispness we all adore in our music. For example, a closed hi-hat or snare might be buried under some aggressive gang vocals.
Audio File Formats – Which is the Best?
Which is the best file format for you depends on your wants and needs, but if I had to pick one and say it’s objectively the best, I’d go with FLAC. It’s open-source, so it’s free, it’s compressed, so you can kiss goodbye to storage-munching monster files, and it’s lossless, so you hear absolutely everything you’re supposed to hear.
That’s not to say lossy formats like MP3 and AAC are without merit. Their cut-throat approach to compression makes music easier to share, which is one of the best things you can do with music.
I’d be remiss if I didn’t also mention that for the audiophiles out there, there is no substitute for HD WAV or AIFF files. Granted, how much detail we can perceive is debatable, but these files still present music the way it was intended to be heard.
Audio File Formats – The Final Showdown
MP3 vs CD
CDs are still considered the standard for high audio quality, but they use uncompressed files. MP3s are much smaller, but offer diminished quality, and with internet speeds rising all the time, they’re likely to go the way of the humble cassette tape. Verdict – CD
MP3 vs FLAC
FLAC can halve the file size of CD quality music without damaging audio quality. MP3s are slightly smaller, but they're detrimental to audio quality. Verdict – FLAC
MP3 vs AAC
Able to encode much higher frequencies than MP3, AAC is technically the better format, but as it’s not quite as ubiquitous yet, not as many devices are compatible with it. Verdict – MP3 (For Now)
We’ve come a long way in terms of our listening habits and music technology, but we’ve by no means reached the zenith of our exploration in this area.
Not too long ago, CDs were considered the be-all-end-all of formats. No one could have even imagined an MP3 let alone how they could affect music listening around the globe.
With formats like FLAC and AAC, we’ve done a great job refining audio files, but there’s no doubt in my mind that the best is yet to come.