OK, I'm back!
All "quotes" below are paraphrases of points other users have made in this thread.
I've paraphrased so as not to blatantly call anyone out for being "wrong." Hopefully by doing this we can keep it impersonal and keep the focus of the conversation on audio. Multiple users' comments are represented below anyway, so few are innocent.
Cassettes reproduce up to 15kHz, and that's terribly inadequate
Yes, content above 15k can be meaningful for "air" and such, but bandwidth to 15k would really be "not so bad" all things considered. The inner grooves of an LP record typically have less top end extension than that, and nobody seems to complain when they talk about how 'amazing' their vinyl sounds. But all that aside-- bandwidth to 15k is incredibly optimistic for cassette, and requires everything to be functioning in tip-top shape on its best day. The reality is usually far
The more size you have for an analog medium-- the wider the tape tracks or the wider the spacing of the grooves on an analog disc-- the better the sound
This is a bit misleading. A wider tape track allows better performance chiefly in one area, and that's signal-to-noise ratio. More space on the tape = more magnetic domains, which means you can print a hotter level, which gets you further above the (fixed) noise floor of the tape. It doesn't really affect bandwidth or so-called "resolution" (which I've never seen a clear, empirical definition of).
Recording to tape always causes compression
This is another myth. If you print at a hot level, you reach full saturation-- all magnetic domains at a given point are aligned either positive or negative, and you can't print any hotter. This usually manifests as good-old-fashioned clipping, but if pressed, I could concede that it sometimes acts as a sort of peak limiting. If you're just below that point, I could imagine that a bit of saturation could take place in advance of full clipping as the pool of randomly-oriented magnetic domains starts to run low, but in all honesty, I think the narrative of "tape compression" is vastly overstated.
In any case, if you print at a relatively low level, magnetic tape doesn't meaningfully compress at all. If you're using Dolby A or SR, you can print at a level low enough that saturation isn't really an issue. Tape compression is a thing, I guess... it does exist under certain circumstances, but its attribution as some magical "always on" thing that is synonymous with "tape sound" is largely internet folklore.
If you want the tape sound, just bounce to tape after you're done, or track to tape and dump into the DAW
Will it sound different if you do that? Sure. Will you capture some "tape flavor?" I concede. Will you retain all of the disadvantages of tape (noise floor, head bump, etc) and therefore hear some "tape flavor?" Yes.
Is that the same as recording and staying in the analog domain? Absolutely not. And the better your tape machines are, the better they're aligned, and the better your analog recording practices... the more the output will sound like input (read: the less "tape flavor" you will get), anyway.
'CD Quality' is obsolete and was never good.
16/44.1k isn't obsolete, and it's still perfectly fine as a delivery medium for a two-track stereo mix that a consumer will listen to. Early CDs were bad, but we've come a long way since the 1980s.
16 bit 44.1k is very poor as a multitrack medium, where you'll be turning digital "faders" up and down, truncating and combining multiple tracks (more on that later). But as a two-track delivery medium? It's excellent when done well. And it's still the basis for many/most delivery media even as CDs slip into complete obsolescence.
CD quality is scientifically-proven as all a human can hear, and therefore it's totally adequate for multitracking
No such unambiguous scientific "proof" exists or could exist (let's get that out of the way first). But 16 bit 44.1k, when properly implemented and well-dithered, can achieve excellent performance.
But excellent performance is not inherent in the format or there for the taking--you have to work for it.
16 bits @ 44.1k can be manipulated with some funny math to perform far better than its raw specs would indicate. But even when all of that is done well, the DAC has to be well-implemented or its low-ish sampling rate could produce artifacts of one form or another in the audio band.
It's a bit deep to get into here, but briefly-- there's a lowpass filter below the Nyquist frequency (1/2 sampling rate, or 22.05kHz in this case). This filter is where most of the real-world issues with this sampling rate originate. If the filter isn't steep enough, you can get aliasing distortion (which sounds pretty ugly). If it acts on frequencies that are too low, you can get rolloff in the top of the audio band. If it's as steep as it needs to be, you can get problematic phase shift in the top of the band, enough to be audible, etc etc etc. Basically, implementation is everything.
And that's just for the two-track.
When multitracking with "CD quality" audio, you have a new problem: the noise floor at 16 bits is fixed, and is (theoretically) 48dB worse than 24 bit. That's an astronomical difference. For a two-track master, this is largely irrelevant; 96dB of dynamic range (dithered to 120 or so) is practically plenty. But if you're combining several tracks, this extra noise floor can add up and be perceptible. If you're manipulating level in the digital domain (i.e. moving a virtual fader), the problems can rapidly compound.
This applies pressure to print at hotter levels to "beat the noise floor" just as you might on tape. However, clipping in the digital domain is WAY more obvious (and subjectively worse-sounding) than saturating tape. With 24 bit, you can print at much more conservative levels, and with 144dB of theoretical dynamic range, noise floor essentially never becomes an issue, even with low print levels, low fader levels, and many many combined tracks.
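To put rough numbers on "noise is cumulative," here's a small sketch. It assumes every track carries uncorrelated noise at the same level and sits at unity gain, so the noise powers simply add. The function name and the track counts are mine, picked for illustration; the -96 and -144 figures are the theoretical floors quoted above.

```python
import math

def summed_noise_floor_db(per_track_floor_db, track_count):
    """Noise floor after summing tracks of equal, uncorrelated noise.

    Assumption: powers add, so the floor rises by 10*log10(track_count) dB.
    """
    return per_track_floor_db + 10.0 * math.log10(track_count)

# Theoretical per-track floors relative to full scale (figures from the post)
for label, floor_db in (("16 bit", -96.0), ("24 bit", -144.0)):
    for tracks in (1, 8, 24, 48):
        total = summed_noise_floor_db(floor_db, tracks)
        print(f"{label}, {tracks:2d} tracks: noise floor about {total:.1f} dBFS")
```

Under those assumptions, 24 tracks cost you roughly 14dB of noise floor either way. At 24 bit you still have an enormous margin left over; at 16 bit you have noticeably less, and that is before any fader moves.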
22.05 kHz is inadequate because musical instruments produce content above that band, even though humans can't hear it
No. This is a red herring. The shortcomings of 44.1k as a sample rate don't stem from its 22.05kHz bandwidth so much as they stem from an impractically-tight transition band for the Nyquist filter.
To avoid aliasing distortion, all frequencies above 22.05kHz must be attenuated infinitely, ideally without disturbing the frequencies below 20kHz.
That's a VERY steep filter, and is very hard to design in the real world without some tradeoffs. At 96kHz, your transition band becomes very wide-- Nyquist is 48kHz, which gives you a full 28kHz (over an octave at this frequency) to transition from "full up" to "full off." That means the filter slope can be shallower (causes less phase shift), and later (less likely to disturb in-band signal). This allows fewer design tradeoffs to be made. This--NOT irrelevant ultrasonic content--is the advantage of higher sampling rates.
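The transition band arithmetic is easy to check for yourself. This is only a sketch: the 20kHz passband edge is the figure used above, and the function name is made up for the example.

```python
import math

def transition_band(sample_rate_hz, passband_edge_hz=20_000.0):
    """Return (nyquist_hz, width_hz, width_octaves) for the anti-alias filter.

    The filter has to go from "full up" at the passband edge
    to "full off" at the Nyquist frequency (half the sample rate).
    """
    nyquist = sample_rate_hz / 2.0
    width_hz = nyquist - passband_edge_hz
    width_octaves = math.log2(nyquist / passband_edge_hz)
    return nyquist, width_hz, width_octaves

for rate in (44_100, 96_000):
    nyquist, width_hz, width_oct = transition_band(rate)
    print(f"{rate} Hz: Nyquist {nyquist:.0f} Hz, "
          f"transition band {width_hz:.0f} Hz ({width_oct:.2f} octaves)")
```

At 44.1k the filter gets about 2kHz, a small fraction of an octave, to do its whole job. At 96k it gets 28kHz, well over an octave.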
Nobody liked CDs when they came out because they've never been very good
Nah. CDs got a bad rap when they came out because implementation was abysmal. A properly-dithered 16 bit signal can give you >120dB of usable dynamic range. This is more than enough to resolve any subtleties in the quietest reverb tails you can imagine. It can get you well above the threshold of pain and far below the sound of your own blood pumping in your ears in an anechoic chamber.
A well-implemented Nyquist filter at 44.1k can also perform very well indeed, now, compared to when the technology was in its infancy four decades ago.
[picture of stair-stepped audio superimposed on a sine wave, as though to imply that the coarseness of these stair-steps determine resolution]
This isn't how digital audio works. It's a misleading graphic, and has led to untold levels of misunderstanding. It's not anyone's fault... it's just a bad 'internet narrative' that's facile and oversimplified. The Nyquist filter I was discussing above makes the stair-steps go away.
It seems counterintuitive, but it's true-- there are no "stair-steps" or "slices of time" in the output of a digital audio device. The complete, analog waveform is reconstructed from the sampling data, in all of its perfectly-smooth, continuous, analog glory. The samples are not the signal. The samples are data points that allow the complete analog waveform to be "drawn" in a fashion that's at least theoretically-identical to what it was on input.
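If you'd rather see that than take my word for it, here is a toy sketch of the textbook reconstruction (Whittaker-Shannon, or "sinc," interpolation). It is not how any particular DAC is built; real converters approximate this with oversampling and practical filters. The 1kHz test tone, the window size, and all of the names are assumptions chosen for the demo.

```python
import math

SAMPLE_RATE = 44_100.0   # Hz
TONE_HZ = 1_000.0        # test tone, well below Nyquist
HALF_WINDOW = 500        # samples on each side used in the finite sum

def sinc(x):
    """Normalized sinc: sin(pi*x) / (pi*x), with sinc(0) = 1."""
    return 1.0 if x == 0.0 else math.sin(math.pi * x) / (math.pi * x)

def tone(t_seconds):
    """The original continuous signal."""
    return math.sin(2.0 * math.pi * TONE_HZ * t_seconds)

# The stored data: one value per sample period, nothing in between
samples = {n: tone(n / SAMPLE_RATE) for n in range(-HALF_WINDOW, HALF_WINDOW + 1)}

def reconstruct(t_seconds):
    """Rebuild the waveform at any instant from the samples alone.

    The ideal sum is infinite; this one is truncated to a finite window,
    so a small residual error is expected.
    """
    position = t_seconds * SAMPLE_RATE  # time measured in sample periods
    return sum(value * sinc(position - n) for n, value in samples.items())

# Evaluate halfway BETWEEN samples, where a "stair-step" would be most wrong
worst_error = max(
    abs(reconstruct((k + 0.5) / SAMPLE_RATE) - tone((k + 0.5) / SAMPLE_RATE))
    for k in range(-5, 5)
)
print(f"worst error between samples: {worst_error:.6f}")
```

The reconstructed values between the samples land on the original sine to within the small error left by cutting the sum short. No stair-steps anywhere.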
The discrepancies that make people hate poorly-implemented digital ("coldness" "brittleness" "harshness" etc) are a result of sub-optimal implementation. Very often, ironically, the "thin and brittle" characteristic of cheap digital audio can be traced to poor-sounding analog stages within the devices!
We can't hear dithering, because our brains engage in a process called aliasing that does the work for us
I think there are some problems with terminology here.
Dithering absolutely can be perceived versus truncating a 16 bit digital file. Proper noise-shaped dither expands 16 bit's dynamic range by as much as 24dB. This means 24dB of recoverable signal that would otherwise be below the noise floor of 16 bit audio. It might not be the most obvious thing in the world on a Discman in your car at 60mph, but if you're sitting down and listening for pleasure, on some program material you may perceive more detail as a result of this recovered information (particularly in terms of ambience, space, localization, reverb tails, etc).
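Here's a toy demonstration of signal being recovered from below the last bit. To be clear about what it is and isn't: it uses plain TPDF dither (no noise shaping), it rounds to the nearest step rather than truncating, and it works in units of one quantization step (1 LSB) instead of real 16 bit sample values. The tone level, lengths, and names are all picked for illustration.

```python
import math
import random

random.seed(0)  # fixed seed so the run is repeatable

def quantize(value_in_lsb):
    """Round to the nearest whole quantization step (1 LSB)."""
    return math.floor(value_in_lsb + 0.5)

def tpdf_dither():
    """Triangular-PDF noise spanning +/-1 LSB (sum of two uniform draws)."""
    return random.uniform(-0.5, 0.5) + random.uniform(-0.5, 0.5)

N = 20_000
PERIOD = 100
# A sine whose peak is only 0.4 LSB: smaller than one quantization step
signal = [0.4 * math.sin(2 * math.pi * n / PERIOD) for n in range(N)]

plain = [quantize(s) for s in signal]
dithered = [quantize(s + tpdf_dither()) for s in signal]

def correlate(output):
    """Project the output onto the original sine.

    About 0.4 means the tone survived at full level; 0 means it is gone.
    """
    return 2.0 * sum(o * math.sin(2 * math.pi * n / PERIOD)
                     for n, o in enumerate(output)) / N

print("without dither:", correlate(plain))
print("with dither:   ", round(correlate(dithered), 3))
```

Without dither, every sample quantizes to zero and the tone simply vanishes. With dither, the tone is still in there at its original level, sitting under a bed of benign noise.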
Aliasing is not a process that our brains do. It's a form of distortion that occurs in a digital electronic device when there is insufficient low-pass filtering at the Nyquist frequency (1/2 the sample rate). It is, for all intents and purposes, unrelated to dither and therefore has no relevance in this context.
What aliasing actually is: when a PCM digital system is presented with frequencies greater than 1/2 its sampling rate, it produces an insufficient number of data points to reproduce those frequencies. This "tricks" the system into thinking that very high frequencies are actually lower frequencies, and it reproduces those "phantom" lower frequencies instead. This gives you sizzly artifacts that are correlated with signal and sound absolutely awful. If you've heard the sound, you will never forget it.
It's important to note that aliasing is a function of sample rate, and is completely unrelated to bit depth or dither.
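A quick sketch of where those "phantom" frequencies land, assuming no anti-alias filter at all in front of the sampler. The function name and the 25kHz example tone are mine.

```python
import math

def alias_frequency(input_hz, sample_rate_hz):
    """Frequency an unfiltered tone appears at after sampling."""
    folded = input_hz % sample_rate_hz
    nyquist = sample_rate_hz / 2.0
    # Anything above Nyquist "folds" back down into the audio band
    return sample_rate_hz - folded if folded > nyquist else folded

RATE = 44_100.0
for tone_hz in (10_000.0, 25_000.0, 30_000.0):
    print(f"{tone_hz:.0f} Hz in -> {alias_frequency(tone_hz, RATE):.0f} Hz out")

# The sampler genuinely cannot tell the two apart: a 25 kHz cosine and its
# 19.1 kHz alias produce the same sample values at 44.1k.
max_difference = max(
    abs(math.cos(2 * math.pi * 25_000.0 * n / RATE)
        - math.cos(2 * math.pi * alias_frequency(25_000.0, RATE) * n / RATE))
    for n in range(100)
)
print(f"largest difference between the two sample sets: {max_difference:.2e}")
```

Notice that bit depth appears nowhere in this calculation. The fold-down depends only on the input frequency and the sample rate.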
Bit depth above 16 is unnecessary, because why capture sounds our ears can't hear?
Frequency response/bandwidth has nothing to do with the bit depth--we should begin there. Bit depth determines noise floor-- nothing more.
24 bit has a theoretical dynamic range of 144dB (that is, the difference between the noise floor and the loudest sound it can produce undistorted). 16 bit offers 96dB of dynamic range. 12 bit gives you 72dB, which is still quite usable in some applications (believe it or not, this is about the same as 2" 24 track running at 15 IPS!)
It seems crazy based on all the nonsense you will read on the internet about digital audio... but even 8 bit PCM digital, with its 48dB dynamic range, is still good enough for some applications!
But let's not get it twisted-- the frequency response of all of the above will be identical as long as the sample rate for each is the same. If the dynamic range of the material were to be sufficiently narrow, you would not even hear the difference between 8 and 24 bit, all else being equal.
BUT-- 144dB of theoretical dynamic range is nevertheless MUCH better than any of the alternatives, especially when combining multiple tracks (where noise is cumulative), and for real-world signals that have sounds which decay into silence.
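For anyone who wants to check the dynamic range figures, they come from the usual rule of thumb of roughly 6dB per bit. A minimal sketch of the theoretical (undithered) number, with a function name I made up:

```python
import math

def dynamic_range_db(bits):
    """Theoretical dynamic range of linear PCM: 20*log10(2**bits)."""
    return 20.0 * math.log10(2 ** bits)

for bits in (8, 12, 16, 24):
    print(f"{bits:2d} bit: about {dynamic_range_db(bits):.1f} dB")
```

Sample rate appears nowhere in that formula, which is the whole point: bit depth sets the noise floor and nothing else.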
That's enough for now. Nobody has even read this far anyway. And if you have, again I'll stress that I don't want to "call out" any member or members here for spreading bad information... but I likewise thought it would be a bummer to leave it out there unaddressed.