Computer Science
CS109 Lecture 4: Digital Audio
§ Basics of sound waves
§ Digital sound
§ Digitizing sound: Sampling and quantizing
§ Consequences of choice of sample rate and bit depth
§ CD quality sound vs MP3
No class next Monday, no discussions, no HW next week.
Hearing
We “hear” sound when a series of air compressions vibrates a membrane in our ear. The inner ear sends signals to our brain. The rate of this vibration is measured in Hertz, and the human ear can hear sounds in the range of roughly 20 Hz to 20 kHz.
Sound Wave Properties
Wavelength: distance between waves (affects pitch -- high or low sounds)
Amplitude: strength/height of waves (volume)
Frequency: the number of times a wave peak occurs in a second.
Frequency = Speed of sound / Wavelength
E.g.: 340 meters per second / 0.77 meters ≈ 440 Hz = note “A”
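The relationship between frequency, wavelength, and the speed of sound can be checked with a few lines of Python. This is only a sketch: the function names are made up for illustration, and 340 meters per second is the rounded speed of sound used in the example above.

```python
SPEED_OF_SOUND_M_PER_S = 340.0  # rounded value used in the example above


def frequency_hz(wavelength_m, speed_m_per_s=SPEED_OF_SOUND_M_PER_S):
    """Frequency = speed of sound / wavelength."""
    return speed_m_per_s / wavelength_m


def wavelength_m(frequency, speed_m_per_s=SPEED_OF_SOUND_M_PER_S):
    """Wavelength = speed of sound / frequency (the same formula rearranged)."""
    return speed_m_per_s / frequency


print(round(frequency_hz(0.77), 1))   # 441.6, i.e. roughly the 440 Hz note "A"
print(round(wavelength_m(440.0), 3))  # 0.773 meters
```

A shorter wavelength gives a higher frequency, which we hear as a higher pitch.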
Microphones and Speakers
Microphones convert acoustical energy (sound waves) into electrical energy (the audio signal).
Speakers do the same thing in reverse: convert electrical energy into acoustical energy.
Audio Playback
A stereo sends an electrical signal to a speaker to produce sound.
The voltage in the signal varies in direct proportion to the sound wave: it is analog.
Digitizing a sound/electrical wave means converting it into a series of bits: Again, this involves sampling and quantizing.
11001010100101010101010
Important Note about Electronic Signals
An analog signal continually fluctuates in voltage up and down.
A digital signal is a series of numbers, with discrete amplitudes.
0 120 58 328 ....
Recall: Digitizing an Image
Sampling: Taking measurements (of color) at discrete locations within the image. Resolution: 16 samples per inch (in each direction)
Recall: Digitizing an Image
Quantizing: Measure the RGB color for each pixel, and record the 8+8+8 = 24 bit number for that sample. 2^24 = 16,777,216 possible colors per pixel (in TrueColor)
Digitizing Audio
Sampling: Decide on a sample rate (how many times a second to measure the wave); Example: CD quality sound is sampled at 44,100 Hertz = 44.1 kHz.
Choosing a Sampling Rate
Consider this waveform. What sampling rate should we choose?
Choosing a Sampling Rate
Consider this waveform, and these two sampling strategies. What’s going on here?
A. Samples: 100 -100 100 ....
B. Samples: 0 0 0 0
(Figure: the waveform, with the amplitude axis marked 100, 0, -100.)
Nyquist Sampling Theorem
The Nyquist Sampling Theorem states that the sampling rate must be greater than twice the value of the highest frequency component of the analog signal.
Since humans can hear up to about 20 kHz, this means that you have to sample at more than 40 kHz to capture “all” the sound humans perceive; hence the 44.1 kHz sample rate for CD quality sound.
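To see why the rate must be strictly greater than twice the highest frequency, here is a small sketch that samples a wave at exactly twice its frequency. The function name, the 1 Hz test wave, and the amplitude of 100 are assumptions chosen for illustration, not values from the lecture. Depending on where the samples happen to fall, we capture either only the peaks or only the zero crossings.

```python
import math


def sample_wave(frequency_hz, sample_rate_hz, num_samples,
                phase_radians=0.0, amplitude=100):
    """Measure a cosine wave at evenly spaced times, rounded to integers."""
    samples = []
    for n in range(num_samples):
        t = n / sample_rate_hz  # time of the n-th sample, in seconds
        value = amplitude * math.cos(2 * math.pi * frequency_hz * t + phase_radians)
        samples.append(round(value))
    return samples


# A 1 Hz wave sampled at exactly 2 samples per second (twice its frequency).
on_the_peaks = sample_wave(1, 2, 4)
on_the_zero_crossings = sample_wave(1, 2, 4, phase_radians=-math.pi / 2)

# The same wave sampled at 8 samples per second, well above twice its frequency.
well_sampled = sample_wave(1, 8, 8)

print(on_the_peaks)           # [100, -100, 100, -100]
print(on_the_zero_crossings)  # [0, 0, 0, 0]
print(well_sampled)           # [100, 71, 0, -71, -100, -71, 0, 71]
```

With only two samples per cycle, the second strategy records silence even though the wave is present. At eight samples per cycle the shape of the wave is clearly visible in the numbers.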
Digitizing Audio
Sampling: Decide on a sample rate (how many times a second to measure the wave); Example: CD quality sound is measured 44,100 times a second (44.1 kilohertz).
Decide on a “bit depth” (how many bits for each measurement); Example: CD quality sound is 16 bits, giving 2^16 = 65,536 possible values.
Digitizing Audio
Quantizing: Measure the signal at the sample rate, representing the analog sound level with a number (must round the analog value to an integer).
Each of these decisions, as with images, affects how well the digital information approximates the “perfect” analog signal.
Let’s try an example on the board......
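A minimal sketch of sampling and quantizing in Python follows. It assumes the analog level is a number between -1.0 and 1.0 and maps it symmetrically onto signed integers; the function names and this particular scaling are illustrative choices, not the lecture's method.

```python
import math


def quantize(value, bit_depth):
    """Round an analog level in [-1.0, 1.0] to the nearest signed integer level.

    Assumption: symmetric scaling, so 16 bits uses -32767..32767
    (one of the 65,536 possible codes goes unused).
    """
    max_level = 2 ** (bit_depth - 1) - 1
    clipped = max(-1.0, min(1.0, value))  # levels outside the range are clipped
    return round(clipped * max_level)


def digitize(frequency_hz, sample_rate_hz, bit_depth, num_samples):
    """Sample a sine wave at the sample rate and quantize each measurement."""
    return [
        quantize(math.sin(2 * math.pi * frequency_hz * n / sample_rate_hz), bit_depth)
        for n in range(num_samples)
    ]


print(2 ** 16)                        # 65536 possible values at 16 bits
print(quantize(1.0, 16))              # 32767
print(quantize(0.25, 8))              # 32
print(digitize(440, 44_100, 16, 5))   # first five CD-style samples of a 440 Hz tone
```

A lower bit depth means fewer available levels, so each rounded value can be further from the true analog level.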
CD-Quality Audio
Compact Disc audio is encoded by sampling:
§ 44,100 samples per second
§ 16 bits per sample per channel (2 channels)
§ thus: 44,100 * 16 * 2 = 1,411,200 bits per sec
§ Or about 10,600,000 bytes per minute
CD Audio uses about 10 megabytes per minute of audio. At 700 MB, a CD thus holds about 70 minutes of music in uncompressed form. At this rate, you would get about 1600 minutes = 26.7 hours of audio on a 16 GB iPhone 6. But this is without applying various compression algorithms.
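The arithmetic above can be reproduced directly. This sketch assumes 700 MB means 700,000,000 bytes and 16 GB means 16,000,000,000 bytes; the variable names are illustrative. Using the exact byte rate rather than the rounded 10 megabytes per minute gives slightly smaller totals than the round figures quoted.

```python
SAMPLES_PER_SECOND = 44_100
BITS_PER_SAMPLE = 16
CHANNELS = 2

bits_per_second = SAMPLES_PER_SECOND * BITS_PER_SAMPLE * CHANNELS
bytes_per_minute = bits_per_second // 8 * 60

CD_CAPACITY_BYTES = 700_000_000        # assumption: 700 MB, decimal megabytes
PHONE_CAPACITY_BYTES = 16_000_000_000  # assumption: 16 GB, decimal gigabytes

minutes_on_cd = CD_CAPACITY_BYTES / bytes_per_minute
hours_on_phone = PHONE_CAPACITY_BYTES / bytes_per_minute / 60

print(bits_per_second)           # 1411200
print(bytes_per_minute)          # 10584000
print(round(minutes_on_cd, 1))   # 66.1
print(round(hours_on_phone, 1))  # 25.2
```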
Digital Audio Formats
Audio Formats
§ CDA, WAV, AU, AIFF, VQF, and MP3
MP3 (MPEG-1/MPEG-2 Audio Layer III) is most popular
§ Music signal is “simplified” using psychoacoustic principles
§ The sequence of bits is then compressed using lossless coding algorithms (such as Huffman coding).
MP3 Encoding Principles
§ Break file into small “frames” of a fraction of a second each;
§ Analyze each frame in terms of frequencies present;
§ Eliminate frequencies which would be masked anyway;
§ Recalculate the samples; and
§ Record the new signal.
§ Sometimes additional compression algorithms are applied.
Encoding music files in MP3 is a “lossy” process; you lose some information, so you could not recreate the original music file from the MP3 version!
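The steps above can be imitated with a toy example. This is not how a real MP3 encoder works: a plain discrete Fourier transform stands in for the encoder's frequency analysis, and a crude loudness threshold stands in for a real psychoacoustic masking model. The function names, the frame size of 32, and the 5% threshold are all assumptions made for illustration.

```python
import cmath
import math


def split_into_frames(samples, frame_size):
    """Break the sample list into consecutive frames of frame_size samples."""
    return [samples[i:i + frame_size] for i in range(0, len(samples), frame_size)]


def frequency_components(frame):
    """Discrete Fourier transform: one complex value per frequency bin."""
    n = len(frame)
    return [
        sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def drop_quiet_components(components, relative_threshold):
    """Zero every bin weaker than relative_threshold times the strongest bin.

    Assumption: a toy stand-in for masking, not a psychoacoustic model.
    """
    strongest = max(abs(c) for c in components)
    return [c if abs(c) >= relative_threshold * strongest else 0j
            for c in components]


def recalculate_samples(components):
    """Inverse transform: rebuild the frame's samples from the kept bins."""
    n = len(components)
    return [
        (sum(components[k] * cmath.exp(2j * math.pi * k * t / n)
             for k in range(n)) / n).real
        for t in range(n)
    ]


FRAME_SIZE = 32
TOTAL = 2 * FRAME_SIZE
loud = [100 * math.sin(2 * math.pi * 4 * t / FRAME_SIZE) for t in range(TOTAL)]
quiet = [1 * math.sin(2 * math.pi * 10 * t / FRAME_SIZE) for t in range(TOTAL)]
signal = [a + b for a, b in zip(loud, quiet)]

frames = split_into_frames(signal, FRAME_SIZE)
simplified = []
for frame in frames:
    kept = drop_quiet_components(frequency_components(frame), 0.05)
    simplified.extend(recalculate_samples(kept))

# The quiet tone is gone for good: the simplified signal matches the loud tone
# alone, and the original signal cannot be recovered from it.
print(max(abs(s - l) for s, l in zip(simplified, loud)) < 1e-6)   # True
print(max(abs(s - o) for s, o in zip(simplified, signal)) > 0.5)  # True
```

The second check is the “lossy” part: information about the quiet tone was thrown away, so the recalculated samples differ from the originals.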
Representing Audio Information
§ MP3 compression rates are based on how much bandwidth the final file will use to play music in real time:
§ 128 kbps = 128,000 bits per second is “CD Quality”
§ Or about 960,000 bytes per minute
§ Much smaller rates can be used for voice, e.g., news broadcasts.
Compare to uncompressed CD audio: 10,600,000 bytes per minute!
A CD holds about 700 MB (700,000,000 bytes)
§ About 70 minutes of CD audio format
§ Or about 730 minutes of MP3 audio format at 128 kbps
Punchline: MP3 compression can keep nearly the same perceived quality while still reducing the size by ~90%.
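The comparison can be worked out from the stated bit rates. This sketch assumes a capacity of exactly 700,000,000 bytes and uses the exact CD byte rate (44,100 * 16 * 2 bits per second); the variable names are illustrative.

```python
CD_BITS_PER_SECOND = 44_100 * 16 * 2   # uncompressed CD audio
MP3_BITS_PER_SECOND = 128_000          # 128 kbps MP3
CD_CAPACITY_BYTES = 700_000_000        # assumption: 700 MB, decimal megabytes

cd_bytes_per_minute = CD_BITS_PER_SECOND // 8 * 60
mp3_bytes_per_minute = MP3_BITS_PER_SECOND // 8 * 60

mp3_minutes_on_cd = CD_CAPACITY_BYTES / mp3_bytes_per_minute
size_reduction = 1 - mp3_bytes_per_minute / cd_bytes_per_minute

print(mp3_bytes_per_minute)       # 960000
print(round(mp3_minutes_on_cd))   # 729
print(round(size_reduction, 2))   # 0.91
```

At 128 kbps the MP3 stream is about one eleventh the size of the CD stream, which is where the roughly 90% reduction comes from.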
Next Time: Lossless Compression
Long sequences of bits, especially for media, have many features which allow them to be encoded in much less space, with no loss of information:
Original File: 0101001010100101010101010111010000101110111011010101010101010101010101010
Compressed File: 101001010101010101110100001011101110
Original File -> Compression Algorithm -> Compressed File -> Decompression Algorithm -> Original File
The encoding is “lossless”; compressing and then decompressing gives same file!
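The round trip can be tried with Python's standard zlib module. This is only a demonstration of the lossless property, using a general-purpose compressor as a stand-in; it is not necessarily the algorithm covered next time, and the repetitive test data is made up for illustration.

```python
import zlib

# Highly repetitive data, which compresses well.
original = b"01" * 1000

compressed = zlib.compress(original)
restored = zlib.decompress(compressed)

print(len(compressed) < len(original))  # True: the compressed file is smaller
print(restored == original)             # True: nothing was lost
```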