8/2/2019 AudioCompression&MP3Standard
MPEG, the MP3 Standard,
and Audio Compression
Mark Kilgore and Jamie Wu
Mathematics of the Information Age
September 16, 2003
Audio Compression
n Basic Audio Coding.
n Why beneficial to compress?
n Lossless versus Lossy Compression.
n How are MP3s Compressed?
n What makes MP3 Compression Different?
n What other formats lie in our future?
PCM
Why Compress?
n Eliminate redundancy
n Most basic encoder/decoder is PCM
n PCM stores every sample in time, so a simple signal such as a basic sine wave carries a lot of redundancy
n If the sine wave is represented by frequency rather than time, only its frequency, amplitude, and phase need to be stored
n Can reduce data without information loss
n Extends playing time, allows for miniaturization and greater equipment tolerance, reduces cost
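The sine-wave point above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the sample rate, block length, and the helper names `synthesize` and `estimate_parameters` are made up for the example, and the tone is chosen to fall exactly on a DFT bin so that three numbers describe it completely.

```python
import numpy as np

SAMPLE_RATE = 44_100   # assumed rate, for illustration only
N = 4410               # 0.1 s of audio, so DFT bins are 10 Hz apart

def synthesize(freq_hz, amplitude, phase, n=N, rate=SAMPLE_RATE):
    """Return n PCM-style samples of a single cosine."""
    t = np.arange(n) / rate
    return amplitude * np.cos(2 * np.pi * freq_hz * t + phase)

def estimate_parameters(pcm, rate=SAMPLE_RATE):
    """Recover (frequency, amplitude, phase) from the strongest DFT bin."""
    spectrum = np.fft.rfft(pcm)
    k = int(np.argmax(np.abs(spectrum)))
    freq_hz = k * rate / len(pcm)
    amplitude = 2 * np.abs(spectrum[k]) / len(pcm)
    phase = float(np.angle(spectrum[k]))
    return freq_hz, amplitude, phase

pcm = synthesize(440.0, 0.5, 0.3)       # time-domain (PCM-like) representation
params = estimate_parameters(pcm)       # frequency-domain representation
rebuilt = synthesize(*params)

print(len(pcm), "samples versus", len(params), "parameters")
print("max reconstruction error:", np.max(np.abs(pcm - rebuilt)))
```

Real audio is not a single on-bin tone, so the saving is never this extreme; the sketch only shows why a frequency-based description can be far more compact than sample-by-sample storage.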
Lossless vs. Lossy (Perceptive)
n Lossless coding allows perfect reconstruction of a signal (theoretically)
n Lossy coding creates a more highly compressed signal, but some perceptually unnecessary frequencies are eliminated
n Perceptually, however, well-designed lossy coding aims for no difference in how it SOUNDS to a person
n MP3s are lossy, but intended to be perceptually lossless
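A small Python sketch can make the lossless/lossy distinction concrete. It is not how MP3 works: the lossless half uses the general-purpose `zlib` compressor, and the lossy half simply throws away low-order bits, which is an assumption chosen for brevity. The test signal and the constant `DROPPED_BITS` are likewise invented for the example.

```python
import zlib
import numpy as np

rng = np.random.default_rng(0)
t = np.arange(8000) / 8000.0
# A 440 Hz tone plus a little noise, stored as 16-bit PCM samples.
pcm = np.round(12000 * np.sin(2 * np.pi * 440 * t)
               + rng.normal(0, 50, t.size)).astype(np.int16)

# Lossless: every sample comes back bit-for-bit identical.
packed = zlib.compress(pcm.tobytes(), 9)
restored = np.frombuffer(zlib.decompress(packed), dtype=np.int16)

# Lossy (toy version): discard the low-order bits before compressing.
DROPPED_BITS = 6
coarse = (pcm >> DROPPED_BITS).astype(np.int16)
packed_lossy = zlib.compress(coarse.tobytes(), 9)
decoded = np.frombuffer(zlib.decompress(packed_lossy), dtype=np.int16)
approx = decoded.astype(np.int32) << DROPPED_BITS

print("raw bytes:     ", pcm.nbytes)
print("lossless bytes:", len(packed), "exact:", np.array_equal(restored, pcm))
print("lossy bytes:   ", len(packed_lossy),
      "max sample error:", int(np.max(np.abs(pcm - approx))))
```

The lossy path cannot restore the original samples, but its error is bounded by the discarded bits. A perceptual coder differs in that it decides what to discard based on what a listener can hear rather than on bit position.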
MPEG
n Moving Picture Experts Group
n Aims to create standards for synchronized audio and video compression
n MPEG-1
n MPEG-2
MPEG-1 Block Diagrams
Topics Discussed in Detail After Diagrams
Layers I and II
[Block diagram: Filter Bank (32 sub-bands, 0 to 31); DFT 512/1024 with Hann Window; Psychoacoustic Model; Uniform Midtread Quantizer; Coding of Side Information; Bitstream Formatting; Coded Audio Data]
Layer III
[Block diagram: Filter Bank (32 sub-bands, 0 to 31); MDCT (outputs 0 to 511); DFT 2 * 1024 with Hann Window; Psychoacoustic Model; Non-Uniform Midtread Quantizer with Rate/Distortion Loop; Huffman Coding; Coding of Side Information; Bitstream Formatting; Coded Audio Data]
Time to Frequency Mapping
n Filters split the signal into K frequency bands
n Each band is quantized to a limited number of bits
n Quantization noise is placed in bands where it is barely audible
n Sent to the decoder, where the sound is restored
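[Diagram: in the encoder, the input x passes through analysis filters H0 ... HK, each followed by a rate change by K, giving band signals y0 ... yK; in the decoder, each band signal passes through a rate change by K and synthesis filters G0 ... GK, and the results are combined into the output x.]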
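The encoder/decoder structure above can be tried out with the smallest possible filter bank. The sketch below assumes K = 2 bands and Haar (sum/difference) filters, which are not the MPEG filters; the function names and step sizes are hypothetical. It shows the three stages: split into bands, quantize each band with its own step, and recombine.

```python
import numpy as np

def analysis(x):
    """Split an even-length signal into low and high bands, each at half rate."""
    pairs = np.asarray(x, dtype=float).reshape(-1, 2)
    low = (pairs[:, 0] + pairs[:, 1]) / np.sqrt(2)
    high = (pairs[:, 0] - pairs[:, 1]) / np.sqrt(2)
    return low, high

def quantize(band, step):
    """Uniform midtread quantizer: round to the nearest multiple of step."""
    return step * np.round(band / step)

def synthesis(low, high):
    """Recombine the two half-rate bands into a full-rate signal."""
    out = np.empty(2 * len(low))
    out[0::2] = (low + high) / np.sqrt(2)
    out[1::2] = (low - high) / np.sqrt(2)
    return out

rng = np.random.default_rng(0)
x = rng.uniform(-1, 1, 64)
low, high = analysis(x)

exact = synthesis(low, high)                        # no quantization
coded = synthesis(quantize(low, 0.01),              # fine step in one band
                  quantize(high, 0.1))              # coarse step in the other

print("error without quantization:", np.max(np.abs(x - exact)))
print("error with quantization:   ", np.max(np.abs(x - coded)))
```

Without quantization the output matches the input, so all of the loss comes from the per-band step sizes. Choosing a coarser step for a band is how a coder can steer noise toward bands where it matters less.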
-
8/2/2019 AudioCompression&MP3Standard
6/12
Z Transform
n Assists in splitting frequencies
n Discrete-time generalization of the Fourier transform
n Important Properties
n Linearity
n Convolution Theorem
n Delay Theorem
n Can model all kinds of filter banks through it
n Representation of frequency content
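The three properties listed above can be checked numerically for short sequences. The sketch assumes the usual definition X(z) = sum over n of x[n] z^(-n) for a finite sequence starting at n = 0; the helper name `z_transform`, the test sequences, and the evaluation point are arbitrary choices for illustration.

```python
import numpy as np

def z_transform(x, z):
    """Evaluate X(z) = sum_n x[n] * z**(-n) for a finite sequence starting at n = 0."""
    x = np.asarray(x, dtype=complex)
    n = np.arange(len(x), dtype=float)
    return np.sum(x * z ** (-n))

z = 0.9 * np.exp(0.7j)                 # arbitrary evaluation point
x = np.array([1.0, 2.0, 3.0, 0.0])
y = np.array([0.5, -1.0, 4.0, 2.0])

# Linearity: Z{a*x + b*y} = a*X(z) + b*Y(z)
a, b = 2.0, -3.0
linearity_ok = np.isclose(z_transform(a * x + b * y, z),
                          a * z_transform(x, z) + b * z_transform(y, z))

# Convolution theorem: Z{x convolved with y} = X(z) * Y(z)
convolution_ok = np.isclose(z_transform(np.convolve(x, y), z),
                            z_transform(x, z) * z_transform(y, z))

# Delay theorem: delaying x by d samples multiplies X(z) by z**(-d)
d = 2
delayed = np.concatenate([np.zeros(d), x])
delay_ok = np.isclose(z_transform(delayed, z), z ** (-d) * z_transform(x, z))

print(linearity_ok, convolution_ok, delay_ok)
```

The convolution property is the one that makes filter banks tractable: filtering in time becomes multiplication of transforms, so a chain of filters can be analyzed as a product.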
MPEG Time to Frequency Mapping
Analysis Filter:
$$h_k[n] = h[n]\,\cos\!\left[\left(k+\tfrac{1}{2}\right)(n-16)\,\frac{\pi}{32}\right]$$
Synthesis Filter:
$$g_k[n] = 32\,h[n]\,\cos\!\left[\left(k+\tfrac{1}{2}\right)(n+16)\,\frac{\pi}{32}\right]$$
$$k = 0, 1, \ldots, 31; \quad n = 0, 1, \ldots, 511$$
n Uses a filter bank of 32 bands; each filter spans 512 samples, and h[n] is the shared prototype low-pass filter
n The equations above take the signal apart (the H part of the Time to Frequency Mapping diagram) and put it back together (the G part of the same diagram)
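As a rough illustration of the cosine-modulated analysis and synthesis filters, with h_k[n] = h[n] cos[(k + 1/2)(n - 16) pi/32] and g_k[n] = 32 h[n] cos[(k + 1/2)(n + 16) pi/32], the sketch below builds all 32 filters of each kind from one prototype. The prototype used here is an assumption: a Hann-windowed sinc stands in for the 512 coefficients tabulated in the standard, which are not reproduced. The function names are hypothetical.

```python
import numpy as np

NUM_BANDS = 32
FILTER_LENGTH = 512

def placeholder_prototype(length=FILTER_LENGTH, bands=NUM_BANDS):
    """Hann-windowed sinc low-pass with cutoff pi/64 rad/sample.
    A stand-in only, NOT the prototype coefficients from the standard."""
    n = np.arange(length) - (length - 1) / 2
    cutoff = 1.0 / (4 * bands)          # in cycles per sample
    return 2 * cutoff * np.sinc(2 * cutoff * n) * np.hanning(length)

def analysis_filters(h):
    """h_k[n] = h[n] * cos((k + 1/2) * (n - 16) * pi / 32), shape (32, 512)."""
    n = np.arange(len(h))
    k = np.arange(NUM_BANDS)[:, None]
    return h * np.cos((k + 0.5) * (n - 16) * np.pi / NUM_BANDS)

def synthesis_filters(h):
    """g_k[n] = 32 * h[n] * cos((k + 1/2) * (n + 16) * pi / 32), shape (32, 512)."""
    n = np.arange(len(h))
    k = np.arange(NUM_BANDS)[:, None]
    return NUM_BANDS * h * np.cos((k + 0.5) * (n + 16) * np.pi / NUM_BANDS)

def peak_frequency(taps, fft_size=8192):
    """Frequency (cycles per sample) where the filter's response is largest."""
    return np.argmax(np.abs(np.fft.rfft(taps, fft_size))) / fft_size

h = placeholder_prototype()
H = analysis_filters(h)
G = synthesis_filters(h)

for band in (3, 10, 20):
    print(band, peak_frequency(H[band]), "band spans",
          band / 64, "to", (band + 1) / 64)
```

Each filter is the same low-pass shape shifted to a different center frequency, which is why one prototype is enough to define the whole bank. With this placeholder prototype the bank is not expected to reconstruct the signal as well as the standard's does.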
PQMF & MDCT
n Both are methods of time to frequency mapping
n Pseudo-Quadrature Mirror Filter (PQMF)
n Modified Discrete Cosine Transform (MDCT)
n Mathematically, they are equivalent
n PQMF involves using Z transforms to represent the amplitudes of the frequencies
n MDCT involves performing a block transform using a window to represent amplitudes
n These amplitudes are then quantized
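The windowed block transform can be sketched directly. The code below is a minimal MDCT/IMDCT pair, assuming a sine window, a block of 2N samples producing N coefficients, a hop of N samples, and a 2/N scaling on the inverse; these are common textbook conventions rather than anything stated in the slides, and the block size N = 8 is chosen only to keep the example small.

```python
import numpy as np

def mdct_basis(N):
    """Cosine basis of shape (N, 2N): cos(pi/N * (n + 1/2 + N/2) * (k + 1/2))."""
    n = np.arange(2 * N)
    k = np.arange(N)
    return np.cos(np.pi / N * np.outer(k + 0.5, n + 0.5 + N / 2))

def sine_window(N):
    """Window of length 2N satisfying w[n]**2 + w[n + N]**2 = 1."""
    return np.sin(np.pi * (np.arange(2 * N) + 0.5) / (2 * N))

def mdct(block, N):
    """2N windowed samples in, N coefficients out."""
    return mdct_basis(N) @ (sine_window(N) * block)

def imdct(coeffs, N):
    """N coefficients in, 2N windowed (time-aliased) samples out."""
    return (2.0 / N) * sine_window(N) * (mdct_basis(N).T @ coeffs)

def analyze_synthesize(x, N):
    """Run overlapping blocks through MDCT and IMDCT, then overlap-add.
    Assumes len(x) is a multiple of N."""
    padded = np.concatenate([np.zeros(N), x, np.zeros(N)])
    out = np.zeros_like(padded)
    for start in range(0, len(padded) - 2 * N + 1, N):
        block = padded[start:start + 2 * N]
        out[start:start + 2 * N] += imdct(mdct(block, N), N)
    return out[N:N + len(x)]

N = 8
rng = np.random.default_rng(1)
signal = rng.standard_normal(64)
rebuilt = analyze_synthesize(signal, N)
single = imdct(mdct(signal[:2 * N], N), N)   # one block on its own

print("overlap-add error:", np.max(np.abs(signal - rebuilt)))
print("single block matches input:", np.allclose(single, signal[:2 * N]))
```

A single block does not invert on its own; the aliasing it leaves behind cancels only when neighboring overlapped blocks are added. In a coder, the coefficients returned by `mdct` are what would be quantized before the inverse step.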
Psychoacoustic Model
n Determines the masking threshold for each sub-band
n Uses the human auditory property of auditory masking
Non-uniform Quantizer
n Analog to digital
n Quantizer: maps amplitude values into a finite number of bits
n Non-uniform: changes the step size according to amplitude values
n Parts of the signal with lesser amplitude are coded with greater accuracy, which increases the signal-to-noise ratio (SNR)
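One way to get an amplitude-dependent step size is a power-law quantizer, sketched below. Layer III is commonly described as using a 3/4 power law of this general kind, but the code is only an illustration: the `step` value, the function names, and the test range are assumptions, and the scale-factor machinery of a real encoder is left out.

```python
import numpy as np

def quantize(x, step, power=0.75):
    """Midtread power-law quantizer: compress the magnitude, then round.
    Zero always maps to the integer 0."""
    x = np.asarray(x, dtype=float)
    return (np.sign(x) * np.rint((np.abs(x) / step) ** power)).astype(int)

def dequantize(q, step, power=0.75):
    """Invert the power law to get back an approximate amplitude."""
    q = np.asarray(q)
    return np.sign(q) * np.abs(q) ** (1.0 / power) * step

step = 0.01
x = np.linspace(-1.0, 1.0, 2001)
error = np.abs(x - dequantize(quantize(x, step), step))

small = error[np.abs(x) < 0.1]
large = error[np.abs(x) > 0.9]
print("levels used:", len(np.unique(quantize(x, step))))
print("worst error for small amplitudes:", small.max())
print("worst error for large amplitudes:", large.max())
```

Because the magnitude is compressed before rounding, the effective step grows with amplitude: quiet values land close to their reconstruction levels while loud values tolerate a larger absolute error, and fewer integer levels are needed overall than with a uniform step of the same size.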
Huffman coding
n For better data compression, variable-length Huffman codes are used to encode the quantized samples
n Quantized MDCT coefficients (for long blocks) are arranged in order from lowest to highest frequency
n The whole range is divided into 3 sections, each coded with a different set of Huffman tables
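The idea of variable-length codes can be shown with a generic Huffman coder. Note the assumption: this sketch builds its code table from the data it is given, whereas the bullets above describe choosing among predefined sets of tables, so it illustrates the principle only. The sample values in `quantized` and all function names are invented for the example.

```python
import heapq
from collections import Counter

def build_huffman_code(symbols):
    """Return a dict mapping each symbol to a prefix-free bit string."""
    counts = Counter(symbols)
    if len(counts) == 1:                       # degenerate single-symbol input
        return {next(iter(counts)): "0"}
    # Heap entries: (count, tie_breaker, {symbol: code_so_far})
    heap = [(count, index, {symbol: ""})
            for index, (symbol, count) in enumerate(counts.items())]
    heapq.heapify(heap)
    next_index = len(heap)
    while len(heap) > 1:
        count_a, _, codes_a = heapq.heappop(heap)   # two least frequent groups
        count_b, _, codes_b = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in codes_a.items()}
        merged.update({s: "1" + c for s, c in codes_b.items()})
        heapq.heappush(heap, (count_a + count_b, next_index, merged))
        next_index += 1
    return heap[0][2]

def encode(symbols, code):
    return "".join(code[s] for s in symbols)

def decode(bits, code):
    inverse = {bitstring: symbol for symbol, bitstring in code.items()}
    out, current = [], ""
    for bit in bits:
        current += bit
        if current in inverse:                 # prefix-free, so first hit is right
            out.append(inverse[current])
            current = ""
    return out

# Made-up quantized values: mostly zeros and small magnitudes.
quantized = [0] * 40 + [1] * 12 + [-1] * 11 + [2] * 4 + [-2] * 3 + [5]
code = build_huffman_code(quantized)
bits = encode(quantized, code)

print(code)
print(len(bits), "bits versus", 3 * len(quantized), "bits at a fixed 3 bits each")
```

Frequent values get short codes and rare values get long ones, so a block dominated by zeros and small magnitudes takes fewer bits than a fixed-length encoding of the same values.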
Bitstream Formatting
n Formats the encoded quantized samples into an encoded bitstream, the final form in which the compressed signal is transmitted
MPEG-4 and The Future?
n Incorporates speech and music compression
n More of an extension of MPEG-2 compression techniques, with independent techniques geared specifically at coding speech content (some coding for meaning)
n Hasn't really taken off yet; only time will tell
n AAC (Advanced Audio Coding), first standardized in MPEG-2, is the audio format used if you download from the Apple iTunes store