
Post on 26-Dec-2015


Page 1: Introduction of Digital Audio Name: Yao-Cheng Chuang Phone: 0919005578 Email : r93087@csie.ntu.edu.tw

Introduction of Digital Audio

Name: Yao-Cheng Chuang

Phone: 0919005578

Email: r93087@csie.ntu.edu.tw


History and Comparison

Speech and audio history.


Speech and Audio

Speech is the sound humans can utter; audio is everything humans can hear.

The basic bandwidth of speech is 4 kHz. On the other hand, the basic bandwidth of audio is 22.05 kHz.

Research on speech coding started earlier than research on audio coding.


SPL: Sound Pressure Level


Speech Codec

The first speech codec standard is PCM (Pulse Code Modulation). It uses simple sampling and quantization to represent digital speech information.

PCM runs at 64 kbps.

It is also called CCITT G.711.

CCITT: International Telegraph and Telephone Consultative Committee
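The 64 kbps figure can be checked with a line of arithmetic. The sketch below assumes the usual telephone-band PCM parameters of 8,000 samples per second (twice the 4 kHz speech bandwidth) and 8 bits per sample; these two values are not stated on the slide, and the function name is only illustrative.

```python
def pcm_bit_rate(sample_rate_hz: int, bits_per_sample: int, channels: int = 1) -> int:
    """Return the raw (uncompressed) PCM bit rate in bits per second."""
    return sample_rate_hz * bits_per_sample * channels

# Assumed G.711 parameters: 8 kHz sampling, 8 bits per sample, one channel.
g711_bps = pcm_bit_rate(8_000, 8)
print(g711_bps)  # 64000

# For comparison, CD-quality stereo: 44.1 kHz, 16 bits per sample, 2 channels.
cd_bps = pcm_bit_rate(44_100, 16, 2)
print(cd_bps)  # 1411200
```

The same function shows why audio needs far more bits than speech before any compression is applied.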


The goal of a speech codec is a low bit rate.

ADPCM (Adaptive Differential PCM), also called CCITT G.721, is the representative 32 kbps codec.

Because neighboring speech samples are usually similar, we encode the differences between them to compress the original data.
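The differential idea can be sketched in a few lines. This is plain DPCM on made-up sample values, not the G.721 algorithm itself: a real ADPCM coder also quantizes the differences with an adaptive step size, which is where the bit savings come from.

```python
def dpcm_encode(samples):
    """Encode each sample as its difference from the previous sample."""
    previous = 0
    diffs = []
    for s in samples:
        diffs.append(s - previous)
        previous = s
    return diffs

def dpcm_decode(diffs):
    """Rebuild the samples by accumulating the differences."""
    previous = 0
    samples = []
    for d in diffs:
        previous += d
        samples.append(previous)
    return samples

samples = [100, 102, 105, 104, 101, 99]  # made-up, slowly varying samples
diffs = dpcm_encode(samples)
print(diffs)  # [100, 2, 3, -1, -3, -2]
restored = dpcm_decode(diffs)
```

After the first value, the differences stay small, so they can be stored with fewer bits than the samples themselves.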


Later, CCITT G.723 and G.726 appeared. They are also ADPCM but support several bit rates, such as 40 kbps, 32 kbps, 24 kbps, and 16 kbps.

CCITT G.727 and G.728 operate at 16 kbps and are representative of the middle bit rates.

G.728 uses backward-adaptive CELP (low-delay CELP), a technique that emphasizes a short delay; G.727 is an embedded ADPCM codec.

CELP (Code Excited Linear Prediction) is the representative of 8 kbps.


MOS: Mean Opinion Score


Audio Codec

After speech codecs, many companies and committees invested in audio codecs.

ISO formulated a suite of video and audio standards called MPEG.

Dolby developed AC-1, AC-2, and AC-3.

ISO (International Organization for Standardization)

MPEG (Moving Pictures Experts Group)

AC-3 (Audio Codec 3)


DAB: Digital Audio Broadcast

DCC: Digital Compact Cassette

ISDN: Integrated Services Digital Network

MD: Micro Drive


Why Transform?

Two main reasons.


Benefit of Transformation

There are two main reasons to transform one kind of information or data from one domain to another domain.

1. Data compression.

2. Some operations can only be done in a particular domain.
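The data-compression reason can be illustrated with a transform that packs a smooth signal into a few coefficients. The sketch below uses an orthonormal DCT-II written in pure Python; the choice of the DCT and the test signal are assumptions made for illustration. The signal is built from two DCT basis functions, which is an idealized case: real audio is far less sparse.

```python
import math

def dct(x):
    """Orthonormal DCT-II of a list of samples."""
    n = len(x)
    out = []
    for k in range(n):
        scale = math.sqrt(1 / n) if k == 0 else math.sqrt(2 / n)
        out.append(scale * sum(x[i] * math.cos(math.pi * (i + 0.5) * k / n)
                               for i in range(n)))
    return out

def idct(c):
    """Inverse of the orthonormal DCT-II above."""
    n = len(c)
    out = []
    for i in range(n):
        total = 0.0
        for k in range(n):
            scale = math.sqrt(1 / n) if k == 0 else math.sqrt(2 / n)
            total += scale * c[k] * math.cos(math.pi * (i + 0.5) * k / n)
        out.append(total)
    return out

n = 32
# Idealized test signal: the sum of DCT basis functions 2 and 5.
signal = [math.cos(math.pi * (i + 0.5) * 2 / n)
          + 0.5 * math.cos(math.pi * (i + 0.5) * 5 / n) for i in range(n)]
coeffs = dct(signal)

# Keep only the 2 largest coefficients out of 32 and discard the rest.
keep = sorted(range(n), key=lambda k: abs(coeffs[k]), reverse=True)[:2]
compressed = [coeffs[k] if k in keep else 0.0 for k in range(n)]
restored = idct(compressed)
max_error = max(abs(a - b) for a, b in zip(signal, restored))
```

In the time domain all 32 samples are needed; in the transform domain two numbers carry the whole signal.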


Data Compression


Data Compression (cont.)


Disadvantage

Using transformation alone is not a good method for audio data compression.

Our ears are more sensitive at some frequencies (e.g. 1 kHz to 5 kHz).

This kind of data compression does not consider psychoacoustic factors.


Frequency Domain

Human ears perceive sounds according to their frequency.

Some operations must be done in the frequency domain.

Many psychoacoustic studies are based on the frequency domain.


Pulse-Code Modulation

Raw data of sound.


Modulation

Modulation is a means of encoding information for the purpose of transmission or storage.

Techniques such as amplitude modulation (AM) and frequency modulation (FM) have long been used to modulate carrier frequencies with analog audio information for radio broadcast.


Amplitude Modulation


Frequency Modulation


PWM / PPM

pulse-width modulation

pulse-position modulation


PAM / PNM

pulse-amplitude modulation

pulse-number modulation


PCM

pulse-code modulation

It is the most commonly used modulation method.
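PCM comes down to two steps: sample the waveform at regular intervals, then round each sample to the nearest integer code. A minimal sketch, assuming uniform (linear) quantization and a made-up 1 kHz test tone; the function name and parameters are illustrative only.

```python
import math

def pcm_encode(signal, sample_rate_hz, num_samples, bits):
    """Sample signal(t), with values in [-1, 1], and quantize to signed integer codes."""
    max_code = 2 ** (bits - 1) - 1  # e.g. 127 for 8 bits
    return [round(signal(n / sample_rate_hz) * max_code) for n in range(num_samples)]

def tone(t):
    """A 1 kHz sine tone with amplitude 1."""
    return math.sin(2 * math.pi * 1000 * t)

# One period of the tone, sampled at 8 kHz with 8-bit codes.
codes = pcm_encode(tone, 8000, 8, 8)
print(codes)  # [0, 90, 127, 90, 0, -90, -127, -90]
```

The list of integer codes is the "raw data of sound" that later codecs take as their input.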


Lossless and Lossy Compression

Two main models of compression.


Terminology

E( ): encoding algorithm

D( ): decoding algorithm

M: original data

m = E(M) -> encoding M

M' = D(m) -> decoding m

If M = M', we call the algorithm lossless compression; otherwise, we call it lossy compression.
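The two cases can be tried out directly. In this sketch, Python's standard `zlib` module plays the role of a lossless E and D, and a toy quantizer (dividing each value by 16) stands in for a lossy codec; both are illustrative choices on made-up data, not the codecs discussed in these slides.

```python
import zlib

# Lossless: M' equals M exactly.
M = b"digital audio " * 100           # made-up original data
m = zlib.compress(M)                   # m = E(M)
M_prime = zlib.decompress(m)           # M' = D(m)
print(M_prime == M, len(M), len(m))

# Lossy (toy example): coarse quantization discards the low-order detail.
samples = [3, 18, 35, 52]
code = [s // 16 for s in samples]      # E: keep only the coarse level
decoded = [c * 16 for c in code]       # D: scale back up
print(decoded == samples)              # False: M' differs from M
```

The lossy decoder returns values close to the originals but not identical to them, which is exactly the M != M' case.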


Compression Ratio

Compression ratio:

p = (M - m) / M * 100%

Here M and m stand for the sizes of the original data and the encoded data.

Generally, lossy compression achieves a higher compression ratio than lossless compression.
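As a worked example of the ratio, the sketch below treats M and m as sizes in bytes. The one-minute figures are assumptions chosen for illustration: CD-quality stereo PCM at 1,411,200 bits per second against a 128 kbps encoding.

```python
def compression_ratio(original_size, encoded_size):
    """Return p = (M - m) / M * 100, with M and m given as sizes."""
    return (original_size - encoded_size) / original_size * 100

# One minute of audio, in bytes (assumed example figures).
pcm_bytes = 1_411_200 * 60 // 8       # 10,584,000 bytes of CD-quality PCM
encoded_bytes = 128_000 * 60 // 8     # 960,000 bytes at 128 kbps
p = compression_ratio(pcm_bytes, encoded_bytes)
print(round(p, 2))  # 90.93
```

With this definition a larger p means stronger compression, and p = 0% means no size reduction at all.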


Psychoacoustics and Human Ear

Sound as humans perceive it.


Terminology

Loudness: Sound loudness is a subjective term describing the strength of the ear's perception of a sound.

Intensity: Sound intensity is defined as the sound power per unit area. The basic units are watts/m² or watts/cm².


Threshold of Hearing

This is the audibility curve. Below the curve, we cannot hear anything.

Human ears can hear sounds from 20 Hz to 20,000 Hz.

Many sound intensity measurements are made relative to a standard threshold of hearing intensity:

I₀ = 10⁻¹² watts/m² = 10⁻¹⁶ watts/cm²


Intensity Level

Decibel (dB): The sound intensity I₁ may be expressed in decibels above the standard threshold of hearing I₀.

Intensity level = 10 log₁₀(I₁ / I₀) (dB)

I₀: threshold of hearing, 10⁻¹² watts/m²

I₁: the intensity we want to measure
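The decibel formula is easy to check numerically. A small sketch that uses the threshold-of-hearing intensity given on the slide as the reference; the function name and the sample intensities are made up for illustration.

```python
import math

THRESHOLD_OF_HEARING = 1e-12  # watts/m^2, the reference intensity from the slide

def intensity_level_db(intensity, reference=THRESHOLD_OF_HEARING):
    """Return the intensity level in dB relative to the reference intensity."""
    return 10 * math.log10(intensity / reference)

print(intensity_level_db(1e-12))  # 0.0: a sound exactly at the threshold of hearing
print(intensity_level_db(1e-6))   # about 60 dB: a million times the reference intensity
```

Every factor of 10 in intensity adds 10 dB, so the scale compresses the enormous range of audible intensities into a manageable set of numbers.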


Threshold of Feeling

It is the upper-bound curve of what humans can bear. Above this curve, human ears can be hurt.

It is not a horizontal line, either. At lower frequencies, human ears are more sensitive, so the curve has a trough there.


Equal-Loudness Curve

Along any one equal-loudness curve, humans perceive the same loudness.

Equal-loudness curves are not horizontal lines.

Between the threshold of hearing and the threshold of feeling, there are infinitely many equal-loudness curves.


Human Hearing


Sound Masking

Time / frequency sound masking.


Frequency Masking

If many tones play simultaneously, some tones will be masked by others.

We can draw a frequency masking curve; sounds under the curve cannot be heard.

The curve's slope is steep toward low frequencies but gentle toward high frequencies.


Frequency Masking (cont.)

The louder the masking sound, the larger the masked area.

By exploiting frequency masking, we can reduce the number of coding bits.


Time Masking

If one sound is played, it may generate pre-masking and post-masking.

Post-masking is longer than pre-masking.

The louder the sound, the longer the masking.


MP3

MPEG1 Layer3


Introduction

MPEG: Moving Pictures Experts Group

MP3: MPEG-1 Layer-3

Why is MP3 so popular?

Open standard

Availability of hardware and software

Near-CD (Compact Disc) quality

Fast Internet access for universities and businesses


MP3 Format

An MPEG audio file is divided into smaller parts called frames. Each frame is independent and has its own header and audio information. There is no file header, so you can cut out any part of an MPEG audio file and still play it correctly.


The frame header consists of the first four bytes (32 bits) of a frame:

aaaaaaaa aaabbccd eeeeffgh iijjklmm

From the frame header we can learn, for example:

What are the version and layer?

Is it protected by CRC (Cyclic Redundancy Check)?

What are the bit rate and sampling frequency?
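The questions above can be answered by reading bit fields out of the four header bytes. The sketch below follows the letter layout shown on the slide (a = 11-bit sync, b = version, c = layer, d = protection bit, e = bit-rate index, f = sampling-frequency index). The meaning of each field value and the lookup tables come from the commonly documented MPEG audio header layout, not from the slide, and the tables cover only MPEG-1 Layer III. All names are illustrative.

```python
# Lookup tables for MPEG-1 Layer III only (assumed from the MPEG audio header layout).
BITRATES_KBPS = [None, 32, 40, 48, 56, 64, 80, 96, 112, 128, 160, 192, 224, 256, 320, None]
SAMPLE_RATES_HZ = [44100, 48000, 32000, None]
VERSIONS = {0b00: "MPEG-2.5", 0b10: "MPEG-2", 0b11: "MPEG-1"}
LAYERS = {0b01: "Layer III", 0b10: "Layer II", 0b11: "Layer I"}

def parse_frame_header(header: bytes) -> dict:
    """Decode the version, layer, CRC flag, bit rate, and sampling frequency."""
    if len(header) < 4:
        raise ValueError("a frame header needs 4 bytes")
    word = int.from_bytes(header[:4], "big")
    if word >> 21 != 0x7FF:                      # field a: 11 sync bits, all ones
        raise ValueError("frame sync not found")
    version = (word >> 19) & 0b11                # field b
    layer = (word >> 17) & 0b11                  # field c
    protection = (word >> 16) & 0b1              # field d: 0 means a CRC follows
    info = {
        "version": VERSIONS.get(version, "reserved"),
        "layer": LAYERS.get(layer, "reserved"),
        "crc_protected": protection == 0,
    }
    if version == 0b11 and layer == 0b01:        # tables above apply only here
        info["bitrate_kbps"] = BITRATES_KBPS[(word >> 12) & 0xF]      # field e
        info["sample_rate_hz"] = SAMPLE_RATES_HZ[(word >> 10) & 0b11]  # field f
    return info

header_info = parse_frame_header(bytes([0xFF, 0xFB, 0x90, 0x00]))
print(header_info)
```

For the example bytes FF FB 90 00 the result describes an MPEG-1 Layer III frame at 128 kbps and 44,100 Hz without CRC protection.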


The tag is used to describe the MPEG audio file. It contains information about artist, title, album, publishing year, genre, and comments.

It is exactly 128 bytes long and is located at the end of the audio data.

AAABBBBB BBBBBBBB BBBBBBBB BBBBBBBB BCCCCCCC CCCCCCCC CCCCCCCC CCCCCCCD

DDDDDDDD DDDDDDDD DDDDDDDD DDDDDEEE EFFFFFFF FFFFFFFF FFFFFFFF FFFFFFFG
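The letter groups in the layout add up to 3 + 30 + 30 + 30 + 4 + 30 + 1 = 128 bytes. A minimal parsing sketch, assuming the usual ID3v1 field order (A = the marker "TAG", B = title, C = artist, D = album, E = year, F = comment, G = genre number); the slide does not spell out this mapping, and the sample tag is made up.

```python
def parse_id3v1(tag: bytes) -> dict:
    """Split a 128-byte ID3v1 tag into its fixed-width fields."""
    if len(tag) != 128 or tag[:3] != b"TAG":
        raise ValueError("not an ID3v1 tag")

    def text(start, length):
        # Fields are padded with zero bytes or spaces.
        raw = tag[start:start + length].split(b"\x00")[0]
        return raw.decode("latin-1").rstrip()

    return {
        "title": text(3, 30),
        "artist": text(33, 30),
        "album": text(63, 30),
        "year": text(93, 4),
        "comment": text(97, 30),
        "genre": tag[127],
    }

def field(value, width):
    """Encode a string into a fixed-width, zero-padded field."""
    return value.encode("latin-1").ljust(width, b"\x00")

# A made-up tag built only for this example.
example_tag = (b"TAG" + field("Song", 30) + field("Artist", 30) + field("Album", 30)
               + field("2005", 4) + field("demo", 30) + bytes([12]))
tag_info = parse_id3v1(example_tag)
print(tag_info)
```

To read the tag from a real file, one would open it in binary mode, seek to 128 bytes before the end with `f.seek(-128, 2)`, and pass the 128 bytes read from there to `parse_id3v1`.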


MP3 Encoder

MDCT: Modified Discrete Cosine Transform

FFT: Fast Fourier Transform


MP3 Encoder

768 kbps = 32 k samples/second * 24 bits/sample


MP3 Decoder

iMDCT: inverse Modified Discrete Cosine Transform


MP3 Decoder


Psychoacoustic Principles

Critical band

Sound masking:

Time masking

Frequency masking


Filter Bank

Hybrid filter bank

Polyphase and MDCT (Modified Discrete Cosine Transform)

32 channels of polyphase sub-band

MDCT transforms each sub-band into 18 smaller channels.
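With 32 sub-bands and 18 MDCT outputs each, the hybrid filter bank yields 32 * 18 = 576 frequency lines. The sketch below shows only the MDCT half of that chain: a block of 36 samples becomes 18 coefficients, and overlapping consecutive blocks by half lets the decoder cancel the aliasing. It is a simplified illustration that uses a rectangular window with 1/N scaling in the inverse, whereas MP3 itself applies shaped windows; the test signal is made up.

```python
import math

def mdct(block):
    """MDCT: 2N input samples -> N coefficients."""
    n = len(block) // 2
    return [sum(block[i] * math.cos(math.pi / n * (i + 0.5 + n / 2) * (k + 0.5))
                for i in range(2 * n))
            for k in range(n)]

def imdct(coeffs):
    """Inverse MDCT: N coefficients -> 2N time-aliased samples (scaled by 1/N)."""
    n = len(coeffs)
    return [sum(coeffs[k] * math.cos(math.pi / n * (i + 0.5 + n / 2) * (k + 0.5))
                for k in range(n)) / n
            for i in range(2 * n)]

N = 18  # coefficients per block, so each block holds 36 samples
signal = [math.sin(0.3 * i) + 0.5 * math.cos(0.11 * i) for i in range(4 * N)]

# Transform overlapping blocks (hop size N) and overlap-add the inverse outputs.
output = [0.0] * len(signal)
for start in range(0, len(signal) - N, N):
    block = signal[start:start + 2 * N]
    for i, value in enumerate(imdct(mdct(block))):
        output[start + i] += value

# Samples covered by two blocks (indices N to 3N) are reconstructed exactly.
max_error = max(abs(signal[i] - output[i]) for i in range(N, 3 * N))
print(max_error)
```

Each block alone cannot be inverted, since 36 samples were squeezed into 18 numbers; it is the half-overlap between neighbors that makes the reconstruction exact.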


MDCT

DFT

DCT


CELP

Code Excited Linear Prediction


Background

Over the years many speech coding techniques have been developed starting from PCM and ADPCM (Adaptive Differential Pulse Code Modulation) in the 60s, to linear prediction in the 70s, and CELP in the late 80s and 90s.

Because speech spectra are similar at nearby samples, we can use prediction.


Person Model

For voiced sounds, your vocal cords vibrate (open and close). The rate at which the vocal cords vibrate determines the pitch of your voice.

For certain fricative and plosive (unvoiced) sounds, your vocal cords do not vibrate but remain constantly open.


The shape of your vocal tract determines the sound that you make.

The shape of the vocal tract changes relatively slowly (on the scale of 10 ms to 100 ms).

The amount of air coming from your lungs determines the loudness of your voice.


Math Model


Vocal Tract H(z) (LPC (Linear Predictive Coding) Filter)

Air u(n) (Innovations)

Vocal Cord Vibration V (voice)

Vocal Cord Vibration Period T (pitch period)

Fricatives and Plosives UV (unvoiced)

Air Volume G (gain)


LPC

It stands for Linear Prediction Coefficients.

$$X_n = \sum_{i=1}^{L} a_i X_{n-i} + e_n$$

$\{X_1, X_2, \ldots, X_n\}$: spectra

$e_n$: error

LPC is the basic technique of CELP. Because CELP uses the prediction method, its bit-rate can be lower.
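A small numeric sketch of linear prediction, assuming the usual form in which each value is predicted as a weighted sum of the previous L values and the error is whatever the prediction misses. The signal and the single coefficient are made up so that the prediction is exact after the first value; a real coder estimates the coefficients from the speech itself.

```python
def lpc_residual(x, a):
    """Return the prediction errors: x[n] minus the weighted sum of the previous len(a) values."""
    errors = []
    for n in range(len(x)):
        prediction = sum(a[i - 1] * x[n - i] for i in range(1, len(a) + 1) if n - i >= 0)
        errors.append(x[n] - prediction)
    return errors

def lpc_synthesize(errors, a):
    """Rebuild the signal from the errors using the same coefficients."""
    x = []
    for n in range(len(errors)):
        prediction = sum(a[i - 1] * x[n - i] for i in range(1, len(a) + 1) if n - i >= 0)
        x.append(prediction + errors[n])
    return x

x = [8.0, 4.0, 2.0, 1.0, 0.5]   # made-up signal: each value is half the previous one
a = [0.5]                        # one prediction coefficient (L = 1)
e = lpc_residual(x, a)
print(e)  # [8.0, 0.0, 0.0, 0.0, 0.0]
rebuilt = lpc_synthesize(e, a)
```

When the prediction is good, the error sequence is close to zero and costs far fewer bits to transmit than the signal, which is why prediction lowers the bit rate.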


CELP Encoder


AC-3

Audio Codec 3


What Is AC-3?

AC-3 refers to a multichannel music compression technology that has been developed by Dolby Laboratories.

Dolby Laboratories has used the term Dolby Digital to refer to this digital system in the film and theater industries, and has used the term Dolby Surround AC-3 to refer to the system in the home theater market.


AC-3 can carry from 1 to 5.1 channels. It provides five full-range channels (3 Hz to 20,000 Hz): three front channels (left, center, and right) plus two surround channels. A sixth, bass-only effects channel (3 Hz to 120 Hz) is sometimes also called the "Low Frequencies Enhancement" (LFE) channel.


How Does AC-3 Work?

It uses lossy compression. Like MP3 or AAC, AC-3 exploits the properties of sound to achieve its compression.

Uncompressed PCM input must be sampled at 32, 44.1, or 48 kHz with up to 20 bits per sample.


AC-3 Encoder


AC-3 Decoder


AAC

MPEG-2 Advanced Audio Coding


Advertisement

Because of its exceptional performance and quality, Advanced Audio Coding (AAC) is at the core of the MPEG-4 and 3GPP (3rd Generation Partnership Project) specifications and is the new audio codec of choice for Internet, wireless, and digital broadcast arenas.

AAC provides audio encoding that compresses much more efficiently than older formats such as MP3, yet delivers quality rivaling that of uncompressed CD (Compact Disc) audio.


Why AAC?

The driving force behind the development of AAC was the quest for an efficient coding method for surround signals, such as the 5-channel signals (left, right, center, left-surround, right-surround) used in cinemas today.

One aim of AAC was a considerable decrease in the necessary bit rate.


Low Delay

Low-delay audio coding is needed whenever some sort of communication is transmitted over low-bandwidth channels in both directions, e.g. live broadcasts on TV (Television) or radio stations, or in mobile phone networks (3G: 3rd Generation).

Both AAC and CELP offer a low-delay property.


AAC vs. MP3

MPEG-2 AAC is the logical continuation of the highly successful coding method MPEG-1 Layer-3.

The crucial differences between MPEG-2 AAC and its predecessor ISO/MPEG Audio Layer-3 are as follows:


Quantization: By allowing finer control of quantization resolution, the given bit rate can be used more efficiently.

Prediction: A technique commonly established in the area of speech coding systems. It benefits from the fact that certain types of audio signals are easy to predict.

Bit-stream format: The information to be transmitted undergoes entropy coding in order to keep redundancy as low as possible. The optimization of these coding methods together with a flexible bit-stream structure has made further improvement of the coding efficiency possible.


WMA

Windows Media Audio


What Is WMA?

It is an audio format by Microsoft.

For the same audio, its file size is only about half that of an MP3 file, yet its sound quality is similar to MP3.

Because it is proprietary, few details of its codec are known.


The Difference between ASF and WMA/WMV

The only differences between ASF files and WMA or WMV files are the file extensions and the MIME types.

The MIME type for a WMV file is video/x-ms-wmv, and for WMA it is audio/x-ms-wma. The MIME type for ASF is video/x-ms-asf. The basic internal structure of the files is identical.

MIME: Multipurpose Internet Mail Extensions

WMV: Windows Media Video

ASF: Active Streaming Format


MIDI

Musical Instrument Digital Interface


What Is MIDI?

MIDI is a method of communication between digital instruments.

It was created in 1982.

Unlike speech or audio data, MIDI resembles a kind of musical score. It is unrelated to codecs.


We write musical notes into a MIDI file, and the computer looks up each note in a table to find its corresponding sound.

Therefore, by changing the table alone, we can play the same notes as a violin, a piano, or another instrument.

A MIDI file is much smaller than a general audio file.
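The score-plus-table idea can be sketched as follows. The note list is made up; the note-number-to-pitch formula is the standard equal-tempered tuning with note 69 (A4) at 440 Hz; and the two instrument numbers are the zero-based General MIDI program numbers for piano and violin. None of these details appear on the slide, and the function names are illustrative.

```python
# A made-up score: (MIDI note number, duration in beats) for C4, D4, E4.
score = [(60, 1), (62, 1), (64, 2)]

# Instrument table: zero-based General MIDI program numbers.
INSTRUMENTS = {"piano": 0, "violin": 40}

def note_to_frequency(note):
    """Equal-tempered pitch in Hz, with MIDI note 69 (A4) tuned to 440 Hz."""
    return 440.0 * 2 ** ((note - 69) / 12)

def render(score, instrument):
    """Pair every note with the chosen instrument; the notes themselves never change."""
    program = INSTRUMENTS[instrument]
    return [(program, note_to_frequency(note), beats) for note, beats in score]

as_piano = render(score, "piano")
as_violin = render(score, "violin")
print(round(note_to_frequency(60), 2))  # 261.63, middle C
```

The same three notes yield two different renderings purely by switching the table entry, and the score itself stays a handful of small integers rather than thousands of audio samples.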


Use What Is Fitting

MPEG-1 Layer 2 at 192 kbps was significantly better than AAC at 96 kbps in 7 of 8 cases, and better than AAC at 128 kbps in 6 of 8 cases. After two cascaded coding passes, the quality of AAC was much inferior to that of Layer 2. It should also be noted that the processing delay differs significantly between Layer 2 (approximately 70 ms) and AAC (about 300 ms).

