Audio/Speech Signal Processing An Overview

Upload: vantram

Post on 06-May-2018

Page 1: Audio/Speech Signal Processing An Overview - IIT Kanpurhome.iitk.ac.in/~nnaik/pdf/PPT_AudioSpeech.pdf · Signal Processing Tasks •Audio/Speech Encoding/Decoding - Codecs ( DFT –Spectral

Audio/Speech Signal Processing

An Overview

Application Fields

Sound Mixer: Music Recording

Audio Processor: FM Broadcasting

Synthesizer: Sound Synthesis

Voice call: Noise reduction and Speech Codecs

Signal Processing Tasks

• Audio/Speech Encoding/Decoding - Codecs (DFT – Spectral Analysis, Filtering & Modifications)

• Audio Effects (FIR/IIR - Digital Filtering & Spectral Modifications)

Audio/Speech Codecs

Voice Call flow through mobile

Echo Cancellation

Noise Reduction

Speech Codec

Approximate data transfer size for 60 sec Call

Raw Data: (Just analog to digital converted data)

Sampling rate: 8000 samples/sec

Storage space for one sample: 8 bits

Total data size = Number of samples * Storage space for one sample = Samples/sec * Number of seconds * Storage space

= 8000 * 60 * 8 bits = 3840 Kbits

Bit rate = Samples/sec * Storage space for one sample = 64 Kbits/sec

Encoded/Compressed data: (DSP algorithm over sampled digital data)

Bit rate = 6.5 to 13 Kbits/sec (GSM Speech codecs output)

Data size = Transferred bits/sec * Number of seconds

= Bit rate * Number of seconds = (6.5 to 13) * 60 = 390 to 780 Kbits
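
The arithmetic on this slide can be checked with a short sketch. It uses only the figures quoted above (8000 samples/sec, 8 bits per sample, a 60 sec call, and the 6.5 and 13 Kbits/sec codec bounds); the function and variable names are illustrative assumptions, not part of the original slides.

```python
def data_size_kbits(bit_rate_kbps: float, seconds: float) -> float:
    """Data size in Kbits = bit rate (Kbits/sec) * duration (sec)."""
    return bit_rate_kbps * seconds


SAMPLE_RATE = 8000       # samples/sec, from the slide
BITS_PER_SAMPLE = 8      # storage per sample, from the slide
CALL_SECONDS = 60        # call duration, from the slide

# Raw (uncompressed) bit rate and total size.
raw_kbps = SAMPLE_RATE * BITS_PER_SAMPLE / 1000
raw_kbits = data_size_kbits(raw_kbps, CALL_SECONDS)
print(f"raw: {raw_kbps} Kbits/sec, {raw_kbits} Kbits per call")

# Encoded size at the two codec bit rates quoted on the slide.
for codec_kbps in (6.5, 13.0):
    size = data_size_kbits(codec_kbps, CALL_SECONDS)
    print(f"{codec_kbps} Kbits/sec: {size} Kbits, {raw_kbits / size:.1f}x smaller than raw")
```

The ratio printed in the last line is simply raw size divided by encoded size, which is one way to read how much the speech codec saves.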

Audio Quality Measure

Audio 1

Audio 2

Audio 3

Raw audio at 1411 Kbps

Compressed audio at 128 Kbps

Compressed audio at 32 Kbps
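
A raw rate of about 1411 Kbits/sec matches uncompressed CD-quality PCM. The sketch below assumes 44.1 kHz sampling, 16 bits per sample and 2 channels; these parameters are an assumption and are not stated on the slide. The helper name is hypothetical.

```python
def pcm_bit_rate_kbps(sample_rate_hz: int, bits_per_sample: int, channels: int) -> float:
    """Uncompressed PCM bit rate in Kbits/sec."""
    return sample_rate_hz * bits_per_sample * channels / 1000


# Assumed CD-quality parameters (not given in the slides).
raw_kbps = pcm_bit_rate_kbps(44100, 16, 2)
print(f"raw PCM: {raw_kbps} Kbits/sec")

# Compare against the two compressed bit rates listed above.
for compressed_kbps in (128, 32):
    print(f"{compressed_kbps} Kbits/sec is about {raw_kbps / compressed_kbps:.1f}x smaller than raw")
```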

Signal Compression in Frequency domain

Audio/Speech Codecs

Spectrogram: Frequency variation with time

[Figure: three spectrograms (time on one axis, frequency on the other) for 1411 Kbits/sec raw audio, 128 Kbits/sec MP3 encoded audio, and 32 Kbits/sec MP3 encoded audio]
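
A spectrogram of this kind can be sketched as a DFT applied to short, overlapping, windowed frames. The code below is a minimal illustration using only the Python standard library and a naive DFT. The frame length, hop size, Hann window and test tone are assumptions chosen for the example; practical tools use an FFT library instead.

```python
import cmath
import math


def dft_magnitudes(frame):
    """Magnitudes of DFT bins 0..N/2 of a real-valued frame (naive O(N^2) DFT)."""
    n = len(frame)
    return [
        abs(sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)))
        for k in range(n // 2 + 1)
    ]


def spectrogram(x, frame_len=64, hop=32):
    """Return one magnitude spectrum per Hann-windowed frame (time-major)."""
    window = [0.5 - 0.5 * math.cos(2 * math.pi * i / frame_len) for i in range(frame_len)]
    frames = []
    for start in range(0, len(x) - frame_len + 1, hop):
        frame = [x[start + i] * window[i] for i in range(frame_len)]
        frames.append(dft_magnitudes(frame))
    return frames


# Test tone: 1000 Hz sine sampled at 8000 Hz (assumed values for the demo).
fs = 8000
x = [math.sin(2 * math.pi * 1000 * n / fs) for n in range(512)]

spec = spectrogram(x)
peak_bin = max(range(len(spec[0])), key=lambda k: spec[0][k])
peak_hz = peak_bin * fs / 64  # bin index -> frequency for frame_len = 64
print(f"{len(spec)} frames, {len(spec[0])} bins, strongest bin at {peak_hz} Hz")
```

Each inner list is one column of the spectrogram image. A lower-bitrate encoding would typically show the upper bins emptied out, which is what the figure compares.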

Audio and Speech Codecs

Audio Frequency Range: 20 Hz – 20 kHz

Speech Frequency Range: 300 Hz – 3500 Hz

Speech Codecs: (Linear Prediction approach)

AMR, G.723

bitrate: 4.75 - 12.2 Kbits/sec (AMR narrowband); 5.3 or 6.3 Kbits/sec (G.723.1)

Sampling rate: 8 - 16 kHz
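
The linear prediction idea behind these speech codecs is to model each sample as a weighted sum of the previous few samples and to transmit the weights rather than the waveform. The sketch below estimates the weights with the autocorrelation method and the Levinson-Durbin recursion. It illustrates the general technique only, not the actual AMR or G.723 algorithm, and all names in it are hypothetical.

```python
def autocorrelation(x, max_lag):
    """r[lag] = sum over n of x[n] * x[n - lag], for lag = 0..max_lag."""
    return [sum(x[n] * x[n - lag] for n in range(lag, len(x))) for lag in range(max_lag + 1)]


def lpc(x, order):
    """Levinson-Durbin recursion.

    Returns (coeffs, err) where x[n] is predicted as
    sum(coeffs[k - 1] * x[n - k] for k in 1..order)
    and err is the remaining prediction error energy.
    """
    r = autocorrelation(x, order)
    a = [0.0] * (order + 1)
    err = r[0]
    for i in range(1, order + 1):
        # Reflection coefficient for this model order.
        k = (r[i] - sum(a[j] * r[i - j] for j in range(1, i))) / err
        new_a = a[:]
        new_a[i] = k
        for j in range(1, i):
            new_a[j] = a[j] - k * a[i - j]
        a = new_a
        err *= 1.0 - k * k
    return a[1:], err


# Demo signal: each sample is 0.9 times the previous one,
# so a good predictor has a first coefficient close to 0.9.
signal = [0.9 ** n for n in range(200)]
coeffs, err = lpc(signal, order=2)
print(coeffs, err)
```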

Audio Codecs : (MDCT, Psychoacoustics analysis)

MP3, AAC

bitrate: 32-768 Kbits/sec

Sampling rate: 8 - 48 kHz

Audio/Sound Effects – Android Apps

Audio Effects

• Intelligent Loudness Control (Automatic Gain Control)

• Wideband Automatic Noise Removal (WANR)

• Envelope/Stereo Processing

• Voice/Vocal Enhancement

• Bass Enhancement

• Sibilant/Fricative Smoothing

• Dynamic Listening Fatigue Reduction (DLFR)

• Multi-Band Graphic Equalizer (Equalizer)

• Low Pass Filtering

Echo Effect : Information in Time domain

Signal delay:

y(t) = x(t) + decay*x(t-delay)
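
In discrete time the same relation reads y[n] = x[n] + decay * x[n - delay], with the delay counted in samples. A minimal sketch follows; the function name, the zero-padding of the tail, and the demo values are assumptions made for illustration.

```python
def add_echo(x, delay_samples, decay):
    """y[n] = x[n] + decay * x[n - delay_samples], taking x[n] = 0 for n < 0."""
    # Extend the output so the delayed copy is not cut off at the end.
    y = list(x) + [0.0] * delay_samples
    for n, sample in enumerate(x):
        y[n + delay_samples] += decay * sample
    return y


# A delay given in seconds is converted to samples with the sampling rate.
fs = 8000                           # samples/sec (assumed for the demo)
delay_samples = round(0.25 * fs)    # 0.25 sec echo -> 2000 samples

impulse = [1.0, 0.0, 0.0]
print(add_echo(impulse, delay_samples=2, decay=0.5))
```

Feeding in a single impulse makes the effect easy to see: the output holds the original impulse followed by a scaled copy `delay_samples` later.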

[Figure: waveforms of the raw sound and the echoed sound]

Bass Enhancement: Information in Frequency domain

Subwoofer: reproduces low-pitched audio frequencies known as bass (e.g., drum sound)

Frequency range : 20-200Hz

Bass system frequency response
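
One simple way to emphasise the 20-200 Hz range is to low-pass filter the signal and add the filtered part back on top of the original. The sketch below uses a one-pole IIR low-pass filter with a 200 Hz cutoff. The filter choice, the gain, and the function names are assumptions for illustration; the slides do not say which bass enhancement method was used.

```python
import math


def bass_boost(x, fs, cutoff_hz=200.0, gain=1.0):
    """Return x + gain * lowpass(x), using a one-pole IIR low-pass filter."""
    alpha = 1.0 - math.exp(-2.0 * math.pi * cutoff_hz / fs)
    low = 0.0
    y = []
    for sample in x:
        low += alpha * (sample - low)   # one-pole low-pass state update
        y.append(sample + gain * low)   # original plus boosted low band
    return y


def rms(x):
    """Root-mean-square level of a signal."""
    return math.sqrt(sum(v * v for v in x) / len(x))


# Compare the level change for a bass tone and a higher tone (assumed demo values).
fs = 8000
for freq in (50, 2000):
    tone = [math.sin(2 * math.pi * freq * n / fs) for n in range(fs)]
    ratio = rms(bass_boost(tone, fs)) / rms(tone)
    print(f"{freq} Hz level ratio: {ratio:.2f}")
```

The 50 Hz tone comes out noticeably louder than the 2000 Hz tone, which is the shape a bass-enhancing frequency response is expected to have.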

Resources

QA Community:

Signal Processing Stack exchange

http://dsp.stackexchange.com/

Open Source Contribution:

Audacity: Free Audio Editor and Recorder

audacity.sourceforge.net/

FFmpeg (solution to record, convert and stream audio and video)

https://www.ffmpeg.org/

Resources

Indian Research Start-ups:

• ATC Labs, Noida

• Violet 3D, Bangalore

• Akshar Speech Technologies, Hyderabad

Research Labs:

• Fraunhofer Institute, Germany

• Dolby Laboratories

• Philips Research

• DTS/SRS Labs

Acknowledgment

Special thanks to,

Prof. Naren Naik

&

ATC Labs, Noida, India

Thanks for your time.