so far: historical overview of speech technology basic components/goals for systems quick review of...


Post on 22-Dec-2015


TRANSCRIPT

Page 1: So far: Historical overview of speech technology  basic components/goals for systems Quick review of DSP fundamentals Quick overview of pattern recognition

So far:

• Historical overview of speech technology basic components/goals for systems

• Quick review of DSP fundamentals

• Quick overview of pattern recognition basics

Next talk focuses on the nature of the signal:

• Acoustic waves in small spaces (sources)

• Acoustic waves in large spaces (rooms)


Acoustic waves - a brief intro

A way to bridge from thinking about EE to thinking about acoustics:

Acoustic signals are like electrical ones, only much slower …

• Pressure is like voltage

• Volume velocity is like current (and impedance = pressure / volume velocity)

• For wave solutions, c is a lot smaller

• To analyze, look at constrained models of common structures: strings and tubes


• So $\frac{\partial^2 y}{\partial x^2} = \frac{1}{c^2} \frac{\partial^2 y}{\partial t^2}$ is the wave equation for transverse vibration on a string

Where c can be derived from the properties of the medium, and is the wave propagation speed


• Solutions dependent on boundary conditions

• Assume form f(t - x/c) for positive x direction

• Then f(t + x/c) for negative x direction

• Sum is A f(t - x/c) + B f(t + x/c)
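The traveling-wave form can be checked numerically. The sketch below is only an illustration, not part of the original slides: it assumes a Gaussian pulse for f, c = 340 m/s, and arbitrary amplitudes A = 1 and B = 0.5, and it uses central differences to estimate how far A f(t - x/c) + B f(t + x/c) is from satisfying the wave equation.

```python
import math

C = 340.0  # wave speed in m/s (assumed value for illustration)

def pulse(s):
    """An arbitrary smooth waveform f(s); a 1 ms wide Gaussian is assumed here."""
    return math.exp(-(s / 1e-3) ** 2)

def y(x, t, a=1.0, b=0.5):
    """Sum of a forward-going and a backward-going traveling wave."""
    return a * pulse(t - x / C) + b * pulse(t + x / C)

def wave_equation_residual(x, t, dx=1e-3, dt=1e-6):
    """Central-difference estimate of d2y/dx2 - (1/c^2) d2y/dt2."""
    d2y_dx2 = (y(x + dx, t) - 2 * y(x, t) + y(x - dx, t)) / dx ** 2
    d2y_dt2 = (y(x, t + dt) - 2 * y(x, t) + y(x, t - dt)) / dt ** 2
    return d2y_dx2 - d2y_dt2 / C ** 2

# The residual should be tiny compared with either second derivative alone.
print(wave_equation_residual(0.1, 2e-4))
```

Any other smooth choice of `pulse` should behave the same way, since the argument only relies on the dependence on t - x/c and t + x/c.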


[Figure: uniform tube along x, excitation at x = 0, open end at x = L]

Uniform tube, source on one end, open on the other


c = fλ

Plane wave propagation for frequencies below ~4000 Hz
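As a rough illustration of the relation c = fλ, the sketch below computes wavelengths at a few frequencies. It assumes c = 340 m/s; the usual justification for the plane-wave limit (the wavelength staying long compared with the tube's cross-section) is an added remark and not something taken from the slide.

```python
def wavelength_m(freq_hz, c=340.0):
    """Wavelength in meters from c = f * lambda."""
    return c / freq_hz

for f in (100, 1000, 4000):
    print(f, "Hz ->", round(100 * wavelength_m(f), 1), "cm")
```

At 4000 Hz the wavelength is about 8.5 cm, which is still several times a tube width of a few centimeters.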


•By looking at the solutions to this equation, we can show that c is the speed of sound


Let $u^+(t - x/c) = A e^{j\omega(t - x/c)}$ and $u^-(t + x/c) = B e^{j\omega(t + x/c)}$

Excitation at x = 0: $e^{j\omega t}$

$u(0,t) = e^{j\omega t} = A e^{j\omega(t - 0/c)} - B e^{j\omega(t + 0/c)}$


$u(0,t) = e^{j\omega t} = A e^{j\omega(t - 0/c)} - B e^{j\omega(t + 0/c)}$

$p(L,t) = 0 = A e^{j\omega(t - L/c)} + B e^{j\omega(t + L/c)}$

Problem: Find A and B to match boundary conditions

Solve for A and B (eliminate t)

• Now you can get equation 10.24 in text, for excitation $U(\omega) e^{j\omega t}$:

$u(x,t) = \frac{\cos[\omega(L - x)/c]}{\cos(\omega L/c)} \, U(\omega) e^{j\omega t}$

Poles occur when:

$\omega = (2n + 1)\pi c / 2L$, i.e. $f = (2n + 1) c / 4L$

(upcoming homework problem)
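One way to carry out the "solve for A and B" step is sketched below. It works with the unit excitation used in the boundary condition at x = 0 and then scales the result by U(ω), which relies on linearity; the intermediate steps are a reconstruction and are not taken from the slides or the text.

```latex
\begin{align}
A - B &= 1
  && \text{(from } u(0,t) = e^{j\omega t}\text{)} \\
A e^{-j\omega L/c} + B e^{j\omega L/c} &= 0
  \;\Rightarrow\; B = -A e^{-2j\omega L/c}
  && \text{(from } p(L,t) = 0\text{)} \\
A &= \frac{1}{1 + e^{-2j\omega L/c}}
   = \frac{e^{j\omega L/c}}{2\cos(\omega L/c)},
  \qquad
B = -\frac{e^{-j\omega L/c}}{2\cos(\omega L/c)} \\
u(x,t) &= A e^{j\omega(t - x/c)} - B e^{j\omega(t + x/c)}
   = \frac{e^{j\omega(L - x)/c} + e^{-j\omega(L - x)/c}}{2\cos(\omega L/c)} \, e^{j\omega t}
   = \frac{\cos[\omega(L - x)/c]}{\cos(\omega L/c)} \, e^{j\omega t}
\end{align}
```

The denominator vanishes when ωL/c = (2n + 1)π/2, which gives the pole frequencies listed above.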


c = 340 m/s, L = 17 cm, 4L = 0.68 m, so f1 = c/4L = 500 Hz
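The same arithmetic extends to the higher resonances. The sketch below evaluates f = (2n + 1)c/4L for the first few values of n, using the slide's values c = 340 m/s and L = 17 cm; the function name and the choice of three modes are only for illustration.

```python
def tube_resonances(length_m, c=340.0, count=3):
    """Resonant frequencies f = (2n + 1) c / (4 L) for n = 0 .. count - 1."""
    return [(2 * n + 1) * c / (4.0 * length_m) for n in range(count)]

for n, f in enumerate(tube_resonances(0.17), start=1):
    print("f%d is about %.0f Hz" % (n, f))
```

For a 17 cm tube this gives resonances near 500, 1500 and 2500 Hz.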


First 3 modes of an acoustic tube open at one end


Effect of losses in the tube

•Upward shift in lower resonances

•Poles no longer on unit circle - peak values in frequency response are finite


Effect of nonuniformities in the tube

•Impedance mismatches cause reflections

•Can be modeled as a succession of smaller tubes

• Resonances move around - hence the different formants for different speech sounds
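To make the impedance-mismatch idea concrete, the sketch below computes the pressure reflection coefficient at a junction between two tube sections. It assumes lossless plane waves, a characteristic acoustic impedance of ρc/A for a section of area A, and an air density of 1.2 kg/m^3; these values and the function names are illustrative and are not taken from the slides.

```python
RHO = 1.2   # air density in kg/m^3 (assumed)
C = 340.0   # speed of sound in m/s

def tube_impedance(area_m2):
    """Characteristic acoustic impedance (pressure / volume velocity) of a section."""
    return RHO * C / area_m2

def pressure_reflection(area_in, area_out):
    """Pressure reflection coefficient for a wave going from area_in into area_out."""
    z_in = tube_impedance(area_in)
    z_out = tube_impedance(area_out)
    return (z_out - z_in) / (z_out + z_in)

print(pressure_reflection(1e-4, 1e-4))  # matched sections: no reflection
print(pressure_reflection(1e-4, 4e-4))  # widening: negative pressure reflection
print(pressure_reflection(4e-4, 1e-4))  # narrowing: positive pressure reflection
```

The coefficient reduces to (A_in - A_out) / (A_in + A_out), so only the area ratio matters. A chain of short sections with different areas then produces a reflection at every junction, which is what shifts the resonances.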


Acoustic reverberation

•Reflection vs absorption at room surfaces

•Effects tend to be more important than room modes for speech intelligibility

•Also very important for musical clarity, tone


[Equation lost in extraction; stated condition: uniformly distributed and diffuse]


Decay of intensity when source is shut off (W=0)


[Equation lost in extraction; surviving term: 4mV]


The phrase “two oh six” convolved with the impulse response of a room with a 0.5 second RT60
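The operation behind this example can be sketched in a few lines. The code below is not the recording from the slide: it assumes a synthetic impulse response (exponentially decaying noise whose envelope falls by 60 dB in 0.5 s), a short tone burst in place of the spoken phrase, and a deliberately low sample rate so the pure-Python convolution stays fast.

```python
import math
import random

def synthetic_room_ir(rt60_s, fs_hz, seed=0):
    """Decaying noise whose amplitude envelope drops 60 dB in rt60_s seconds."""
    rng = random.Random(seed)
    n = int(rt60_s * fs_hz)
    decay = math.log(1000.0) / rt60_s  # 60 dB is a factor of 1000 in amplitude
    return [rng.gauss(0.0, 1.0) * math.exp(-decay * k / fs_hz) for k in range(n)]

def convolve(x, h):
    """Direct linear convolution; output length is len(x) + len(h) - 1."""
    y = [0.0] * (len(x) + len(h) - 1)
    for i, xi in enumerate(x):
        for k, hk in enumerate(h):
            y[i + k] += xi * hk
    return y

fs = 2000  # Hz, kept low only to limit the loop count (assumption)
ir = synthetic_room_ir(0.5, fs)
dry = [math.sin(2 * math.pi * 200 * n / fs) for n in range(200)]  # 0.1 s tone burst
wet = convolve(dry, ir)
print(len(dry), len(ir), len(wet))
```

The reverberant signal is longer than the dry one by the length of the impulse response, which is the tail that smears one sound into the next.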


Initial time delay gap = t0


Measuring room responses

•Impulsive sounds

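•Correlation of mic input with random signal source (since R(x,y) = R(x,x) * h(t), where * denotes convolution; for a white source R(x,x) is close to an impulse, so R(x,y) approximates h(t))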

•Chirp input

•Also includes mic, speaker responses

•No single room response (also not really linear)
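The correlation method in the list above can be sketched as follows. Everything here is a toy setup chosen for illustration: the six-tap "room" response, the Gaussian white-noise source, and the 50000-sample length are assumptions, and a real measurement would also include the microphone and loudspeaker responses noted above.

```python
import random

def estimate_ir(x, y, n_taps):
    """Estimate h[k] as the cross-correlation R_xy[k] divided by the input power."""
    n = len(x)
    power = sum(v * v for v in x) / n
    return [sum(x[i] * y[i + k] for i in range(n)) / (n * power)
            for k in range(n_taps)]

rng = random.Random(1)
true_h = [1.0, 0.0, 0.5, 0.0, -0.3, 0.2]        # made-up response for the demo
x = [rng.gauss(0.0, 1.0) for _ in range(50000)]  # white-noise source signal

# Simulated microphone signal: the source convolved with the response.
y = [sum(true_h[m] * x[i - m] for m in range(len(true_h)) if 0 <= i - m < len(x))
     for i in range(len(x) + len(true_h) - 1)]

h_hat = estimate_ir(x, y, len(true_h))
print([round(v, 2) for v in h_hat])
```

Because the source is white, its autocorrelation is close to an impulse, so the cross-correlation comes out close to the response itself. The estimate gets noisier as the source gets shorter.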


Effects of reverb

•Increases loudness

•“Early” loudness increase helps intelligibility

•“Late” loudness increase hurts intelligibility

• When noise is present, ill effects compounded

• Even worse for machine algorithms


Dealing with reverb

•Microphone arrays - beamforming

•Reducing effects by subtraction/filtering

•Stereo mic transfer function

• Using robust features (for ASR especially)

• Statistical adaptation


Artificial reverberation

•Physical devices (springs, plate, etc.)

•Simple electronic delay with feedback

•FIR for early delays (think of “initial time delay gap” in concert halls), IIR for later decay

• Explicit convolution with stored response
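The "simple electronic delay with feedback" item can be illustrated with a feedback comb filter. This is a minimal sketch under stated assumptions, not a description of any particular reverberator: the function names are hypothetical, and the gain formula simply asks that the recirculating echo fall by 60 dB after the chosen RT60.

```python
def feedback_comb(x, delay, gain):
    """y[n] = x[n] + gain * y[n - delay]: a single delay line with feedback."""
    y = []
    for n, xn in enumerate(x):
        fed_back = gain * y[n - delay] if n >= delay else 0.0
        y.append(xn + fed_back)
    return y

def gain_for_rt60(delay, fs_hz, rt60_s):
    """Loop gain for which the echoes decay by 60 dB after rt60_s seconds."""
    return 10.0 ** (-3.0 * delay / (fs_hz * rt60_s))

impulse = [1.0] + [0.0] * 9
print(feedback_comb(impulse, delay=3, gain=0.5))
# prints [1.0, 0.0, 0.0, 0.5, 0.0, 0.0, 0.25, 0.0, 0.0, 0.125]

print(gain_for_rt60(delay=100, fs_hz=8000, rt60_s=0.5))
```

A single comb gives evenly spaced echoes. That regularity is one reason to combine a structure like this with an FIR section for the early delays, as in the FIR/IIR item above.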