Post on 12-Jan-2016


Acoustic impulse response measurement using speech and music signals

John Usher

Barcelona Media – Innovation Centre | Av. Diagonal, 177, planta 9, 08018 Barcelona

John Usher -- In-situ RIR measurement using music and speech


Using adaptive filters to estimate acoustic IRs

In-situ acquisition of electro-acoustic IR, with audience.

Continuous: fast enough for changing environment conditions.

Use speech and music signal radiated from loudspeaker.

AF for IR is nothing new! Used for:

Acoustic echo and feedback cancellation. Upmixing (2 → 5.1, 2 → 3D). ANC. Room EQ (using noise).


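Basic principle:

[Block diagram: Audio source (voice or music) → LS → Mic.; the source also feeds the Adaptive Filter (AF), whose output is subtracted (+/−) from the Mic. signal to give the error, which drives the AF update. Filter labels on the slide: h, h~.]

Adaptive Filter is updated to model the acoustic IR so that the error signal level (power) is minimized.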


Application for room EQ (filtered-x)

[Block diagram: Audio source (voice or music), EQ filter, RIR estimation, TD and FD smoothing, Homo. Deco., Filter inversion. Filter labels on the slide: h, h~.]


Localizing objects in a room

Emit speech warning from loudspeaker in room.

Extract RIR using adaptive filter.

Detect reflection onset timing, e.g. using running kurtosis.
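The onset-detection step can be sketched as follows. This is only an illustration of a running-kurtosis detector under stated assumptions: the window length, the threshold, and the function names `running_kurtosis` and `first_onset` are hypothetical and do not come from the slides. Plain Python is used so the sketch has no dependencies.

```python
import random


def running_kurtosis(h, win):
    """Kurtosis (m4 / m2^2) of every length-`win` window of the sequence h."""
    out = []
    for start in range(len(h) - win + 1):
        seg = h[start:start + win]
        mean = sum(seg) / win
        m2 = sum((v - mean) ** 2 for v in seg) / win
        m4 = sum((v - mean) ** 4 for v in seg) / win
        # A window with zero variance carries no onset information.
        out.append(m4 / (m2 * m2) if m2 > 0 else 0.0)
    return out


def first_onset(h, win, threshold):
    """Index of the newest sample in the first window whose kurtosis exceeds
    `threshold`, or None. The threshold is an assumed tuning parameter."""
    for start, k in enumerate(running_kurtosis(h, win)):
        if k > threshold:
            return start + win - 1
    return None


# Toy "estimated RIR": low-level noise with one isolated reflection.
random.seed(0)
ir = [random.uniform(-0.01, 0.01) for _ in range(600)]
ir[300] = 1.0
print(first_onset(ir, win=32, threshold=10.0))
```

An isolated peak makes the windowed sample distribution strongly non-Gaussian, so the kurtosis rises sharply as soon as the peak enters the window.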


Application for live sound: De-noising & spatial re-mixing

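[Block diagram: Audio source (voice or music) → LS → Mic.; Adaptive Filter (AF) with AF update (NLMS); the filter output is subtracted (+/−) from the Mic. signal to give the error. Signal labels: Clean audio signal (from the desk), Room signal, Audience signal (applause etc.). Filter labels on the slide: h, h~.]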


Filter update algorithm (NLMS):

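[Block diagram: x(n) → LS → Mic.; filter h(n) with output y(n); error e(n) drives the Update. Filter label on the slide: h~. The two numbered equations (1., 2.) were images on the slide.]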
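The slide's two equations did not survive extraction, so the sketch below uses the textbook NLMS form rather than the author's exact expressions: the error is e(n) = d(n) − y(n), and the taps are updated by μ·e(n)·x(n) divided by the input power in the filter window. The step size `mu`, the regulariser `eps` and the function name `nlms` are assumptions for illustration.

```python
import random


def nlms(x, d, num_taps, mu=0.5, eps=1e-8):
    """Textbook NLMS. x: reference (loudspeaker feed), d: microphone signal.

    Returns (h, y, e): final filter taps, filter output y(n), error e(n).
    """
    h = [0.0] * num_taps
    buf = [0.0] * num_taps              # most recent input sample first
    y, e = [], []
    for xn, dn in zip(x, d):
        buf = [xn] + buf[:-1]
        yn = sum(hk * bk for hk, bk in zip(h, buf))      # filter output
        en = dn - yn                                     # error signal
        power = sum(bk * bk for bk in buf) + eps         # normalisation term
        step = mu * en / power
        h = [hk + step * bk for hk, bk in zip(h, buf)]   # tap update
        y.append(yn)
        e.append(en)
    return h, y, e


# Toy check: identify a known 4-tap "room" from white noise, no background noise.
random.seed(0)
true_h = [0.5, -0.3, 0.2, 0.1]
x = [random.uniform(-1.0, 1.0) for _ in range(2000)]
d = [sum(true_h[k] * x[n - k] for k in range(len(true_h)) if n - k >= 0)
     for n in range(len(x))]
h_est, y, e = nlms(x, d, num_taps=4)
print([round(v, 3) for v in h_est])
```

Real room responses need thousands of taps, so a practical version would use a vectorised or block-based implementation; the per-sample loop here is only meant to show the update.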


Small-room experiment set-up:

A. Source is loudspeaker reproducing noise, speech or music. Multichannel noise from loudspeakers.

B. Source is live spoken voice. Predict IR between two lav. mics.

[Diagram labels: Audio source (voice or music); Lav. 1, Lav. 2; Noise signal (white noise or babble).]


Results

Error Criterion:

1) Start with reference RIR (measured using swept-sine technique).

2) Allow Adaptive Filter to converge for 10 seconds to get AF spectra.

3) Calculate misalignment: mean of the difference between the reference and AF spectra (80 Hz to 12 kHz).
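The misalignment formula itself was an image and is not in the transcript. One plausible reading, sketched below, is the mean absolute difference of the two magnitude spectra in dB over the stated band. The dB scale, the absolute value, the log-spaced frequency grid and all function names are assumptions made for this illustration.

```python
import cmath
import math


def magnitude_db(h, fs, freqs):
    """Magnitude response of impulse response h (in dB) at the given frequencies."""
    out = []
    for f in freqs:
        w = -2j * math.pi * f / fs
        resp = sum(hk * cmath.exp(w * k) for k, hk in enumerate(h))
        out.append(20.0 * math.log10(max(abs(resp), 1e-12)))
    return out


def misalignment_db(h_ref, h_af, fs, f_lo=80.0, f_hi=12000.0, num=200):
    """Mean absolute dB difference between reference and AF spectra.

    Assumed definition: evaluated on `num` log-spaced frequencies in [f_lo, f_hi].
    """
    ratio = (f_hi / f_lo) ** (1.0 / (num - 1))
    freqs = [f_lo * ratio ** i for i in range(num)]
    ref = magnitude_db(h_ref, fs, freqs)
    est = magnitude_db(h_af, fs, freqs)
    return sum(abs(a - b) for a, b in zip(ref, est)) / num


h_ref = [1.0, 0.5, -0.2]
h_af = [0.98, 0.52, -0.19]          # a slightly wrong estimate
print(misalignment_db(h_ref, h_af, fs=48000))
```

Under this definition an estimate that is correct except for a broadband gain error of a factor 2 scores about 6 dB, so the measure is sensitive to level as well as to spectral shape.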


Rate of Convergence


Comparison of filter spectra using noise, speech and music (high SNR):


Robustness to SNR (25, 12, 3 dB SNR):

Masker = noise.


Robustness to SNR:

Masker = babble.


Comparison with DCFFT:

Dual Channel FFT method:

Following AES reviewer recommendation, compared with commercial DCFFT system (“SMAART”).
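For readers unfamiliar with the dual-channel FFT method, the sketch below shows the generic estimator: the transfer function is the averaged cross-spectrum between reference and measurement divided by the averaged auto-spectrum of the reference. It is not a description of how the commercial system works internally; the Hann window, non-overlapping frames and the function names are assumptions chosen to keep the example short.

```python
import cmath
import math
import random


def dft(frame):
    """Naive one-sided DFT (bins 0 .. N/2); fine for short illustrative frames."""
    n = len(frame)
    return [sum(v * cmath.exp(-2j * math.pi * k * i / n) for i, v in enumerate(frame))
            for k in range(n // 2 + 1)]


def dual_channel_fft(x, y, frame_len):
    """Estimate H(k) = <conj(X) Y> / <|X|^2> over Hann-windowed frames.

    x: reference channel, y: measurement channel.
    """
    win = [0.5 - 0.5 * math.cos(2.0 * math.pi * i / frame_len) for i in range(frame_len)]
    bins = frame_len // 2 + 1
    gxy = [0j] * bins
    gxx = [0.0] * bins
    n = min(len(x), len(y))
    for start in range(0, n - frame_len + 1, frame_len):
        xf = dft([w * v for w, v in zip(win, x[start:start + frame_len])])
        yf = dft([w * v for w, v in zip(win, y[start:start + frame_len])])
        for k in range(bins):
            gxy[k] += xf[k].conjugate() * yf[k]     # cross-spectrum
            gxx[k] += abs(xf[k]) ** 2               # reference auto-spectrum
    return [gxy[k] / (gxx[k] + 1e-20) for k in range(bins)]


# Toy check: a pure gain of 0.5 should give H = 0.5 in every bin.
random.seed(1)
x = [random.uniform(-1.0, 1.0) for _ in range(256)]
y = [0.5 * v for v in x]
H = dual_channel_fft(x, y, frame_len=32)
print(abs(H[5]))
```

Averaging over frames is what gives the method its robustness to uncorrelated noise in the measurement channel, which is the property being compared against the adaptive filter here.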


Comparison of NLMS vs DCFFT:


Effectiveness of AF RIR acquisition method with long RIRs.

6 RIRs:

Obtained from Dirac fed into Altiverb.

(NB: No background noise simulated.)

Football stadium, Caen Cathedral, church, EMT plate, Filmorch. Stage Berlin, Castle.

RT60: 9.6-1.1 secs.

1.2, 2.3, 3.5, 6.0, 7.8, 9.6.


What happens if we just model the early part of the IR?

… Not much: most of the spectral detail is in the early part.

For longer IRs, the adaptive filter should be longer.

[Figure annotation: Longer RT]


Rate of Convergence for different RTs. 340 ms window, 32 x overlap.

[Figure annotation: Longer RT]
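To give a feel for the sizes involved, the snippet below converts the 340 ms window into samples and a hop size. Two things are assumed here and not stated on the slide: a 48 kHz sample rate, and that "32 x overlap" means the window advances by 1/32 of its length per update. The function name is hypothetical.

```python
def window_and_hop(window_ms, overlap_factor, fs):
    """Return (window length, hop size) in samples.

    Assumes the hop is window / overlap_factor, which is one reading of "32 x overlap".
    """
    window = round(window_ms * fs / 1000.0)
    hop = window // overlap_factor
    return window, hop


print(window_and_hop(340, 32, 48000))   # (16320, 510)
```

Under these assumptions the filter has 16320 taps and is updated every 510 samples, i.e. roughly every 10.6 ms.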


Conclusions:

RIR acquisition for small and large rooms:

Adaptive filter updated using NLMS and overlapped window.

Tested with RT60 = 0.5 to 10 secs.

Using music, speech and noise as excitation signals.

Less accurate using live voice and two mics.

Convergence in <3 sec. (<2 dB mean error).

Little change in performance with SNRs down to 0 dB.


Conclusions:

Music vs speech:

Music: AF matches RIR 60 Hz to 12 kHz.

Speech: AF matches RIR 100 Hz to 8 kHz.

No considerable improvement for filter sizes >340 ms, i.e. we only need to model the first 1/8th of the RIR to have a good approximation of the spectrum.

Adaptive whitening algorithm (LPC residuals) can speed up convergence for highly coloured signals, but only at low SNRs.
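The whitening idea can be illustrated with a fixed, block-based LPC residual, shown below. The slides refer to an adaptive whitening algorithm whose details are not given, so this is a simplified stand-in: the predictor order, the use of Levinson-Durbin on one block of signal, and the function names are assumptions.

```python
import random


def lpc(x, order):
    """Levinson-Durbin recursion on the autocorrelation of x.

    Returns prediction-error filter coefficients a[0..order] with a[0] = 1.
    """
    n = len(x)
    r = [sum(x[i] * x[i + lag] for i in range(n - lag)) for lag in range(order + 1)]
    a = [1.0] + [0.0] * order
    err = r[0]
    for m in range(1, order + 1):
        acc = r[m] + sum(a[k] * r[m - k] for k in range(1, m))
        k_m = -acc / err                      # reflection coefficient
        prev = a[:]
        for k in range(1, m):
            a[k] = prev[k] + k_m * prev[m - k]
        a[m] = k_m
        err *= 1.0 - k_m * k_m                # remaining prediction error
    return a


def lpc_residual(x, order):
    """Filter x with its own prediction-error filter to flatten its spectrum."""
    a = lpc(x, order)
    return [sum(a[k] * x[n - k] for k in range(order + 1) if n - k >= 0)
            for n in range(len(x))]


# Toy coloured signal: first-order autoregressive noise.
random.seed(2)
x = [0.0]
for _ in range(2999):
    x.append(0.9 * x[-1] + random.uniform(-1.0, 1.0))
print(round(lpc(x, 1)[1], 2))
res = lpc_residual(x, 1)
```

The residual has a much flatter spectrum than the input, which is the property that helps gradient-type adaptive filters converge faster on strongly coloured excitation such as speech.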


Applications:

· In-situ continuous room EQ using filtered-x approach.

· Object localization using speech message (e.g. using running kurtosis).

· Re-mixing live music: ambient sound separation using filter output and error signal (e.g. get clean signal + room ambiance + audience applause).
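The re-mixing application can be sketched as a simple split of the microphone signal once a room filter estimate is available: the filter output approximates the room signal, and the error (microphone minus filter output) is what remains, such as applause. The function names and the toy data below are hypothetical; in practice the filter would come from the converged adaptive filter.

```python
def convolve(h, x):
    """Causal FIR filtering of x with taps h, truncated to len(x)."""
    y = [0.0] * len(x)
    for n in range(len(x)):
        for k in range(min(len(h), n + 1)):
            y[n] += h[k] * x[n - k]
    return y


def split_mic_signal(h_est, desk, mic):
    """Return (room, residual): filter output and error signal.

    desk: clean signal from the desk, mic: room microphone signal.
    """
    room = convolve(h_est, desk)
    residual = [m - r for m, r in zip(mic, room)]
    return room, residual


# Toy data: the microphone picks up the filtered desk signal plus "audience" sound.
h_est = [0.5, 0.25]
desk = [1.0, 0.0, 0.5, -0.25]
audience = [0.1, -0.1, 0.05, 0.0]
mic = [r + a for r, a in zip(convolve(h_est, desk), audience)]
room, residual = split_mic_signal(h_est, desk, mic)
print(residual)
```

With a perfect filter estimate the residual equals the audience signal; any misalignment in the estimate leaks some of the programme material into it.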


Cheers!

John Usher
