
Digital Audio Signal Processing
Lecture 6: Reverberation & Dereverberation
Toon van Waterschoot / Marc Moonen
Dept. E.E./ESAT-STADIUS, KU Leuven
[email protected] [email protected]

Outline
- Introduction
  - Problem statement
  - Application scenarios
- Room acoustics
- Dereverberation
- Method 1: Beamforming
- Method 2: Speech enhancement
- Method 3: Blind system identification & inversion
- Conclusion & open issues

Introduction: Problem statement
- Clean sound > Room acoustics > Reverberant sound
  - desired: music example [clean] [reverberant]
  - undesired: speech example [clean] [reverberant] [very reverberant]
- Reverberation has a desired or undesired impact on sound quality and speech intelligibility
- Research problems:
  - artificial reverberation synthesis
  - reverberation control/enhancement
  - dereverberation

Introduction: Application scenarios
Scenario-1: Sound reproduction
- goal: sound control in the acoustic environment (improved listening comfort/experience for the audience)
- preprocessing strategy
- single-point > multiple-point > area (increasingly difficult)
- applications: public address, home/automotive audio systems
Note: in a sound reproduction scenario, dereverberation is often referred to as equalization

Introduction: Application scenarios
Scenario-2: Sound acquisition
- goal: sound control in the electric environment (improved sound quality of microphone recordings)
- postprocessing strategy
- single-microphone > multi-microphone
- applications: speech recognition, hearing aids, recording
Note: in contrast to AEC/AFC problems, the (de)reverberation problem is not related to the concurrent use of loudspeakers and microphones in the same acoustic environment

Outline
- Introduction
- Room acoustics
- Dereverberation
- Method 1: Beamforming
- Method 2: Speech enhancement
- Method 3: Blind system identification & inversion
- Conclusion & open issues

Room acoustics: Overview
- Acoustic waves
- Key characteristics
- Non-parametric models
  - Finite difference method
  - Finite/boundary element method
  - Image source method
  - Ray tracing method
- Parametric models
  - (Digital waveguide mesh)
  - Impulse response
  - Room transfer function
  - Pole-zero model

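Room acoustics: Acoustic waves
Acoustic wave equation
- a valid sound field always satisfies $\nabla^2 p(\mathbf{r},t) - \frac{1}{c^2} \frac{\partial^2 p(\mathbf{r},t)}{\partial t^2} = 0$
  - $p(\mathbf{r},t)$ = sound pressure (function of space and time)
  - $c$ = speed of sound
  - $\nabla^2 = \frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2} + \frac{\partial^2}{\partial z^2}$ is the Laplacian operator (Cartesian coordinates)
- subject to boundary conditions
  - example, rigid wall: $\partial p / \partial n = 0$ (zero normal derivative of the pressure at the wall)
  - single point source: inhomogeneous wave equation with a source term at the source position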

Room acoustics: Acoustic waves
Acoustic wave equation > Helmholtz equation
- obtained from the acoustic wave equation by applying a Fourier transform over the time variable: $\nabla^2 P(\mathbf{r},\omega) + k^2 P(\mathbf{r},\omega) = 0$ (*)
- compose the sound field as a sum of room modes
(*) $k = \omega / c$ is the wave number
Example: 2-D room, 6 x 10 m, rigid walls
- mode 1: 17.1 Hz = 0.5*(343 m/s)/(10 m)
- mode 2: 28.5 Hz = 0.5*(343 m/s)/(6 m)
- mode 3 (1&2): 33.3 Hz = sqrt((17.1)^2+(28.5)^2)
- mode 4: 34.3 Hz = (343 m/s)/(10 m)
- mode 5 (2&4): 44.6 Hz = sqrt((28.5)^2+(34.3)^2)
[Figure: pressure patterns of modes 1 to 5]
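The mode frequencies in the 2-D example can be checked numerically. The sketch below assumes the standard rigid-wall formula f = (c/2)·sqrt((nx/Lx)^2 + (ny/Ly)^2); the function names and the `max_index` search limit are illustrative choices, not part of the lecture.

```python
import math


def mode_frequency(nx, ny, lx, ly, c=343.0):
    """Resonance frequency (Hz) of mode (nx, ny) in a 2-D rigid-walled room."""
    return 0.5 * c * math.sqrt((nx / lx) ** 2 + (ny / ly) ** 2)


def lowest_modes(lx, ly, count, c=343.0, max_index=10):
    """Return the `count` lowest non-trivial modes as (frequency, nx, ny)."""
    modes = [
        (mode_frequency(nx, ny, lx, ly, c), nx, ny)
        for nx in range(max_index + 1)
        for ny in range(max_index + 1)
        if (nx, ny) != (0, 0)  # skip the constant-pressure (0 Hz) solution
    ]
    return sorted(modes)[:count]


if __name__ == "__main__":
    # 6 x 10 m room from the slide: lx = 10 m, ly = 6 m
    for freq, nx, ny in lowest_modes(10.0, 6.0, 5):
        print(f"mode ({nx},{ny}): {freq:.2f} Hz")
```

The five lowest modes come out in the same order as on the slide; mode 5 combines the second axial mode along the 10 m side (34.3 Hz) with the first axial mode along the 6 m side.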

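Room acoustics: Key characteristics
- Reverberation time (Sabine's formula): $T_{60} \approx 0.161\, V / (\bar{\alpha} S)$ (SI units)
  - $V$ = room volume, $S$ = total surface area of room
  - $\bar{\alpha}$ = average absorption coefficient of surfaces (*)
  - time needed for 60 dB squared sound pressure decay
- Critical distance: $d_c = \sqrt{Q R / (16 \pi)}$
  - $Q$ = source directivity, $R = \bar{\alpha} S / (1 - \bar{\alpha})$ = room constant
  - distance at which direct = reverberant sound energy
- Direct-to-reverberant ratio: $\mathrm{DRR} = Q R / (16 \pi d^2)$
  - $d$ = source-observer distance
  - ratio of direct vs. reverberant sound energy
(*) $0 \le \bar{\alpha} \le 1$: 0 for rigid wall (mirror), 1 for open window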
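A small numerical sketch of these three quantities follows. It assumes the textbook statistical-room-acoustics expressions (Sabine constant 0.161 s/m, room constant R = αS/(1-α), DRR = (d_c/d)^2); the slide's own formulas did not survive extraction, so the exact forms and all function names here are assumptions.

```python
import math


def sabine_t60(volume, surface, alpha):
    """Sabine reverberation time (s) for volume (m^3), surface (m^2), mean absorption alpha."""
    return 0.161 * volume / (alpha * surface)


def room_constant(surface, alpha):
    """Room constant R = alpha * S / (1 - alpha), in m^2."""
    return alpha * surface / (1.0 - alpha)


def critical_distance(surface, alpha, q=1.0):
    """Distance (m) at which direct and reverberant energy are equal; q = source directivity."""
    return math.sqrt(q * room_constant(surface, alpha) / (16.0 * math.pi))


def drr_db(distance, surface, alpha, q=1.0):
    """Direct-to-reverberant ratio in dB at a given source-observer distance."""
    return 20.0 * math.log10(critical_distance(surface, alpha, q) / distance)


if __name__ == "__main__":
    # Hypothetical 6 x 10 x 3 m room with mean absorption 0.2
    volume, surface, alpha = 180.0, 216.0, 0.2
    print(f"T60 = {sabine_t60(volume, surface, alpha):.2f} s")
    print(f"d_c = {critical_distance(surface, alpha):.2f} m")
    print(f"DRR at 2 m = {drr_db(2.0, surface, alpha):.1f} dB")
```

In this model the DRR drops by about 6 dB per doubling of distance and is 0 dB exactly at the critical distance.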

Room acoustics: Non-parametric models (1)
Finite difference time domain (FDTD) method
- spatio-temporal sampling on a regular grid
- partial derivatives (spatial & temporal) in the wave equation are approximated by finite difference operators
- result: FDTD wave equation with boundary conditions
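As an illustration of the FDTD idea, here is a minimal 1-D sketch, not the scheme from the slide. It assumes the standard leapfrog update with Courant number λ = cΔt/Δx, rigid walls implemented with mirrored ghost nodes, and an initial pressure impulse with zero initial velocity; all names are illustrative.

```python
def fdtd_1d(num_nodes, num_steps, source_node, courant=1.0):
    """Pressure field after `num_steps` leapfrog updates of the 1-D wave equation.

    Rigid walls at both ends (zero normal derivative, via mirrored ghost nodes).
    Stability requires courant <= 1.
    """
    lam2 = courant ** 2

    def laplacian(p, i):
        left = p[i - 1] if i > 0 else p[1]                       # ghost node mirrors p[1]
        right = p[i + 1] if i < num_nodes - 1 else p[num_nodes - 2]
        return left - 2.0 * p[i] + right

    prev = [0.0] * num_nodes
    prev[source_node] = 1.0                                      # initial pressure impulse
    if num_steps == 0:
        return prev
    # First step for zero initial velocity: p1 = p0 + (lam^2 / 2) * Laplacian(p0)
    curr = [prev[i] + 0.5 * lam2 * laplacian(prev, i) for i in range(num_nodes)]
    for _ in range(num_steps - 1):
        nxt = [
            2.0 * curr[i] - prev[i] + lam2 * laplacian(curr, i)
            for i in range(num_nodes)
        ]
        prev, curr = curr, nxt
    return curr


if __name__ == "__main__":
    field = fdtd_1d(num_nodes=21, num_steps=5, source_node=10)
    print([round(v, 2) for v in field])
```

With `courant=1.0` the impulse splits into two half-amplitude pulses that travel one node per step, and the pressure doubles momentarily when a pulse hits a rigid wall.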

Room acoustics: Non-parametric models (2)
Finite element method (FEM)
- 4-step procedure to discretize the boundary value problem:
  1. weak formulation of boundary value problem
  2. integration by parts to relax differentiability requirements
  3. subspace approximation of field and source functions
  4. enforce orthogonality of approximation error to subspace
- subspace approximation relies on FEM basis functions:
  - defined on arbitrarily constructed tetrahedral mesh
  - having small spatial support
- result: FEM wave equation
Boundary element method (BEM)
- numerical approximation of Green's function
(Skip this part)

Room acoustics: Non-parametric models (3)
Ray tracing method
- sound waves are represented by rays
- assumption of specular reflections (no diffraction), i.e. mirror-like reflection in which a ray from a single incoming direction is reflected into a single outgoing direction
- rays can be traced from sound source to observer

Room acoustics: Non-parametric models (4)
Image source method
- reflections are modeled as direct rays from an image source
- image sources = virtual sources located outside the room
- multiple reflections are modeled as high-order image sources
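The image source idea can be sketched for first-order reflections in a 2-D rectangular room spanning [0, lx] x [0, ly]. The 1/d amplitude law, unit reflection coefficients, and all function names are simplifying assumptions made for this example.

```python
import math


def first_order_images(source, lx, ly):
    """Mirror the source across the four walls of the room [0, lx] x [0, ly]."""
    x, y = source
    return [(-x, y), (2.0 * lx - x, y), (x, -y), (x, 2.0 * ly - y)]


def arrivals(source, observer, lx, ly, c=343.0):
    """Sorted (delay in s, amplitude) pairs for the direct path and first-order reflections.

    Each reflection is treated as a direct ray from its image source.
    Assumes 1/d spreading and perfectly reflecting walls.
    """
    paths = [source] + first_order_images(source, lx, ly)
    result = []
    for position in paths:
        distance = math.dist(position, observer)
        result.append((distance / c, 1.0 / distance))
    return sorted(result)


if __name__ == "__main__":
    for delay, amplitude in arrivals((2.0, 3.0), (5.0, 3.0), 10.0, 6.0):
        print(f"delay {1000.0 * delay:.2f} ms, amplitude {amplitude:.3f}")
```

Higher-order image sources follow by mirroring the images again, which is why the number of sources grows quickly with reflection order.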

Room acoustics: Parametric models (1)
Impulse response
- room response to a gunshot source (impulse function)
- conceptually simple model, straightforward interpretation
- poor modeling efficiency (~10^3 parameters), high spatial variation
[Figure: impulse response showing direct coupling, early reflections, and diffuse sound field]

Room acoustics: Parametric models (2)
Room transfer function (RTF)
- assumptions: shoe-box shaped room / rigid walls
- assumed-modes solution of the Helmholtz equation, with terms:
  - set of (non-negligible) room modes
  - resonance frequency of m-th mode
  - damping factor of m-th mode
  - eigenfunction of m-th mode
  - normalization constant of m-th mode

Room acoustics: Parametric models (3)
Pole-zero model
- RTF suggests the use of a pole-zero model
- RTF denominator is independent of source/observer positions
- model terms: gain factor, minimum-phase zeros, non-minimum-phase zeros, common acoustical poles
- special cases:
  - all-zero model = impulse response
  - all-pole model: represents room resonances only

Outline
- Introduction
- Room acoustics
- Dereverberation
  - Problem statement
  - Overview of dereverberation methods
- Method 1: Beamforming
- Method 2: Speech enhancement
- Method 3: Blind system identification & inversion
- Conclusion & open issues

Dereverberation: problem & overview
PS: measurement noise not considered
- Reverberation as an additive signal degradation
  - Method 1: beamforming approach to dereverberation: spatial separation of clean and reverberant sound
  - Method 2: speech enhancement approach to dereverberation: transform-domain separation of clean and reverberant sound
- Reverberation as a convolutive signal degradation
  - Method 3: blind system identification and inversion approach to dereverberation: deconvolution of reverberant sound

Outline
- Introduction
- Room acoustics
- Dereverberation
- Method 1: Beamforming
  - fixed beamforming
  - adaptive beamforming
- Method 2: Speech enhancement
- Method 3: Blind system identification & inversion
- Conclusion & open issues

Method 1: Introduction
- concept: spatial separation of direct and reverberant sound (cf. multi-microphone noise reduction)
- difficulties compared to noise reduction:
  - spatial separation of direct sound and room reflections requires knowledge of reflection DOAs (~ room acoustics model)
  - reverberant sound is diffuse (comes from "all possible" directions, including the source direction)
- two distinct approaches:
  - fixed delay-and-sum beamformer
  - adaptive filter-and-sum beamformer

Method 1: Fixed DSB
- fixed DSB structure (cf. Topic-2, Lecture-2)
- fixed DSB = matched filter (maximizing WNG) in the case of spatially white noise (not entirely true for reverberation!)
- assumes a known sound source position
- assumes ideal omni-directional microphones
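A delay-and-sum beamformer can be sketched as follows, assuming a known source position, free-field propagation, and delays rounded to whole samples (a real implementation would use fractional delays). The function names and the integer-delay simplification are assumptions for illustration.

```python
import math


def steering_delays(source, mics, fs, c=343.0):
    """Integer sample delays of each microphone relative to the closest one."""
    distances = [math.dist(source, mic) for mic in mics]
    reference = min(distances)
    return [round((d - reference) / c * fs) for d in distances]


def delay_and_sum(signals, delays):
    """Time-align the microphone signals by advancing each by its delay, then average."""
    length = min(len(x) - d for x, d in zip(signals, delays))
    num_mics = len(signals)
    return [
        sum(x[n + d] for x, d in zip(signals, delays)) / num_mics
        for n in range(length)
    ]


if __name__ == "__main__":
    clean = [0.0, 1.0, 0.0, -1.0, 0.5]
    delays = [0, 2, 3]
    # Simulated direct-path microphone signals: the clean signal delayed per microphone
    mic_signals = [[0.0] * d + clean for d in delays]
    print(delay_and_sum(mic_signals, delays))
```

The direct sound adds coherently after alignment, while reflections arriving from other directions are misaligned and partially average out.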

Method 1: Fixed DSB
- expected DRR improvement of fixed DSB
  - expressed in terms of: source to m-th microphone distance, wave number, m-th microphone position vector
  - computed using statistical room acoustics (SRA) (with the assumption that direct & (diffuse) reverberant components are uncorrelated, etc.)
- depends on source-array distance + microphone separation
- independent of reverberation time (!) (cf. improvement of DRR)

Method 1: Adaptive FSB
- adaptive FSB structure (cf. Topic-2, Lecture-2)
- optimal solution (matched filter) depends on room model
  ~ blind system identification & inversion (cf. below)

Outline
- Introduction
- Room acoustics
- Dereverberation
- Method 1: Beamforming
- Method 2: Speech enhancement
  - cepstrum-based
  - LPC-based
  - spectrum-based
- Method 3: Blind system identification & inversion
- Conclusion & open issues

Method 2: Introduction
- concept: enhancement of reverberant speech by modeling & reducing reverberant sound in a transform domain
- applicable to single- & multi-microphone sound acquisition
- choice of transform domain results in three approaches:
  - cepstrum-based
  - LPC-based
  - spectrum-based

Method 2: Cepstrum-based
- concept: convolution in time domain ~ addition in cepstral (*) domain
  - reverberation can be subtracted in cepstral domain
- cepstral subtraction: speech = low-quefrency, room acoustics = high-quefrency
- processing chain: cepstral analysis > cepstral subtraction > cepstral synthesis
(*) use complex cepstrum (= invertible)
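The "convolution becomes addition" step can be written out as a short derivation. The symbols $s$, $h$, $y$ (clean speech, room impulse response, reverberant speech) and the hat notation for the complex cepstrum are notational assumptions, since the slide's own notation is not available.

```latex
\begin{align*}
y(t) &= s(t) * h(t)
  && \text{(reverberation as convolution)} \\
Y(\omega) &= S(\omega)\, H(\omega)
  && \text{(Fourier transform)} \\
\log Y(\omega) &= \log S(\omega) + \log H(\omega)
  && \text{(complex logarithm)} \\
\hat{y}(\tau) &= \hat{s}(\tau) + \hat{h}(\tau)
  && \text{(inverse Fourier transform: complex cepstra add)}
\end{align*}
```

Subtracting an estimate of $\hat{h}(\tau)$ and inverting the three steps then yields an estimate of $s(t)$, which is why an invertible (complex) cepstrum is needed.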

Method 2: LPC-based
- linear predictive coding of reverberant speech:
  - reverberation hardly affects speech LPC coefficients
  - reverberation largely affects LPC residual
- dereverberation reduces to LPC residual enhancement
  - based on knowledge of speech production process + spatial averaging (using multiple microphones)
- processing chain: LPC analysis > LPC residual enhancement > LPC synthesis (using the LPC coefficients)
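The LPC analysis step (coefficients plus residual) can be sketched with the autocorrelation method and the Levinson-Durbin recursion. This is a generic textbook sketch, not the lecture's implementation; the sign convention A(z) = 1 + a_1 z^-1 + ... and the function names are assumptions, and the residual enhancement stage itself is not shown.

```python
def autocorr(x, max_lag):
    """Biased autocorrelation sums r[0..max_lag] of a finite signal."""
    return [
        sum(x[n] * x[n - k] for n in range(k, len(x)))
        for k in range(max_lag + 1)
    ]


def levinson_durbin(r, order):
    """Solve the LPC normal equations; returns (a, error) with a[0] = 1."""
    a = [1.0] + [0.0] * order
    err = r[0]
    for i in range(1, order + 1):
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err                      # reflection coefficient
        new_a = a[:]
        for j in range(1, i):
            new_a[j] = a[j] + k * a[i - j]
        new_a[i] = k
        a = new_a
        err *= 1.0 - k * k                  # remaining prediction error energy
    return a, err


def lpc_residual(x, a):
    """Prediction residual e[n] = sum_j a[j] * x[n - j] (FIR filtering with A(z))."""
    return [
        sum(a[j] * x[n - j] for j in range(len(a)) if n - j >= 0)
        for n in range(len(x))
    ]


if __name__ == "__main__":
    # Toy signal: impulse response of the all-pole filter 1 / (1 - 0.9 z^-1)
    x = [0.9 ** n for n in range(200)]
    a, _ = levinson_durbin(autocorr(x, 2), 2)
    print([round(c, 4) for c in a])
    print([round(e, 4) for e in lpc_residual(x, a)[:4]])
```

For the toy signal the residual collapses to the exciting impulse; in the dereverberation setting it is this residual that carries most of the reverberation and is enhanced before resynthesis.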

Method 2: Spectrum-based
- concept: late reverberation ~ (broadband) additive noise
- spectral subtraction: estimate noise energy & compute subtractive gain function
- spectral subtraction assumes noise stationarity (cf. Lecture-3): not valid for reverberation!
- estimation of "noise" energy based on a statistical model for late reverberation
- processing chain: TF analysis > spectral subtraction (driven by late reverberation energy estimator) > TF synthesis
Note: straightforwardly extendable to combined dereverberation & noise suppression
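One common way to realize the late reverberation energy estimator is an exponential-decay model in which the late reverberant energy is a delayed, attenuated copy of the observed energy. The sketch below assumes that model (decay rate 3·ln(10)/T60), a known T60, and a floored power-subtraction gain; the slide does not specify its estimator, so treat every detail here, including the function names, as an assumption.

```python
import math


def late_reverb_psd(psd_frames, t60, frame_shift, late_delay):
    """Estimate late reverberant energy per frame for one frequency bin.

    psd_frames: observed short-time energies, one value per frame.
    t60, frame_shift, late_delay: in seconds.
    """
    decay_rate = 3.0 * math.log(10.0) / t60          # energy falls 60 dB in t60
    lag = round(late_delay / frame_shift)            # delay expressed in frames
    attenuation = math.exp(-2.0 * decay_rate * lag * frame_shift)
    return [
        attenuation * psd_frames[l - lag] if l >= lag else 0.0
        for l in range(len(psd_frames))
    ]


def subtraction_gain(psd_observed, psd_late, floor=0.1):
    """Amplitude gain of power spectral subtraction, limited from below by `floor`."""
    if psd_observed <= 0.0:
        return floor
    return max(math.sqrt(max(1.0 - psd_late / psd_observed, 0.0)), floor)


if __name__ == "__main__":
    observed = [1.0] * 10
    late = late_reverb_psd(observed, t60=0.6, frame_shift=0.01, late_delay=0.05)
    gains = [subtraction_gain(y, r) for y, r in zip(observed, late)]
    print([round(g, 3) for g in gains])
```

Because the estimate tracks the observed energy frame by frame, it does not rely on the stationarity assumption of conventional noise spectral subtraction.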

Outline
- Introduction
- Room acoustics
- Dereverberation
- Method 1: Beamforming
- Method 2: Speech enhancement
- Method 3: Blind system identification & inversion
  - all-zero model identification & inversion
  - all-pole model identification & inversion
- Conclusion & open issues

Method 3: Introduction
- concept: two-step procedure
  - step 1: identify room model (source > multiple microphones)
  - step 2: invert room model
- highly non-trivial difficulties:
  - source signal unknown > blind identification
  - (non-)invertibility of room model
  - model inversion sensitive to identification & numerical errors
- two approaches based on different room models:
  - all-zero model
  - all-pole model

Method 3: Blind system identification
- starting point: cross-relation error / nullifying filters
- batch identification using EVD/SVD
  - vector of stacked & filtered RIRs lies in null space of microphone array covariance matrix
  - filters denote erroneous zeros (which can be removed)
- zeros common to all RIRs cannot be identified
- high & unknown RIR order / poor conditioning
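The cross-relation underlying this approach is that, for two microphones, x1 * h2 = x2 * h1 because both sides equal s * h1 * h2. The sketch below only verifies this identity for a toy two-channel case; it does not perform the EVD/SVD-based identification, and all names and signal values are illustrative.

```python
def conv(a, b):
    """Full linear convolution of two finite sequences."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out


def cross_relation_error(x1, x2, h1, h2):
    """Error signal x1 * h2 - x2 * h1; zero when (h1, h2) match the true RIRs."""
    return [u - v for u, v in zip(conv(x1, h2), conv(x2, h1))]


if __name__ == "__main__":
    source = [1.0, -2.0, 3.0, 0.0, 1.0]          # unknown in a blind setting
    h1 = [1.0, 2.0, 0.0, 1.0]                    # toy RIR, source to microphone 1
    h2 = [2.0, -1.0, 1.0, 0.0]                   # toy RIR, source to microphone 2
    x1, x2 = conv(source, h1), conv(source, h2)
    print(cross_relation_error(x1, x2, h1, h2))              # all zeros
    print(cross_relation_error(x1, x2, h1, [2.0, 0.0, 1.0, 0.0]))  # non-zero
```

Blind identification searches for the filter pair that minimizes the energy of this error over the observed data, which only determines the RIRs up to a common scaling and up to any zeros they share.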

Method 3: Blind system identification
PS: vector of stacked & filtered RIRs lies in null space of microphone array covariance matrix

Method 3: Blind system identification
PS: zeros C(z) common to all RIRs cannot be identified
[Figure: source S(z) followed by the common factor C(z)]

Method 3: Inversion
Multiple-input/output inverse theorem (MINT)
- exact solution exists if the RIRs have no common zeros and the inverse filters are sufficiently long
- poor conditioning for near-common zeros
- inversion sensitive to system identification errors
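A two-channel MINT inversion can be sketched as a square linear system: find g1, g2 such that h1 * g1 + h2 * g2 is a unit impulse. The sketch assumes known, equal-length RIRs without common zeros and uses inverse filters one tap shorter than the RIRs; the plain Gaussian elimination solver and all names are choices made to keep the example dependency-free.

```python
def conv(a, b):
    """Full linear convolution of two finite sequences."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out


def conv_matrix(h, num_cols):
    """Convolution (Toeplitz) matrix so that matrix @ g equals conv(h, g)."""
    rows = len(h) + num_cols - 1
    return [
        [h[r - c] if 0 <= r - c < len(h) else 0.0 for c in range(num_cols)]
        for r in range(rows)
    ]


def solve(a, b):
    """Solve a square linear system by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[r][k] -= factor * aug[col][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (aug[r][n] - sum(aug[r][k] * x[k] for k in range(r + 1, n))) / aug[r][r]
    return x


def mint_two_channel(h1, h2):
    """Inverse filters (g1, g2) with conv(h1, g1) + conv(h2, g2) = unit impulse."""
    lg = len(h1) - 1                               # makes the stacked system square
    system = [r1 + r2 for r1, r2 in zip(conv_matrix(h1, lg), conv_matrix(h2, lg))]
    target = [1.0] + [0.0] * (len(system) - 1)
    g = solve(system, target)
    return g[:lg], g[lg:]


if __name__ == "__main__":
    h1 = [1.0, 0.5, 0.2]
    h2 = [1.0, -2.5, 1.0]                          # non-minimum-phase toy RIR
    g1, g2 = mint_two_channel(h1, h2)
    total = [u + v for u, v in zip(conv(h1, g1), conv(h2, g2))]
    print([round(v, 6) for v in total])
```

The system matrix becomes singular when the two RIRs share a zero and ill-conditioned when they nearly do, which is the conditioning problem noted on the slide.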

Method 3: Inversion
Matched filtering
- can be interpreted as multiple-beam beamformers, having beams in the direction of the direct sound and 1st-order reflections
  (note that the overall response has a peak at time = 0, corresponding to a constructive addition of all multi-path components)
- matched filter = non-causal filter > pre-echo effect (can be alleviated by filter truncation)
[Figure: overall response with pre-echo]
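The peak at time 0 and the pre-echo can be seen in a single-channel toy example, where the matched filter is the time-reversed RIR and the overall response is the RIR autocorrelation. The multi-channel case sums such responses over microphones; the single-channel restriction, the toy RIR, and the function names are assumptions of this sketch.

```python
def conv(a, b):
    """Full linear convolution of two finite sequences."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out


def matched_filter(h):
    """Time-reversed RIR (non-causal in principle; here delayed by len(h) - 1 samples)."""
    return h[::-1]


if __name__ == "__main__":
    h = [1.0, 0.5, -0.3, 0.2]
    response = conv(h, matched_filter(h))
    center = len(h) - 1                    # corresponds to time = 0
    print("response:", [round(v, 2) for v in response])
    print("peak at index", center, "=", round(response[center], 2))
    print("pre-echo:", [round(v, 2) for v in response[:center]])
```

The samples before the center are the pre-echo: they mirror the post-echo and are audible because they precede the main peak.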

Method 3: All-pole model
- starting point: all-pole model with common acoustical poles
- a priori identification of all-pole model:
  - multi-channel LPC of estimated RIRs
  - spatial averaging of single-channel LPC coefficients
- model inversion > fixed FIR filter (!)
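The last point follows because the inverse of an all-pole model 1/A(z) is the FIR filter A(z). The sketch below illustrates this with a hypothetical second-order A(z); the coefficients and function names are assumptions chosen for the example.

```python
def all_pole_filter(s, a):
    """Filter s through 1 / A(z), with A(z) = 1 + a[1] z^-1 + ... (a[0] must be 1)."""
    y = []
    for n, sample in enumerate(s):
        feedback = sum(a[j] * y[n - j] for j in range(1, len(a)) if n - j >= 0)
        y.append(sample - feedback)
    return y


def fir_filter(x, a):
    """Filter x through the FIR filter A(z), i.e. the exact inverse of 1 / A(z)."""
    return [
        sum(a[j] * x[n - j] for j in range(len(a)) if n - j >= 0)
        for n in range(len(x))
    ]


if __name__ == "__main__":
    a = [1.0, -1.2, 0.72]                      # hypothetical common acoustical poles
    clean = [1.0, 0.0, -0.5, 0.25, 0.0, 2.0]
    reverberant = all_pole_filter(clean, a)    # room resonances applied
    restored = fir_filter(reverberant, a)      # fixed FIR inverse
    print([round(v, 6) for v in restored])
```

Since the common poles do not depend on source or microphone position, the same FIR inverse can stay fixed when those positions change, although it only removes the resonant part of the room response.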

Conclusion
- reverberation is a complex physical phenomenon that can be modeled in a variety of ways
- research problems related to reverberation:
  - artificial reverberation synthesis
  - reverberation control/enhancement
  - dereverberation
- dereverberation is still a challenging problem!
  - Method 1: beamforming
  - Method 2: speech enhancement
  - Method 3: blind system identification & inversion
