Digital Audio Signal Processing Lecture 6: Reverberation & Dereverberation Toon van Waterschoot / Marc Moonen Dept. E.E./ESAT-STADIUS, KU Leuven [email protected] .be [email protected]

Upload: marshall-lyons

Post on 23-Dec-2015

TRANSCRIPT

  • Slide 1
  • Slide 2
  • Digital Audio Signal Processing Lecture 6: Reverberation & Dereverberation Toon van Waterschoot / Marc Moonen Dept. E.E./ESAT-STADIUS, KU Leuven [email protected] [email protected]
  • Slide 3
  • Outline: Introduction (problem statement, application scenarios); Room acoustics; Dereverberation (Method 1: Beamforming; Method 2: Speech enhancement; Method 3: Blind system identification & inversion); Conclusion & open issues
  • Slide 4
  • Introduction: Problem statement. Clean sound → room acoustics → reverberant sound. Desired effect, music example: [clean] [reverberant]. Undesired effect, speech example: [clean] [reverberant] [very reverberant]. Reverberation has a desired or undesired impact on sound quality and speech intelligibility. Research problems: artificial reverberation synthesis; reverberation control/enhancement; dereverberation.
  • Slide 5
  • Introduction: Application scenarios. Scenario 1: Sound reproduction. Goal: sound control in the acoustic environment (improved listening comfort/experience for the audience). Strategy: preprocessing, for single-point → multiple-point → area control (increasingly difficult). Applications: public address, home/automotive audio systems. Note: in a sound reproduction scenario, dereverberation is often referred to as equalization.
  • Slide 6
  • Introduction: Application scenarios. Scenario 2: Sound acquisition. Goal: sound control in the electric environment (improved sound quality of microphone recordings). Strategy: postprocessing, single-microphone → multi-microphone. Applications: speech recognition, hearing aids, recording. Note: in contrast to the AEC/AFC problems, the (de)reverberation problem is not related to the concurrent use of loudspeakers and microphones in the same acoustic environment.
  • Slide 7
  • Outline: Introduction; Room acoustics; Dereverberation (Method 1: Beamforming; Method 2: Speech enhancement; Method 3: Blind system identification & inversion); Conclusion & open issues
  • Slide 8
  • Room acoustics: Overview. Acoustic waves. Key characteristics. Non-parametric models: finite difference method, finite/boundary element method, image source method, ray tracing method. Parametric models: (digital waveguide mesh), impulse response, room transfer function, pole-zero model.
  • Slide 9
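  • Room acoustics: Acoustic waves. Acoustic wave equation: a valid sound field always satisfies $\nabla^2 p(\mathbf{r},t) - \frac{1}{c^2}\frac{\partial^2 p(\mathbf{r},t)}{\partial t^2} = 0$, where $p(\mathbf{r},t)$ is the sound pressure (a function of space and time), $c$ is the speed of sound, and $\nabla^2$ is the Laplacian operator (Cartesian coordinates), subject to boundary conditions. Examples: rigid wall, $\partial p / \partial n = 0$ on the wall; single point source.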
  • Slide 10
  • Room acoustics: Acoustic waves. Acoustic wave equation → Helmholtz equation $\nabla^2 P(\mathbf{r},\omega) + k^2 P(\mathbf{r},\omega) = 0$, obtained from the acoustic wave equation by applying a Fourier transform over the time variable; (*) $k = \omega/c$ is the wave number. The sound field is composed as a sum of room modes. Example: 2-D room, 6 x 10 m, rigid walls: mode 1: 17.1 Hz = 0.5*(343 m/s)/(10 m); mode 2: 28.5 Hz = 0.5*(343 m/s)/(6 m); mode 3 (1&2): 33.3 Hz = sqrt((17.1)^2+(28.5)^2); mode 4: 34.3 Hz = (343 m/s)/(10 m); mode 5 (2&4): 44.6 Hz = sqrt((28.5)^2+(34.3)^2).
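The mode frequencies in the 2-D example can be checked with a short sketch. It assumes the standard rigid-wall formula f = (c/2)·sqrt((nx/Lx)² + (ny/Ly)²); the function name `mode_frequency` and the mode index pairs are illustrative and do not come from the slides.

```python
import math

def mode_frequency(nx, ny, lx, ly, c=343.0):
    """Resonance frequency (Hz) of mode (nx, ny) in a 2-D rigid-wall rectangular room."""
    return 0.5 * c * math.sqrt((nx / lx) ** 2 + (ny / ly) ** 2)

LX, LY = 10.0, 6.0  # room dimensions in metres, as in the slide example

# (nx, ny) index pairs assumed to correspond to modes 1..5 of the slide
modes = [(1, 0), (0, 1), (1, 1), (2, 0), (2, 1)]
freqs = {m: mode_frequency(m[0], m[1], LX, LY) for m in modes}

for m in sorted(freqs, key=freqs.get):
    print(m, round(freqs[m], 2), "Hz")
```

Mode (2, 1) combines the axial modes at 28.5 Hz and 34.3 Hz, which is why its frequency is the root sum of squares of those two values.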
  • Slide 11
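  • Room acoustics: Key characteristics. Reverberation time (Sabine's formula): $T_{60} = 0.161\, V / (\bar{\alpha} S)$ (SI units), with $V$ the room volume, $S$ the total surface area of the room, and $\bar{\alpha}$ the average absorption coefficient of the surfaces (*); it is the time needed for a 60 dB decay of the squared sound pressure. Critical distance: $d_c = \sqrt{Q R / (16\pi)}$, with $Q$ the source directivity and $R$ the room constant; it is the distance at which direct = reverberant sound energy. Direct-to-reverberant ratio (DRR): ratio of direct vs. reverberant sound energy, a function of the source-observer distance $d$. (*) $0 \le \bar{\alpha} \le 1$: 0 for a rigid wall (mirror), 1 for an open window.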
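A minimal numerical sketch of these three quantities follows. The formulas are the common textbook forms and are assumptions here, since the slide equations did not survive in the transcript: Sabine's T60 = 0.161·V/(ᾱS), room constant R = ᾱS/(1−ᾱ), critical distance d_c = sqrt(QR/(16π)), and DRR(d) = QR/(16πd²). The 10 x 6 x 3 m room and ᾱ = 0.2 are made-up example values.

```python
import math

def sabine_rt60(volume, surface, alpha):
    """Sabine reverberation time in seconds (SI units)."""
    return 0.161 * volume / (alpha * surface)

def room_constant(surface, alpha):
    """Room constant R in m^2 (assumed definition: alpha*S / (1 - alpha))."""
    return alpha * surface / (1.0 - alpha)

def critical_distance(q, r):
    """Distance (m) at which direct and reverberant energy are equal."""
    return math.sqrt(q * r / (16.0 * math.pi))

def drr_db(d, q, r):
    """Direct-to-reverberant ratio in dB at source-observer distance d."""
    return 10.0 * math.log10(q * r / (16.0 * math.pi * d * d))

# Hypothetical shoe-box room: 10 m x 6 m x 3 m, average absorption 0.2
V = 10.0 * 6.0 * 3.0
S = 2.0 * (10.0 * 6.0 + 10.0 * 3.0 + 6.0 * 3.0)
ALPHA, Q = 0.2, 1.0  # Q = 1: omni-directional source

t60 = sabine_rt60(V, S, ALPHA)
R = room_constant(S, ALPHA)
dc = critical_distance(Q, R)
print(f"T60 = {t60:.2f} s, d_c = {dc:.2f} m, DRR at 2*d_c = {drr_db(2 * dc, Q, R):.1f} dB")
```

Under these assumptions the DRR is 0 dB at the critical distance and drops by about 6 dB per doubling of the distance.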
  • Slide 12
  • Room acoustics: Non-parametric models (1). Finite difference time domain (FDTD) method: spatio-temporal sampling on a regular grid; the partial derivatives (spatial & temporal) in the wave equation are approximated by finite difference operators, which yields an FDTD wave equation with boundary conditions.
  • Slide 13
  • Room acoustics: Non-parametric models (2). Finite element method (FEM): 4-step procedure to discretize the boundary value problem: (1) weak formulation of the boundary value problem; (2) integration by parts to relax differentiability requirements; (3) subspace approximation of field and source functions; (4) enforce orthogonality of the approximation error to the subspace. The subspace approximation relies on FEM basis functions that are defined on an arbitrarily constructed tetrahedral mesh and have small spatial support; this yields the FEM wave equation. Boundary element method (BEM): numerical approximation of Green's function. (Skip this part.)
  • Slide 14
  • Room acoustics: Non-parametric models (3). Ray tracing method: sound waves are represented by rays; specular reflections are assumed (no diffraction), i.e. mirror-like reflections in which a ray from a single incoming direction is reflected into a single outgoing direction; rays can be traced from the sound source to the observer.
  • Slide 15
  • Room acoustics: Non-parametric models (4). Image source method: reflections are modeled as direct rays from an image source; image sources = virtual sources located outside the room; multiple reflections are modeled as high-order image sources.
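The image source idea can be sketched for first-order reflections in a shoe-box room. This is only an illustration under stated assumptions: the room spans [0, Lx] x [0, Ly] x [0, Lz], every wall has the same frequency-independent reflection coefficient `beta` (a hypothetical value), and the amplitude follows free-field 1/(4πr) spreading. The function names and positions are not from the slides.

```python
import math

def first_order_images(src, room):
    """Mirror the source across each of the six walls of the shoe-box room."""
    images = []
    for axis in range(3):
        for wall in (0.0, room[axis]):
            img = list(src)
            img[axis] = 2.0 * wall - src[axis]  # reflection across the plane x_axis = wall
            images.append(tuple(img))
    return images

def arrivals(src, mic, room, c=343.0, beta=0.9):
    """Return (delay in s, amplitude) pairs for the direct path and first-order reflections."""
    paths = [(src, 1.0)] + [(img, beta) for img in first_order_images(src, room)]
    result = []
    for position, gain in paths:
        dist = math.dist(position, mic)
        result.append((dist / c, gain / (4.0 * math.pi * dist)))
    return sorted(result)

ROOM = (10.0, 6.0, 3.0)   # hypothetical room dimensions in metres
SRC = (2.0, 3.0, 1.5)
MIC = (8.0, 3.0, 1.5)
taps = arrivals(SRC, MIC, ROOM)
for delay, amp in taps:
    print(f"{1000 * delay:6.2f} ms  amplitude {amp:.4f}")
```

Higher-order reflections would be obtained by mirroring the image sources again, which is where the number of sources grows quickly.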
  • Slide 16
  • Room acoustics: Parametric models (1). Impulse response: room response to a gunshot source (impulse function); conceptually simple model, straightforward interpretation; poor modeling efficiency (~10^3 parameters), high spatial variation. Parts of the impulse response: direct coupling, early reflections, diffuse sound field.
  • Slide 17
  • Room acoustics: Parametric models (2). Room transfer function (RTF). Assumptions: shoe-box shaped room / rigid walls. Assumed modes solution of the Helmholtz equation: a sum over the set of (non-negligible) room modes, in which the m-th mode is characterized by its resonance frequency, damping factor, eigenfunction, and normalization constant.
  • Slide 18
  • Room acoustics: Parametric models (3). Pole-zero model: the RTF suggests the use of a pole-zero model; the RTF denominator is independent of the source/observer positions. Model components: gain factor, minimum-phase zeros, non-minimum-phase zeros, common acoustical poles. Special cases: all-zero model = impulse response; all-pole model: represents room resonances only.
  • Slide 19
  • Outline: Introduction; Room acoustics; Dereverberation (problem statement; overview of dereverberation methods; Method 1: Beamforming; Method 2: Speech enhancement; Method 3: Blind system identification & inversion); Conclusion & open issues
  • Slide 20
  • Dereverberation: problem & overview. PS: measurement noise is not considered. Reverberation as an additive signal degradation: Method 1, beamforming approach to dereverberation (spatial separation of clean and reverberant sound); Method 2, speech enhancement approach to dereverberation (transform-domain separation of clean and reverberant sound). Reverberation as a convolutive signal degradation: Method 3, blind system identification and inversion approach to dereverberation (deconvolution of reverberant sound).
  • Slide 21
  • Outline: Introduction; Room acoustics; Dereverberation (Method 1: Beamforming, with fixed beamforming and adaptive beamforming; Method 2: Speech enhancement; Method 3: Blind system identification & inversion); Conclusion & open issues
  • Slide 22
  • Method 1: Introduction. Concept: spatial separation of direct and reverberant sound (cf. multi-microphone noise reduction). Difficulties compared to noise reduction: spatial separation of direct sound and room reflections requires knowledge of the reflection DOAs (~ room acoustics model); reverberant sound is diffuse (comes from "all possible" directions, including the source direction). Two distinct approaches: fixed delay-and-sum beamformer (DSB); adaptive filter-and-sum beamformer (FSB).
  • Slide 23
  • Method 1: Fixed DSB. Fixed DSB structure (cf. Topic-2, Lecture-2). Fixed DSB = matched filter (maximizing WNG) in the case of: spatially white noise (not entirely true for reverberation!); known sound source position; ideal omni-directional microphones.
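A delay-and-sum beamformer can be sketched in a few lines. The sketch assumes integer sample delays that are known exactly, circular shifts via `np.roll` for brevity, and spatially white sensor noise, which is the idealized case named on the slide and, as noted there, not entirely true for reverberation. All signal values are synthetic.

```python
import numpy as np

def delay_and_sum(mics, delays):
    """Undo the known integer propagation delays, then average the channels."""
    aligned = [np.roll(x, -d) for x, d in zip(mics, delays)]
    return np.mean(aligned, axis=0)

rng = np.random.default_rng(0)
s = rng.standard_normal(4096)     # synthetic source signal
delays = [0, 3, 7, 12]            # assumed known source-to-microphone delays (samples)

# Each microphone observes a delayed source plus independent (spatially white) noise
mics = [np.roll(s, d) + rng.standard_normal(s.size) for d in delays]
y = delay_and_sum(mics, delays)

snr_in = 10 * np.log10(np.var(s) / np.var(mics[0] - s))
snr_out = 10 * np.log10(np.var(s) / np.var(y - s))
print(f"SNR improvement: {snr_out - snr_in:.1f} dB (white noise gain for 4 microphones)")
```

With M microphones and uncorrelated noise, the expected improvement is about 10·log10(M) dB; a diffuse reverberant field is partly correlated between microphones, so the gain against reverberation is smaller.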
  • Slide 24
  • Method 1: Fixed DSB. Expected DRR improvement of the fixed DSB: expressed in terms of the source to m-th microphone distance, the wave number, and the m-th microphone position vector; computed using statistical room acoustics (SRA) (with the assumption that the direct & (diffuse) reverberant components are uncorrelated, etc.); depends on source-array distance + microphone separation; independent of reverberation time (!) (cf. improvement of DRR).
  • Slide 25
  • Method 1: Adaptive FSB. Adaptive FSB structure (cf. Topic-2, Lecture-2). The optimal solution (matched filter) depends on the room model: ~ blind system identification & inversion (cf. below).
  • Slide 26
  • Outline: Introduction; Room acoustics; Dereverberation (Method 1: Beamforming; Method 2: Speech enhancement, with cepstrum-based, LPC-based, and spectrum-based approaches; Method 3: Blind system identification & inversion); Conclusion & open issues
  • Slide 27
  • Method 2: Introduction. Concept: enhancement of reverberant speech by modeling & reducing reverberant sound in a transform domain; applicable to single- & multi-microphone sound acquisition. The choice of transform domain results in three approaches: cepstrum-based, LPC-based, spectrum-based.
  • Slide 28
  • Method 2: Cepstrum-based. Concept: convolution in the time domain ~ addition in the cepstral (*) domain, so reverberation can be subtracted in the cepstral domain. Cepstral subtraction: speech = low-quefrency, room acoustics = high-quefrency. Processing chain: cepstral analysis → cepstral subtraction → cepstral synthesis. (*) Use the complex cepstrum (= invertible).
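The "convolution becomes addition" property can be verified numerically. For brevity this sketch uses the real cepstrum (inverse FFT of the log magnitude spectrum) and circular convolution; both are simplifying assumptions. The slide recommends the complex cepstrum, which also keeps the phase and is therefore invertible. The three-tap "room" response is synthetic.

```python
import numpy as np

def real_cepstrum(x):
    """Real cepstrum: inverse FFT of the log magnitude spectrum."""
    spectrum = np.fft.fft(x)
    return np.fft.ifft(np.log(np.abs(spectrum))).real

rng = np.random.default_rng(1)
n = 1024
source = rng.standard_normal(n)   # stand-in for a clean signal

room = np.zeros(n)                # toy room response: direct path plus two echoes
room[[0, 40, 90]] = [1.0, 0.5, 0.25]

# Circular convolution implemented as a product of FFTs
reverberant = np.fft.ifft(np.fft.fft(source) * np.fft.fft(room)).real

c_x = real_cepstrum(reverberant)
c_s = real_cepstrum(source)
c_h = real_cepstrum(room)
print("max deviation from additivity:", np.max(np.abs(c_x - (c_s + c_h))))
```

Because the cepstrum of the reverberant signal is the sum of the source and room cepstra, removing the room part amounts to a subtraction, provided the two occupy different quefrency ranges.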
  • Slide 29
  • Method 2: LPC-based. Linear predictive coding (LPC) of reverberant speech: reverberation hardly affects the speech LPC coefficients; reverberation largely affects the LPC residual; dereverberation thus reduces to LPC residual enhancement, based on knowledge of the speech production process + spatial averaging (using multiple microphones). Processing chain: LPC analysis → LPC residual enhancement → LPC synthesis (reusing the LPC coefficients).
  • Slide 30
  • Method 2: Spectrum-based. Concept: late reverberation ~ (broadband) additive noise. Spectral subtraction: estimate the noise energy & compute a subtractive gain function. Spectral subtraction assumes noise stationarity (cf. Lecture-3), which is not valid for reverberation! The "noise" energy is therefore estimated based on a statistical model for late reverberation. Processing chain: TF analysis → spectral subtraction (driven by a late reverberation energy estimator) → TF synthesis. Note: straightforwardly extendable to combined dereverberation & noise suppression.
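One way to picture the late reverberation energy estimator is the sketch below. It assumes an exponentially decaying statistical model, in which the late reverberant power in a frame equals the observed power `late_delay` seconds earlier, attenuated by exp(−2·δ·late_delay) with δ = 3·ln(10)/T60. This model choice, the function name `late_reverb_gain`, and all parameter values are assumptions for illustration; the slide does not specify the estimator.

```python
import numpy as np

def late_reverb_gain(power, t60, frame_shift, late_delay, floor=0.1):
    """Spectral subtraction gain per (frame, bin) from short-time power values."""
    delta = 3.0 * np.log(10.0) / t60                    # decay rate of the assumed model
    n_late = max(1, int(round(late_delay / frame_shift)))
    late = np.zeros_like(power)
    # Late reverberant power: delayed and attenuated copy of the observed power
    late[n_late:] = np.exp(-2.0 * delta * n_late * frame_shift) * power[:-n_late]
    gain = 1.0 - late / np.maximum(power, 1e-12)        # power spectral subtraction
    return np.maximum(gain, floor)                      # gain floor limits artifacts

# Toy input: power that decays exactly with the assumed T60 (pure reverberant tail)
T60, SHIFT, DELAY = 0.5, 0.016, 0.048
frames = np.arange(50)[:, None]
decay = np.exp(-2.0 * (3.0 * np.log(10.0) / T60) * frames * SHIFT)
power = np.repeat(decay, 4, axis=1)                     # 50 frames x 4 frequency bins

gain = late_reverb_gain(power, T60, SHIFT, DELAY)
print(gain[:5, 0])
```

For a pure reverberant tail the estimator explains all of the observed power, so the gain falls to the floor once `late_delay` has elapsed; frames with fresh direct sound keep a gain close to 1.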
  • Slide 31
  • Outline: Introduction; Room acoustics; Dereverberation (Method 1: Beamforming; Method 2: Speech enhancement; Method 3: Blind system identification & inversion, with all-zero model identification & inversion and all-pole model identification & inversion); Conclusion & open issues
  • Slide 32
  • Method 3: Introduction. Concept: two-step procedure. Step 1: identify the room model (source → multiple microphones). Step 2: invert the room model. Highly non-trivial difficulties: the source signal is unknown → blind identification; (non-)invertibility of the room model; model inversion is sensitive to identification & numerical errors. Two approaches based on different room models: all-zero model; all-pole model.
  • Slide 33
  • Method 3: Blind system identification. Starting point: cross-relation error / nullifying filters. Batch identification using EVD/SVD: the vector of stacked & filtered RIRs lies in the null space of the microphone array covariance matrix; the filters denote erroneous zeros (which can be removed). Difficulties: zeros common to all RIRs cannot be identified; high & unknown RIR order; poor conditioning.
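The cross-relation idea can be illustrated for two microphones. Since x1 = s*h1 and x2 = s*h2, it holds that x1*h2 − x2*h1 = 0, so the stacked vector [h2; h1] lies in the null space of a matrix built from the microphone signals alone. The sketch assumes a noise-free setting, an exactly known RIR length, and RIRs without common zeros; the short filters `h1` and `h2` are made-up examples, not measured responses.

```python
import numpy as np

def conv_matrix(x, n_taps):
    """Matrix C such that C @ h equals np.convolve(x, h) for len(h) == n_taps."""
    c = np.zeros((len(x) + n_taps - 1, n_taps))
    for j in range(n_taps):
        c[j:j + len(x), j] = x
    return c

h1 = np.array([1.0, 0.5, -0.3, 0.1])   # hypothetical RIR, microphone 1
h2 = np.array([1.0, -0.4, 0.2, 0.3])   # hypothetical RIR, microphone 2

rng = np.random.default_rng(2)
s = rng.standard_normal(200)           # unknown source (used only to simulate the microphones)
x1, x2 = np.convolve(s, h1), np.convolve(s, h2)

L = len(h1)                            # RIR length assumed known
A = np.hstack([conv_matrix(x1, L), -conv_matrix(x2, L)])   # A @ [h2; h1] = 0

# Right singular vector of the smallest singular value spans the null space
_, sing, vt = np.linalg.svd(A, full_matrices=False)
estimate = vt[-1]

truth = np.concatenate([h2, h1])
truth /= np.linalg.norm(truth)
match = abs(estimate @ truth)          # 1.0 means identical up to sign and scale
print("smallest singular value:", sing[-1], " match:", match)
```

The RIRs are recovered only up to a common scale factor, and the null space grows beyond one dimension when the RIR length is overestimated or the RIRs share zeros, which is where the difficulties listed on the slide arise.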
  • Slide 34
  • PS: vector of stacked & filtered RIRs lies in null space of microphone array covariance matrix
  • Slide 35
  • Method 3: Blind system identification. PS: zeros C(z) that are common to all RIRs cannot be identified, because the common factor C(z) cannot be separated from the source S(z).
  • Slide 36
  • Method 3: Inversion. Multiple-input/output inverse theorem (MINT): an exact solution exists if the RIRs have no common zeros (and the inverse filters are sufficiently long); poor conditioning for near-common zeros. Inversion is sensitive to system identification errors.
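A two-channel MINT inversion can be sketched as a linear system: find inverse filters g1 and g2 such that h1*g1 + h2*g2 equals a unit impulse. The sketch assumes perfectly known RIRs without common zeros and inverse filters of length L − 1, which makes the system square for two channels. The RIRs are the same kind of made-up short filters as in a toy example and do not come from the slides.

```python
import numpy as np

def conv_matrix(x, n_taps):
    """Matrix C such that C @ g equals np.convolve(x, g) for len(g) == n_taps."""
    c = np.zeros((len(x) + n_taps - 1, n_taps))
    for j in range(n_taps):
        c[j:j + len(x), j] = x
    return c

h1 = np.array([1.0, 0.5, -0.3, 0.1])   # hypothetical RIR, microphone 1
h2 = np.array([1.0, -0.4, 0.2, 0.3])   # hypothetical RIR, microphone 2

L = len(h1)
Lg = L - 1                             # inverse filter length for two channels

# Stack the convolution matrices: H @ [g1; g2] = h1*g1 + h2*g2
H = np.hstack([conv_matrix(h1, Lg), conv_matrix(h2, Lg)])
d = np.zeros(L + Lg - 1)
d[0] = 1.0                             # target: unit impulse, no delay

g, *_ = np.linalg.lstsq(H, d, rcond=None)
g1, g2 = g[:Lg], g[Lg:]

equalized = np.convolve(h1, g1) + np.convolve(h2, g2)
print(np.round(equalized, 6))
print("condition number of H:", np.linalg.cond(H))
```

The condition number printed at the end grows when the two RIRs have nearly common zeros, which is the poor conditioning mentioned on the slide; small errors in the identified RIRs are then strongly amplified by the inverse filters.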
  • Slide 39
  • Method 3: Inversion. Matched filtering: the matched filters can be interpreted as multiple-beam beamformers, having beams in the direction of the direct sound and the 1st-order reflections (note that the overall response has a peak at time = 0, corresponding to a constructive addition of all multi-path components). Matched filter = non-causal filter → pre-echo effect (can be alleviated by filter truncation).
  • Slide 40
  • Method 3: All-pole model. Starting point: all-pole model with common acoustical poles. A priori identification of the all-pole model: multi-channel LPC of estimated RIRs; spatial averaging of single-channel LPC coefficients. Model inversion → fixed FIR filter (!)
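The remark that inverting an all-pole model gives a fixed FIR filter can be checked directly: if the room is modeled as 1/A(z), the inverse is A(z) itself. The sketch below assumes zero initial conditions and a made-up stable second-order denominator `a`; the function names are illustrative.

```python
def all_pole_filter(x, a):
    """Apply 1/A(z): y[n] = x[n] - a[1]*y[n-1] - a[2]*y[n-2] - ... (a[0] assumed to be 1)."""
    y = []
    for n, xn in enumerate(x):
        acc = xn
        for k in range(1, len(a)):
            if n - k >= 0:
                acc -= a[k] * y[n - k]
        y.append(acc)
    return y

def fir_filter(x, b):
    """Apply the FIR filter B(z): y[n] = sum_k b[k]*x[n-k]."""
    return [sum(b[k] * x[n - k] for k in range(len(b)) if n - k >= 0)
            for n in range(len(x))]

a = [1.0, -1.2, 0.72]                  # hypothetical common acoustical poles at 0.6 +/- 0.6j
source = [1.0, 0.0, -0.5, 0.25, 0.0, 0.0, 0.3, 0.0]

reverberant = all_pole_filter(source, a)   # all-pole room model
recovered = fir_filter(reverberant, a)     # inverse = FIR filter with the same coefficients
print([round(v, 6) for v in recovered])
```

Because the poles are common to all source and observer positions, the same FIR filter applies wherever the source and microphone are placed, which is what makes this inverse "fixed".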
  • Slide 41
  • Conclusion. Reverberation is a complex physical phenomenon that can be modeled in a variety of ways. Research problems related to reverberation: artificial reverberation synthesis; reverberation control/enhancement; dereverberation. Dereverberation is still a challenging problem! Method 1: beamforming; Method 2: speech enhancement; Method 3: blind system identification & inversion.