Clustered Blind Beamforming From Ad-Hoc Microphone Arrays
I. Himawan, I. McCowan and S. Sridharan
Presented by Yaron Doweck
Digital Speech Processing in Noisy Environments - 049035, Spring 2012
Technion - Israel Institute of Technology
Outline
1 Background
2 Ad-Hoc Arrays
3 Clustering
4 Clustered Beamforming
5 Experimental Results
6 Proposed Extension
Background: Beamformer
- Beamforming is an effective method of spatial filtering, differentiating desired signals from noise and interference based on their locations.
- Spatial filtering is possible because the source and interference signals reach each sensor at different times.
- This difference is referred to as the Time Difference of Arrival (TDOA).
Background: Delay-and-Sum Beamformer
- Each sensor signal is delayed according to the source's location.
- The delayed sensor signals are averaged.
- Mathematically:
In the time domain:
$$y(t) = \frac{1}{N} \sum_{i=1}^{N} x_i(t + \tau_i),$$
where $x_i(t)$ is the input signal from sensor $i$.
In the frequency domain:
$$y(f) = \sum_{i=1}^{N} w_i^*(f)\, x_i(f), \qquad w_i(f) = \frac{1}{N} e^{-j 2\pi f \tau_i}.$$
In matrix form:
$$y(f) = w_d(f)^H X(f),$$
where $X(f) = [x_1(f), \ldots, x_N(f)]^T$ and $w_d(f) = \frac{1}{N}\left[e^{-j 2\pi f \tau_1}, \ldots, e^{-j 2\pi f \tau_N}\right]^T$.
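The frequency-domain form can be checked numerically at a single frequency. The following is a minimal sketch, not code from the presentation: the function names, the 1 kHz test frequency, and the TDOA values are all illustrative assumptions. It models each sensor as receiving $x_i(t) = s(t - \tau_i)$, so steering with the correct delays should recover the source exactly.

```python
import cmath
import math

def das_weights(f, taus):
    """Delay-and-sum weights w_i(f) = (1/N) * exp(-j 2 pi f tau_i)."""
    n = len(taus)
    return [cmath.exp(-2j * math.pi * f * tau) / n for tau in taus]

def apply_weights(w, x):
    """Beamformer output y(f) = w(f)^H X(f) = sum_i conj(w_i) * x_i."""
    return sum(wi.conjugate() * xi for wi, xi in zip(w, x))

# Hypothetical scenario: a unit-amplitude 1 kHz component reaching three sensors.
f = 1000.0
taus = [0.0, 0.25e-3, 0.6e-3]      # assumed TDOAs in seconds
s = 1.0 + 0.0j                     # source spectrum value at frequency f
# Sensor i observes s(t - tau_i), i.e. s(f) * exp(-j 2 pi f tau_i).
x = [s * cmath.exp(-2j * math.pi * f * tau) for tau in taus]

y_aligned = apply_weights(das_weights(f, taus), x)           # correct delays
y_unsteered = apply_weights(das_weights(f, [0.0] * 3), x)    # no delay compensation
```

With the correct delays the phases cancel and `y_aligned` equals the source value; without compensation the three phasors partially cancel and `y_unsteered` has a smaller magnitude.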
Background: MVDR Beamformer
- Minimum Variance Distortionless Response.
- Minimizes the power at the array output subject to an undistorted signal response.
- The optimal weight vector is given by:
$$w_{MVDR}(f) = \frac{\Gamma_{xx}^{-1}\, w_d(f)}{w_d(f)^H\, \Gamma_{xx}^{-1}\, w_d(f)}$$
where
$$(\Gamma_{xx}(f))_{ij} = \frac{\Phi_{x_i x_j}(f)}{\sqrt{\Phi_{x_i x_i}(f)\, \Phi_{x_j x_j}(f)}},$$
and $\Phi_{x_i x_j}(f)$ is the cross-spectral density, estimated by exponential averaging:
$$\Phi^{t}_{x_i x_j}(f) = \alpha\, \Phi^{t-1}_{x_i x_j}(f) + (1 - \alpha)\, x_i(f)\, x_j^*(f)$$
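The three formulas above chain together naturally: update the cross-spectral matrix, normalize it to a coherence matrix, then solve for the weights. The sketch below assumes NumPy and a single frequency bin. The function names, the value of `alpha`, the random test data, and the small diagonal loading term (added only for numerical stability) are assumptions, not part of the slides.

```python
import numpy as np

def update_csd(phi_prev, x, alpha=0.9):
    """Recursive estimate Phi^t = alpha * Phi^(t-1) + (1 - alpha) * x x^H."""
    return alpha * phi_prev + (1.0 - alpha) * np.outer(x, x.conj())

def coherence_matrix(phi):
    """Gamma_ij = Phi_ij / sqrt(Phi_ii * Phi_jj)."""
    d = np.sqrt(np.real(np.diag(phi)))
    return phi / np.outer(d, d)

def mvdr_weights(gamma, w_d, loading=1e-3):
    """w = Gamma^-1 w_d / (w_d^H Gamma^-1 w_d); diagonal loading is an assumption."""
    g = gamma + loading * np.eye(gamma.shape[0])
    num = np.linalg.solve(g, w_d)          # Gamma^-1 w_d without an explicit inverse
    return num / (w_d.conj() @ num)

# Hypothetical data: 4 microphones, 200 random snapshots of one frequency bin.
rng = np.random.default_rng(0)
n_mics = 4
phi = np.eye(n_mics, dtype=complex)
for _ in range(200):
    x = rng.standard_normal(n_mics) + 1j * rng.standard_normal(n_mics)
    phi = update_csd(phi, x)

f = 1000.0
taus = np.array([0.0, 1e-4, 2e-4, 3e-4])   # assumed TDOAs in seconds
w_d = np.exp(-2j * np.pi * f * taus) / n_mics
w = mvdr_weights(coherence_matrix(phi), w_d)
```

By construction the result satisfies the constraint $w^H w_d = 1$ for the vector passed in, which is the normalization implied by the formula above.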
Ad-Hoc Arrays
- For most beamforming methods, the calculation of the weight vector depends on the sensor geometry and the source location.
- In ad-hoc situations, however, when the microphones have not been systematically positioned, this information is not available and beamforming must be achieved blindly.
Ad-Hoc Arrays
There are two general approaches to blindly estimating the weight vector for beamforming:
1 Direct estimation of the TDOA without regard to the microphone and source locations.
2 Determining the unknown microphone positions through array calibration methods.
In this work, the first approach is examined.
Clustering: Motivation
The TDOA corresponds to the peak in the generalized cross correlation (GCC) function [Knapp and Carter(1976)]. A PHAT-weighted cross correlation between microphones was used to estimate the TDOA:
$$TDOA_{ij} = \arg\max_{\tau} \left( \hat{R}^{ij}_{GCC-PHAT}(\tau) \right)$$
where
$$\hat{R}^{ij}_{GCC-PHAT}(\tau) \triangleq IDFT\left( \frac{x_i(f)\, x_j^*(f)}{\left| x_i(f)\, x_j^*(f) \right|} \right)$$
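The GCC-PHAT estimate can be sketched in a few lines with NumPy's FFT routines. This is an illustrative sketch under assumptions: the function name, the zero-padding length, the small constant guarding against division by zero, and the `max_lag` search window are not from the slides. With this sign convention a positive result means that signal `x_i` lags `x_j`.

```python
import numpy as np

def gcc_phat_tdoa(x_i, x_j, fs, max_lag=None):
    """Estimate the TDOA (seconds) as the peak of the PHAT-weighted cross correlation."""
    n = len(x_i) + len(x_j)                    # zero-pad to avoid circular wrap-around
    X_i = np.fft.rfft(x_i, n=n)
    X_j = np.fft.rfft(x_j, n=n)
    cross = X_i * np.conj(X_j)
    cross /= np.abs(cross) + 1e-12             # PHAT weighting: keep the phase only
    r = np.fft.irfft(cross, n=n)
    max_lag = n // 2 if max_lag is None else min(max_lag, n // 2)
    # Reorder so that index 0 corresponds to lag -max_lag and index max_lag to lag 0.
    r = np.concatenate((r[-max_lag:], r[:max_lag + 1]))
    return (np.argmax(r) - max_lag) / fs

# Hypothetical test signal: white noise, with x_i a copy of x_j delayed by 5 samples.
rng = np.random.default_rng(0)
fs = 16000
s = rng.standard_normal(1024)
delay = 5
x_j = s
x_i = np.concatenate((np.zeros(delay), s[:-delay]))
tdoa = gcc_phat_tdoa(x_i, x_j, fs, max_lag=50)
```

Restricting the search to a plausible lag window is a common practical choice, since spurious peaks at large lags are one source of the TDOA errors discussed for widely spaced microphones.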
Clustering: Motivation
However, large inter-microphone spacing may lead to erroneous TDOA computation, effectively causing delay inaccuracies in the steering vector of the beamformer.
Figure: Box plot of TDOA values relative to microphone 1. Microphones 1-4 are closely located (within 20 cm). The others are about 2 m from microphone 1.
Clustering: Inter-Microphone Proximity Measure
Microphone proximity is measured using the magnitude squared coherence (MSC), defined as follows:
$$C_{ij}(f) = \frac{\left| \Phi_{x_i x_j}(f) \right|^2}{\Phi_{x_i}(f)\, \Phi_{x_j}(f)}$$
The proximity measure is calculated by integrating the MSC across frequencies:
$$T^{ij}_{MSC} = \sum_{f=0}^{f_{max}} C_{ij}(f)$$
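A practical point the formula leaves implicit is that the spectral densities must be averaged over several frames: with a single frame the MSC is identically 1. The sketch below uses plain frame averaging with a Hann window, which is an assumption (the slides estimate spectra by exponential averaging); the function name, frame length, and hop size are also illustrative.

```python
import numpy as np

def msc_proximity(x_i, x_j, frame_len=256, hop=128):
    """Sum over frequency bins of the frame-averaged magnitude squared coherence."""
    win = np.hanning(frame_len)
    starts = range(0, min(len(x_i), len(x_j)) - frame_len + 1, hop)
    X_i = np.array([np.fft.rfft(win * x_i[s:s + frame_len]) for s in starts])
    X_j = np.array([np.fft.rfft(win * x_j[s:s + frame_len]) for s in starts])
    phi_ij = np.mean(X_i * np.conj(X_j), axis=0)    # cross-spectral density
    phi_ii = np.mean(np.abs(X_i) ** 2, axis=0)      # auto-spectral densities
    phi_jj = np.mean(np.abs(X_j) ** 2, axis=0)
    msc = np.abs(phi_ij) ** 2 / (phi_ii * phi_jj + 1e-12)
    return float(np.sum(msc))

# Hypothetical signals: identical noise (fully coherent) versus independent noise.
rng = np.random.default_rng(0)
a = rng.standard_normal(4096)
b = rng.standard_normal(4096)
t_same = msc_proximity(a, a)    # close to the number of bins, frame_len // 2 + 1
t_diff = msc_proximity(a, b)    # much smaller for independent signals
```

Closely spaced microphones in a diffuse noise field behave more like the first case, and distant ones more like the second, which is what makes the summed MSC usable as a proximity measure.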
Clustering: Clustering Methods
Two methods for forming clusters were examined:
1 Rule-Based Clustering.
2 Spectral Clustering.
Clustering: Rule-Based Clustering
- $T^{ij}_{MSC}$ is compared to a threshold.
- Pairs that satisfy the criterion are grouped in the same cluster.
- Single-element clusters are merged into the nearest cluster.
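The three rules can be sketched as a connected-components grouping followed by a merge step. This is one possible reading, stated under assumptions: a pair "satisfies the criterion" when its proximity is at or above the threshold, and the "nearest" cluster is the one containing the microphone with the highest proximity to the single element. The function name and the example proximity values are hypothetical.

```python
def rule_based_clusters(T, threshold):
    """Group microphones whose pairwise proximity T[i][j] reaches the threshold."""
    n = len(T)
    parent = list(range(n))

    def find(a):
        # Union-find root lookup with path halving.
        while parent[a] != a:
            parent[a] = parent[parent[a]]
            a = parent[a]
        return a

    for i in range(n):
        for j in range(i + 1, n):
            if T[i][j] >= threshold:
                parent[find(i)] = find(j)

    groups = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(i)
    multi = [g for g in groups.values() if len(g) > 1]
    singles = [g[0] for g in groups.values() if len(g) == 1]
    if not multi:
        return [[m] for m in singles]

    # Merge each single-element cluster into the cluster it is most coherent with.
    merged = [list(g) for g in multi]
    for m in singles:
        nearest = max(range(len(multi)),
                      key=lambda c: max(T[m][k] for k in multi[c]))
        merged[nearest].append(m)
    return [sorted(g) for g in merged]

# Hypothetical proximity matrix for six microphones.
T = [
    [0, 100, 100, 10, 10, 10],
    [100, 0, 100, 10, 10, 10],
    [100, 100, 0, 10, 10, 10],
    [10, 10, 10, 0, 90, 40],
    [10, 10, 10, 90, 0, 30],
    [10, 10, 10, 40, 30, 0],
]
clusters = rule_based_clusters(T, threshold=50)
```

Here microphone 5 passes the threshold with no one, so it starts as a single-element cluster and is merged with microphones 3 and 4, to which it is most coherent.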
Clustering: Spectral Clustering
[Shi and Malik(2000)].
The similarity matrix S is defined as
$$S_{ij} = \exp\left( -\alpha \left( \frac{T^{ij}_{MSC}}{N} - 1 \right)^2 \right)$$
Define the diagonal matrix D as
$$D_{ij} = \begin{cases} \sum_{k} S_{kj}, & i = j \\ 0, & \text{else} \end{cases}$$
Solve the generalized eigenvalue problem: $(D - S)v = \lambda D v$
Perform k-means on the eigenvector with the second smallest eigenvalue.
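A minimal two-cluster sketch of these steps is given below, taking the similarity matrix as input. It assumes NumPy and a symmetric S with positive row sums. The generalized problem is solved through the equivalent symmetric form $D^{-1/2}(D - S)D^{-1/2}u = \lambda u$ with $v = D^{-1/2}u$, and the k-means step is a small hand-written 1-D Lloyd iteration. The function name and the example matrix are hypothetical.

```python
import numpy as np

def spectral_bipartition(S, n_iter=50):
    """Split nodes into two clusters from a symmetric similarity matrix S."""
    S = np.asarray(S, dtype=float)
    d = S.sum(axis=0)                               # diagonal entries of D
    d_inv_sqrt = 1.0 / np.sqrt(d)
    # Symmetric form of (D - S) v = lambda D v:  I - D^-1/2 S D^-1/2
    L_sym = np.eye(len(d)) - d_inv_sqrt[:, None] * S * d_inv_sqrt[None, :]
    eigvals, eigvecs = np.linalg.eigh(L_sym)        # eigenvalues in ascending order
    v = d_inv_sqrt * eigvecs[:, 1]                  # second smallest eigenvalue

    # 1-D k-means (k = 2) on the eigenvector entries.
    centers = np.array([v.min(), v.max()])
    for _ in range(n_iter):
        labels = np.argmin(np.abs(v[:, None] - centers[None, :]), axis=1)
        for c in range(2):
            if np.any(labels == c):
                centers[c] = v[labels == c].mean()
    return labels

# Hypothetical similarity matrix: microphones 0-2 and 3-4 form two tight groups.
S = np.full((5, 5), 0.01)
S[:3, :3] = 1.0
S[3:, 3:] = 1.0
labels = spectral_bipartition(S)
```

The entries of the chosen eigenvector are nearly constant within each group and differ in sign between groups, which is why a simple k-means on that single vector is enough to separate them.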
Clustered Beamforming
- Following clustering of the microphones, blind beamforming may be performed for speech enhancement.
- Two methods are presented for beamforming using the clustering information:
1 Closest Cluster (CC) Beamforming
2 Weighted Cluster Combination (WCC) Beamforming
Clustered Beamforming: Closest Cluster (CC) Beamforming
- Clusters are ranked according to their estimated distance from the speaker.
1 In each cluster, the microphone closest to the speaker is selected as the reference microphone. Define $\tau_{k,ref}$ as the TDOA between cluster $k$ and some reference cluster.
2 A measure of cluster spread, $\delta_k$, is defined as the maximal TDOA between the reference microphone and the other microphones in the cluster.
3 The cluster ranking is defined as
$$D_k = \tau_{k,ref} + \delta_k - D_{min}, \qquad D_{min} = \min_k \{\tau_{k,ref} + \delta_k\}$$
- MVDR beamforming is applied blindly to the closest cluster ($D_k = 0$) using the GCC TDOA estimates.
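The ranking is simple arithmetic once the TDOAs are available. The sketch below is illustrative only: the function names and the TDOA values are hypothetical, and taking the absolute value when computing the spread is an assumption about how "maximal TDOA" is meant.

```python
def cluster_spread(tdoas_to_reference):
    """delta_k: largest TDOA magnitude between the reference mic and the others."""
    return max(abs(t) for t in tdoas_to_reference)

def rank_clusters(tau_ref, delta):
    """D_k = tau_k,ref + delta_k - D_min, so the closest cluster gets D_k = 0."""
    scores = [t + d for t, d in zip(tau_ref, delta)]
    d_min = min(scores)
    return [s - d_min for s in scores]

# Hypothetical TDOAs (seconds) for three clusters.
tau_ref = [0.0, 2.0e-3, 5.0e-3]              # cluster-to-reference-cluster TDOAs
within = [[0.1e-3, 0.4e-3], [0.1e-3], [0.3e-3, 0.2e-3]]
delta = [cluster_spread(t) for t in within]

D = rank_clusters(tau_ref, delta)
closest = D.index(0.0)                       # cluster selected for MVDR beamforming
```

In this example the first cluster has the smallest sum of delay and spread, so it receives $D_k = 0$ and would be the one passed to the MVDR beamformer.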
Clustered Beamforming: Weighted Cluster Combination (WCC) Beamforming
- MVDR beamforming is applied blindly to each cluster using the GCC TDOA estimates.
- Delay-and-sum beamforming is then applied blindly to the clusters' beams:
$$y_{out}(t) = \sum_{k=1}^{C} \alpha_k\, y_{MVDR,k}(t + \tau_k)$$
where
$$\sum_{k=1}^{C} \alpha_k = 1$$
- Automatic determination of the cluster weights for the combination is not addressed.
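The combination step is a weighted delay-and-sum over the per-cluster outputs. The sketch below works on sampled signals with integer sample delays, which is a simplifying assumption, and treats samples outside the signal as zero. The function name, the toy signals, and the equal weights are hypothetical.

```python
def combine_clusters(beams, delays, alphas):
    """y_out[n] = sum_k alpha_k * y_k[n + d_k], with out-of-range samples set to zero."""
    assert abs(sum(alphas) - 1.0) < 1e-9, "cluster weights must sum to 1"
    n = len(beams[0])
    out = [0.0] * n
    for y, d, a in zip(beams, delays, alphas):
        for t in range(n):
            if 0 <= t + d < len(y):
                out[t] += a * y[t + d]
    return out

# Hypothetical cluster outputs: the second beam is the first delayed by 2 samples.
s = [0, 1, 2, 3, 2, 1, 0, 0]
beam_0 = s
beam_1 = [0, 0] + s[:-2]
y_out = combine_clusters([beam_0, beam_1], delays=[0, 2], alphas=[0.5, 0.5])
```

After advancing the second beam by its delay, both beams line up and their weighted sum reproduces the underlying signal.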
Experimental Results: Microphone Clustering Evaluation
Figure: (A) Clustering based on true geometry, (B) rule-based clustering, (C) spectral clustering.
Experimental Results: Speech Enhancement Evaluation - Configuration
- Source: single stationary speaker
- Noise: babble noise
- Microphone signals were simulated with Habets's RIR generator [Habets(2006)]
- SNR was measured by applying the beamformer to the clean signal and to the noise-only signal separately
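Measuring the SNR this way works because the beamformer is linear: its response to the noisy mixture is the sum of its responses to the clean and noise-only inputs, so the two output powers can be measured separately. A minimal sketch of the final computation follows; the function name and the sample values are hypothetical.

```python
import math

def output_snr_db(clean_out, noise_out):
    """SNR in dB from the beamformer outputs for clean-only and noise-only inputs."""
    p_signal = sum(v * v for v in clean_out) / len(clean_out)
    p_noise = sum(v * v for v in noise_out) / len(noise_out)
    return 10.0 * math.log10(p_signal / p_noise)

# Hypothetical beamformer outputs: noise amplitude is one tenth of the signal's.
clean_out = [1.0, -1.0, 1.0, -1.0]
noise_out = [0.1, -0.1, 0.1, -0.1]
snr_db = output_snr_db(clean_out, noise_out)
```

An amplitude ratio of 10 corresponds to a power ratio of 100, so the result here is 20 dB up to floating-point rounding.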
Experimental Results: Speech Enhancement Evaluation
(A) - Speaker position 1
(B) - Speaker position 2
(C) - Speaker position 3
Proposed Extension: Null Steering Beamformer
In null steering, the array weights are selected so that the beam pattern has a null in a given direction.
(A) - Initial beam pattern.
(B) - Null steered beam pattern.
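One standard way to obtain such weights, shown here only as an illustration and not necessarily the method used in the presentation, is the minimum-norm solution of two linear constraints: unit response toward the speaker, $w^H d_s = 1$, and zero response toward the interference, $w^H d_i = 0$. The sketch assumes NumPy and steering vectors built directly from TDOAs; the function names and TDOA values are hypothetical.

```python
import numpy as np

def steering(f, taus):
    """Steering vector built from TDOAs only: d_i = exp(-j 2 pi f tau_i)."""
    return np.exp(-2j * np.pi * f * np.asarray(taus))

def null_steering_weights(d_s, d_i):
    """Minimum-norm w with w^H d_s = 1 and w^H d_i = 0: w = C (C^H C)^-1 g."""
    C = np.column_stack((d_s, d_i))      # constraint matrix, one column per direction
    g = np.array([1.0, 0.0])             # desired responses: pass speaker, null interferer
    return C @ np.linalg.solve(C.conj().T @ C, g)

# Hypothetical TDOAs (seconds) for a 4-microphone array at 1 kHz.
f = 1000.0
d_s = steering(f, [0.0, 1e-4, 2e-4, 3e-4])       # speaker
d_i = steering(f, [0.0, -2e-4, -1e-4, 4e-4])     # interference
w = null_steering_weights(d_s, d_i)

speaker_response = np.vdot(w, d_s)        # w^H d_s
interference_response = np.vdot(w, d_i)   # w^H d_i
```

When the two steering vectors become nearly parallel, the matrix $C^H C$ becomes ill-conditioned and the weights grow large, which is one way to see why closely spaced speaker and interference positions are harder to handle.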
Proposed Extension: Goal
Apply a null steering beamformer blindly to an ad-hoc array so that the SNR improves without noticeable distortion.
- Analyze and compare various null steering methods for delay-and-sum and for MVDR beamformers.
- Examine the effect of clustering when applying null steering.
Proposed Extension
- In addition to the direction of the speaker, null steering requires knowledge of the direction of the interference.
- Blind null steering will use only TDOA measurements for both the speaker and the interference.
- A voice activity detector (VAD) is required in order to measure the TDOA of the speaker and of the interference.
Proposed Extension: Initial Results
Speaker position 1 - Speaker and interference are well separated. Delay-and-sum beamformer.
Proposed Extension: Initial Results
Speaker position 3 - Speaker and interference are close; the speaker's signal is distorted in the lower frequencies. Delay-and-sum beamformer.
Proposed Extension: Future Work
1 MVDR blind null steering.
2 Compare the performance of different null steering methods.
3 Minimize signal distortion by modifying the null constraint.
4 Evaluate the effect of errors in the TDOA measurements.
5 Examine the effect of clustering when applying null steering.
References
- I. Himawan, I. McCowan, and S. Sridharan, “Clustered blind beamforming from ad-hoc microphone arrays,” IEEE Transactions on Audio, Speech, and Language Processing, vol. 19, no. 4, pp. 661–676, 2011.
- C. Knapp and G. Carter, “The generalized correlation method for estimation of time delay,” IEEE Transactions on Acoustics, Speech and Signal Processing, vol. 24, no. 4, pp. 320–327, 1976.
- J. Shi and J. Malik, “Normalized cuts and image segmentation,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 22, no. 8, pp. 888–905, 2000.
- G. Lathoud, J.-M. Odobez, and D. Gatica-Perez, “AV16.3: An audio-visual corpus for speaker localization and tracking,” in Proceedings of the 2004 MLMI Workshop, S. Bengio and H. Bourlard, Eds. Springer Verlag, 2005.
- E. A. P. Habets, “Room impulse response generator,” Technical Report, pp. 1–21, 2006.
Questions?