An Introduction to Blind Source Separation
Kenny Hild
Sept. 19, 2001
Problem Statement
• Communication System
  • Transmitter
  • Medium
  • Receiver
1. Data sent is unknown
2. Transfer function of medium may be unknown
3. Interference
Possible Solutions
• Beamforming
  • Uses geometric information
  • Steer antenna array to a desired angle of arrival
• Filtering
  • Separate based on frequency information
• Blind Source Separation, BSS
  • Statistical beamforming
  • Steer antenna array to directions based on statistics
Beamforming
• Suppose
  • Direction of arrival is 0° azimuth
  • $s_1(n) = s_2(n) = \cos(\omega n)$
  • Transfer functions are pure delays
• Then
  • $y(n) = x_1(n) + x_2(n - \tau)$, $\tau = 0$
  • $y(n) = \cos(\omega n) + \cos(\omega n) + \cos(\omega n + \phi_1) + \cos(\omega n + \phi_2)$
  • $y(n) = 2\cos(\omega n) + 2\cos((\phi_1 - \phi_2)/2)\cos(\omega n + (\phi_1 + \phi_2)/2)$
[Figure: sources s1(n), s2(n) and sensor signals x1(n), x2(n)]
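The beamformer sum can be checked numerically. This is a minimal sketch under an assumed model that the slide does not spell out: the desired source s1 arrives in phase at both sensors, and the interfering source s2 reaches them with phase shifts phi1 and phi2 caused by the pure delays. The function names and phase symbols are illustrative.

```python
import math

def sensor_signals(omega, phi1, phi2, n):
    """Return (x1, x2) at sample n for the assumed two-sensor model."""
    x1 = math.cos(omega * n) + math.cos(omega * n + phi1)
    x2 = math.cos(omega * n) + math.cos(omega * n + phi2)
    return x1, x2

def beamformer_output(omega, phi1, phi2, n):
    """Delay-and-sum output y(n) = x1(n) + x2(n - tau) with tau = 0."""
    x1, x2 = sensor_signals(omega, phi1, phi2, n)
    return x1 + x2

def closed_form(omega, phi1, phi2, n):
    """2cos(wn) + 2cos((phi1 - phi2)/2) cos(wn + (phi1 + phi2)/2)."""
    return (2.0 * math.cos(omega * n)
            + 2.0 * math.cos((phi1 - phi2) / 2.0)
            * math.cos(omega * n + (phi1 + phi2) / 2.0))

def interferer_gain(phi1, phi2):
    """Amplitude of the interfering term after the two sensors are summed."""
    return abs(2.0 * math.cos((phi1 - phi2) / 2.0))
```

The desired source always adds with amplitude 2, while the interferer is scaled by 2cos((phi1 - phi2)/2); it cancels completely when the two phases differ by pi.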
Filtering
• Suppose
  • Signal is low pass, noise is white
  • Signals are bandpass
• Then
  • Design LPF to remove high frequencies
[Figure: spectra of the low-pass signal and the white noise; x-axis: Frequency (normalized rad/s)]
[Figure: spectra of two bandpass signals, signal #1 and signal #2]
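A rough illustration of the low-pass case: a slow sinusoid in white Gaussian noise, smoothed with a centered moving-average filter. The moving average is only one simple choice of LPF, picked here for brevity; the filter length, signal frequency, and seed are arbitrary assumptions.

```python
import math
import random

def moving_average(x, length):
    """Centered moving-average FIR low-pass filter (odd length).

    Returns only the samples where the full window fits.
    """
    half = length // 2
    return [sum(x[n - half:n + half + 1]) / length
            for n in range(half, len(x) - half)]

def mean_squared_error(a, b):
    """Average squared difference between two equal-length sequences."""
    return sum((u - v) ** 2 for u, v in zip(a, b)) / len(a)

rng = random.Random(0)           # fixed seed so the run is repeatable
num_samples, length = 2000, 9    # illustrative values
half = length // 2

signal = [math.cos(0.05 * n) for n in range(num_samples)]   # low-pass signal
noisy = [s + rng.gauss(0.0, 1.0) for s in signal]           # plus white noise

filtered = moving_average(noisy, length)
reference = signal[half:num_samples - half]                 # aligned with filtered

mse_before = mean_squared_error(noisy[half:num_samples - half], reference)
mse_after = mean_squared_error(filtered, reference)
```

Averaging over 9 samples leaves the slow sinusoid nearly untouched and cuts the white-noise power by roughly a factor of 9, so mse_after should come out well below mse_before.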
Assumptions
• Signals
  • Overlap in time
  • Angle of arrival is unknown – prevents beamforming
  • Overlap in frequency – prevents filtering
• Blind Source Separation
  • Does not assume knowledge of DOA
  • Does not require signals to be separable in frequency domain
Applications
• Early diagnosis of pathology in fetus
  • Each EKG sensor contains a mixture of signals
  • Desire is to separate out fetus’ heartbeat
• Hearing aids
  • Speech discrimination difficult with multiple speakers
  • The observations are the signals at each ear
• Cellular communications
  • CDMA signals utilize overlapping frequency ranges
  • Additional signals, multi-path deteriorate performance
Types of Mixtures
• Memory
  • Instantaneous
  • Convolutive
• Noise
• Linearity
• Over/under-determined
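The difference between the two memory types can be sketched as follows, assuming noise-free linear mixing and FIR mixing filters for the convolutive case. The function names and data layout are illustrative.

```python
def mix_instantaneous(H, s):
    """Instantaneous mixture x(n) = H s(n).

    H is a list of rows of constants; s is a list of source sequences.
    """
    num_samples = len(s[0])
    return [[sum(H[i][j] * s[j][n] for j in range(len(s)))
             for n in range(num_samples)]
            for i in range(len(H))]

def mix_convolutive(H, s):
    """Convolutive mixture x_i(n) = sum_j sum_k H[i][j][k] * s_j(n - k).

    H[i][j] is the FIR impulse response from source j to sensor i.
    Samples before n = 0 are taken to be zero.
    """
    num_samples = len(s[0])
    x = []
    for i in range(len(H)):
        row = []
        for n in range(num_samples):
            acc = 0.0
            for j in range(len(s)):
                for k, h in enumerate(H[i][j]):
                    if n - k >= 0:
                        acc += h * s[j][n - k]
            row.append(acc)
        x.append(row)
    return x
```

With single-tap filters the convolutive mixture reduces to the instantaneous one; longer impulse responses make each sensor depend on past source samples as well.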
Components of Adaptive Filter
• Topology
  • Instantaneous
  • Convolutive
• Criterion
• Optimization method
  • Gradient descent
  • Fixed point
Topology
• Over-determined, linear mixture
  • N > M
  • H, W are matrices of ARMA filters
• Types of topologies
  • Frequency-domain
  • Time-domain
    • Feedforward
    • Feedback
    • Lattice
Topology
• For Instantaneous Mixtures
  • H, W are matrices of constants
• Often W is broken down into 2-3 operations
  • Dimension reduction, (N x M) matrix D
  • Spatial whitening, (N x N) matrix W
  • Rotations, (N x N) matrix R
W = RWD (N x M), where the W on the right-hand side is the (N x N) whitening matrix
x = Hs (M x 1)
y = Wx = WHs (N x 1)
Topology
• Spatial whitening
  • Makes outputs uncorrelated
  • This is insufficient
• For separation
  • 4 possible rotations
[Figure: scatter plot; labels: s1(n), s2(n), y1(n), y2(n)]
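Why whitening alone is insufficient can be seen with a small experiment. The sketch below assumes two independent, uniformly distributed, unit-variance sources (built from a grid so that every moment is exact) and rotates them by an angle; the excess kurtosis of one output serves as a simple non-Gaussianity measure. All names are illustrative.

```python
import math

def grid_sources(points=21):
    """All pairs from a symmetric grid: two independent, zero-mean,
    unit-variance, uniformly distributed sources."""
    values = [-1.0 + 2.0 * i / (points - 1) for i in range(points)]
    scale = math.sqrt(sum(v * v for v in values) / points)
    values = [v / scale for v in values]
    return [(a, b) for a in values for b in values]

def rotate(samples, theta):
    """Apply a 2 x 2 rotation by theta (radians) to every sample pair."""
    c, s = math.cos(theta), math.sin(theta)
    return [(c * a + s * b, -s * a + c * b) for a, b in samples]

def correlation(samples):
    """Sample estimate of E[y1 * y2] for zero-mean data."""
    return sum(a * b for a, b in samples) / len(samples)

def excess_kurtosis(values):
    """E[y^4] / E[y^2]^2 - 3 for zero-mean data (zero for a Gaussian)."""
    m2 = sum(v ** 2 for v in values) / len(values)
    m4 = sum(v ** 4 for v in values) / len(values)
    return m4 / m2 ** 2 - 3.0

def summary(degrees):
    """Return (correlation of outputs, excess kurtosis of y1) for a rotation."""
    y = rotate(grid_sources(), math.radians(degrees))
    return correlation(y), excess_kurtosis([a for a, _ in y])
```

Every rotation leaves the outputs uncorrelated, so second-order statistics cannot tell the angles apart. The kurtosis magnitude, however, is largest at 0 and 90 degrees (and, by symmetry, 180 and 270), which is consistent with the four separating rotations in two dimensions, and drops to half that value at 45 degrees.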
Criterion
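• Spatial whitening
  • $\tilde{x} = W x$
  • $E[\tilde{x}\tilde{x}^T] = I_N$
  • $W = \Lambda_x^{-1/2} \Phi_x^T$
  • $J = \sum_{i,j} \left(R_{\tilde{x}}(i,j) - I_N(i,j)\right)^2$, where $R_{\tilde{x}} = E[\tilde{x}\tilde{x}^T]$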
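A small 2 x 2 sketch of spatial whitening, assuming the whitening matrix is built from the eigendecomposition of the observation covariance (eigenvalues Lambda, eigenvectors Phi) and that the criterion J is the squared distance between the output covariance and the identity. The mixing matrix H and the grid-based sources are arbitrary illustrative choices.

```python
import math

def covariance(samples):
    """Entries (a, b, c) of the covariance [[a, b], [b, c]] of zero-mean data."""
    n = len(samples)
    a = sum(x1 * x1 for x1, _ in samples) / n
    b = sum(x1 * x2 for x1, x2 in samples) / n
    c = sum(x2 * x2 for _, x2 in samples) / n
    return a, b, c

def whitening_matrix(a, b, c):
    """W = Lambda^(-1/2) Phi^T from the eigendecomposition of [[a, b], [b, c]]."""
    theta = 0.5 * math.atan2(2.0 * b, a - c)      # orientation of the eigenvectors
    cs, sn = math.cos(theta), math.sin(theta)
    lam1 = a * cs * cs + 2.0 * b * cs * sn + c * sn * sn
    lam2 = a * sn * sn - 2.0 * b * cs * sn + c * cs * cs
    return [[cs / math.sqrt(lam1), sn / math.sqrt(lam1)],
            [-sn / math.sqrt(lam2), cs / math.sqrt(lam2)]]

def apply(matrix, samples):
    """Multiply every sample pair by a 2 x 2 matrix."""
    return [(matrix[0][0] * x1 + matrix[0][1] * x2,
             matrix[1][0] * x1 + matrix[1][1] * x2) for x1, x2 in samples]

def whitening_cost(samples):
    """J = sum_ij (R(i,j) - I(i,j))^2 for the covariance R of the samples."""
    a, b, c = covariance(samples)
    return (a - 1.0) ** 2 + 2.0 * b ** 2 + (c - 1.0) ** 2

# Zero-mean sources on a symmetric grid, mixed by an arbitrary 2 x 2 matrix H.
grid = [i / 10.0 for i in range(-10, 11)]
sources = [(s1, s2) for s1 in grid for s2 in grid]
H = [[1.0, 0.6], [0.4, 1.0]]
x = apply(H, sources)

W = whitening_matrix(*covariance(x))
x_white = apply(W, x)
cost_before = whitening_cost(x)
cost_after = whitening_cost(x_white)
```

After whitening, J is zero up to rounding error, yet x_white is in general still a rotated version of the sources, not the sources themselves.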
Criterion
• Indeterminacies
  • Gain
  • Permutation
• Rotations
  • Find characteristic of sources that is not true for any mixture
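The gain and permutation indeterminacies can be made concrete: rescaling and reordering the sources, while compensating in the mixing matrix, produces exactly the same observations. The matrices below are hypothetical values chosen only for illustration.

```python
def matmul(A, B):
    """Product of two matrices stored as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def inverse_2x2(A):
    """Inverse of a 2 x 2 matrix with nonzero determinant."""
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    return [[A[1][1] / det, -A[0][1] / det],
            [-A[1][0] / det, A[0][0] / det]]

H = [[1.0, 0.6], [0.4, 1.0]]          # hypothetical mixing matrix
s = [[0.3, -1.2, 0.8],                # source 1, three samples
     [1.5, 0.1, -0.7]]                # source 2, three samples

P = [[0.0, 1.0], [1.0, 0.0]]          # permutation: swap the two sources
G = [[2.0, 0.0], [0.0, -0.5]]         # gains, including a sign flip
PG = matmul(P, G)

s_alt = matmul(PG, s)                 # permuted and rescaled sources
H_alt = matmul(H, inverse_2x2(PG))    # compensating mixing matrix

x = matmul(H, s)                      # observations from the original pair
x_alt = matmul(H_alt, s_alt)          # observations from the alternative pair
```

Since x and x_alt coincide, no method that sees only the observations can recover the original scale or ordering of the sources.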
Criterion
• Nullify correlations
  • Between nonlinear functions of the outputs
  • Nonlinearity can be almost any odd function
    • Cubic
    • Hyperbolic tangent
  • Requires source pdf’s to be even-symmetric
• Non-linear PCA
  • If data is sphered, stable points are ICA solution
  • Minimizes joint entropy of nonlinear functions of outputs
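A sketch of the nonlinear-correlation idea, assuming two independent, even-symmetric, unit-variance sources on a grid and outputs that are a rotation of them. The quantity computed is the sample estimate of E[f(y1) y2] for an odd nonlinearity f; the rotation angle of pi/8 is an arbitrary example of a mixed (but still whitened) state.

```python
import math

def unit_variance_grid(points=21):
    """Symmetric, zero-mean grid of values scaled to unit variance."""
    values = [i - (points - 1) / 2.0 for i in range(points)]
    scale = math.sqrt(sum(v * v for v in values) / points)
    return [v / scale for v in values]

def rotated_outputs(theta, points=21):
    """Outputs (y1, y2) obtained by rotating two independent grid sources."""
    grid = unit_variance_grid(points)
    c, s = math.cos(theta), math.sin(theta)
    return [(c * a + s * b, -s * a + c * b) for a in grid for b in grid]

def nonlinear_correlation(outputs, f):
    """Sample estimate of E[f(y1) * y2]."""
    return sum(f(y1) * y2 for y1, y2 in outputs) / len(outputs)

def cubic(y):
    return y ** 3

def identity(y):
    return y

separated_cubic = nonlinear_correlation(rotated_outputs(0.0), cubic)
separated_tanh = nonlinear_correlation(rotated_outputs(0.0), math.tanh)
mixed_linear = nonlinear_correlation(rotated_outputs(math.pi / 8), identity)
mixed_cubic = nonlinear_correlation(rotated_outputs(math.pi / 8), cubic)
```

At the separating solution both nonlinear correlations vanish. At the mixed angle the ordinary correlation is still zero, but the cubic correlation is not, which is what gives the criterion something to drive toward zero.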
Criterion
• Cancellation of HOS
  • 4th-order (kurtosis) is most common
  • If y1, y2, y3, y4 can be separated into 2 groups that are mutually independent, 4th-order cumulant is zero
  • Must check all 4th-order cumulants
  • Statistical properties of cumulant estimators are poor
• Central limit theorem
  • Sum of independent, non-Gaussian sources approaches Gaussian
  • Maximize (K-L) distance between marginal pdf and Gaussian
  • Must know/estimate the kurtosis for each source
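For reference, the standard definitions behind these bullets, written for zero-mean variables. These are textbook formulas rather than expressions taken from the slide; the symbols a, b, s_1, s_2 are illustrative.

```latex
\begin{align}
\mathrm{cum}(y_1,y_2,y_3,y_4) &= E[y_1 y_2 y_3 y_4] - E[y_1 y_2]E[y_3 y_4]
  - E[y_1 y_3]E[y_2 y_4] - E[y_1 y_4]E[y_2 y_3] \\
\mathrm{kurt}(y) &= \mathrm{cum}(y,y,y,y) = E[y^4] - 3\left(E[y^2]\right)^2 \\
\mathrm{kurt}(a s_1 + b s_2) &= a^4\,\mathrm{kurt}(s_1) + b^4\,\mathrm{kurt}(s_2)
  \quad \text{for independent } s_1, s_2
\end{align}
```

If the four variables split into two mutually independent groups, the first term factors into products that cancel against the remaining terms, so the cumulant is zero. A Gaussian has zero kurtosis, so for unit-variance outputs the magnitude of the kurtosis measures distance from Gaussianity.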
Criterion
• Maximum Likelihood
  • Must know/assume source distributions
  • Minimize K-L divergence between output pdf’s and known/assumed source pdf’s
  • Sensitive to outliers, model mismatch
• Maximize the information flow
  • Maximize joint entropy of outputs (of the nonlinearities)
  • Nonlinearities should be source cdf’s
  • Equivalent to maximum likelihood
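The equivalence can be sketched in two lines, assuming a square, invertible demixing matrix W with y = Wx, nonlinear outputs z_i = g_i(y_i), assumed source pdfs p_i, and T observed samples. The symbols z, g_i, p_i, and T are introduced here for illustration.

```latex
\begin{align}
H(z) &= H(x) + \log\lvert\det W\rvert + \sum_i E\left[\log g_i'(y_i)\right] \\
\frac{1}{T}\log L(W) &\approx \log\lvert\det W\rvert + \sum_i E\left[\log p_i(y_i)\right]
\end{align}
```

When each g_i is the cdf of source i, g_i' = p_i and the two expressions differ only by H(x), which does not depend on W. Maximizing the joint entropy of the nonlinear outputs and maximizing the likelihood then select the same W.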
Criterion
• Mutual statistical independence
  • Oftentimes sources are independent
  • Uncorrelatedness does not imply independence
  • Canonical criterion
• Difficult to estimate
  • Solution includes an infinite-limit integral
  • Marginal pdf’s estimated by truncated expansion about Gaussian
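That uncorrelatedness does not imply independence can be shown with a two-line construction: take a symmetric variable y1 and let y2 be its centered square. The sketch below uses a deterministic grid of values; the variable names are illustrative.

```python
def mean(values):
    """Arithmetic mean of a sequence."""
    return sum(values) / len(values)

# Symmetric, zero-mean samples of y1; y2 is a deterministic function of y1.
y1 = [i / 10.0 for i in range(-10, 11)]
offset = mean([v * v for v in y1])
y2 = [v * v - offset for v in y1]          # zero-mean by construction

# Second-order test: the correlation E[y1 y2] vanishes.
correlation = mean([a * b for a, b in zip(y1, y2)])

# Independence would require E[f(y1) g(y2)] = E[f(y1)] E[g(y2)] for all f, g.
# With f(y) = y^2 and g(y) = y the two sides differ.
lhs = mean([a * a * b for a, b in zip(y1, y2)])
rhs = mean([a * a for a in y1]) * mean(y2)
```

Here correlation and rhs are zero up to rounding while lhs is clearly positive, so y1 and y2 are uncorrelated but dependent. This is why a separation criterion must look beyond second-order statistics.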