
Independent Component Analysis via Distance Covariance

David S. Matteson
Department of Statistical Science, Cornell University
[email protected] · www.stat.cornell.edu/~matteson

Joint work with: Ruey S. Tsay, Booth School of Business, University of Chicago
Sponsorship: National Science Foundation

2014 June 3

Introduction

- Notation
- Measuring Mutual Multivariate Independence
  - A Pairwise Measure: Distance Covariance
  - Extending the Pairwise Measure
  - An Empirical Measure: U-statistics
- Independent Component Analysis
  - Estimation of Independent Components: dCovICA
  - Strong Consistency
  - Testing for the Existence of Independent Components
- Examples

Notation

- Let ⟨t, s⟩ denote the scalar product of vectors t, s ∈ R^d.
- For complex-valued functions φ(·):
  - The complex conjugate of φ is denoted by φ̄.
  - The absolute square |φ|² is defined as φφ̄.
- The Euclidean norm of x ∈ R^d is denoted simply by |x|.
- Primed variables X′ and X″ are each an independent copy of X:
  - X, X′, and X″ are independent and identically distributed (iid).

Characteristic Functions and Mutual Independence

- A random vector X ∈ R^d with distribution F_x has a unique characteristic function

    φ_x(t) = E(e^{i⟨t,X⟩}) for t ∈ R^d

- The characteristic function always exists: e^{i⟨t,X⟩} is bounded and |φ_x(t)| ≤ 1 for all t.
- Let X_1, …, X_c denote a given partition of the elements of X into c components, for some 2 ≤ c ≤ d, and let t = (t_1, …, t_c).
- Then X_1, …, X_c are mutually independent if and only if, for all t,

    φ_x(t) = E(e^{i Σ_{j=1}^c ⟨t_j, X_j⟩}) = E(∏_{j=1}^c e^{i⟨t_j, X_j⟩}) = ∏_{j=1}^c E(e^{i⟨t_j, X_j⟩}) = φ_{x_1}(t_1) ⋯ φ_{x_c}(t_c)


A Pairwise Measure of Multivariate Independence

For random vectors X ∈ R^{d_x} and Y ∈ R^{d_y}:
- φ_{xy} denotes the joint characteristic function of (X, Y)
- φ_x and φ_y denote the marginal characteristic functions of X and Y

A pairwise measure of multivariate independence may be defined as

    V²(X, Y; w) = ∫_{R^{d_x + d_y}} |φ_{xy}(t, s) − φ_x(t) φ_y(s)|² w(t, s) dt ds,

in which w(t, s) denotes an arbitrary positive weight function. We consider

    w*(t, s) = ( [π^{(1+d_x)/2} / Γ((1 + d_x)/2)] [π^{(1+d_y)/2} / Γ((1 + d_y)/2)] |t|^{1+d_x} |s|^{1+d_y} )^{−1},

in which Γ(·) is the complete gamma function.


Distance Covariance

Suppose (X, Y), (X′, Y′), (X″, Y″) are iid from F_{x,y}. Let

    I(X, Y) = E|X − X′||Y − Y′| − E|X − X′||Y − Y″| − E|X − X″||Y − Y′| + E|X − X′| E|Y − Y′|

Theorem. For any pair of random vectors X ∈ R^{d_x} and Y ∈ R^{d_y}:
- If E(|X| + |Y|) < ∞,
- then I(X, Y) = V²(X, Y; w*),
- and I(X, Y) ∈ [0, ∞), with I(X, Y) = 0 if and only if X ⊥⊥ Y.

See Szekely and Rizzo (2007) and Matteson (2014).


An Empirical Pairwise Multivariate Measure

Let (X, Y) = {(X_i, Y_i) : i = 1, …, n} be an iid sample from the joint distribution of random vectors X ∈ R^{d_x} and Y ∈ R^{d_y}, with E(|X| + |Y|) < ∞.

Define I_n(X, Y) = T_{1,n} + T_{2x,n} T_{2y,n} − T_{3,n}, in which

    T_{1,n}  = (n choose 2)^{−1} Σ_{i<j} |X_i − X_j||Y_i − Y_j|,
    T_{2x,n} = (n choose 2)^{−1} Σ_{i<j} |X_i − X_j|,
    T_{2y,n} = (n choose 2)^{−1} Σ_{i<j} |Y_i − Y_j|,
    T_{3,n}  = (n choose 3)^{−1} Σ_{i<j<k} ( |X_i − X_j||Y_i − Y_k| + |X_i − X_k||Y_i − Y_j| ).
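The statistic above translates directly into code. Below is a minimal R sketch of I_n(X, Y) as displayed on this slide; the function name dcov_In and the cubic-time loop for T_{3,n} are illustrative choices, not the authors' implementation.

```r
# Minimal sketch of the empirical pairwise measure I_n(X, Y) defined above.
# Written to mirror the displayed sums; the O(n^3) loop is for clarity, not speed.
dcov_In <- function(X, Y) {
  X <- as.matrix(X); Y <- as.matrix(Y)
  n <- nrow(X)
  A <- as.matrix(dist(X))              # pairwise Euclidean distances |X_i - X_j|
  B <- as.matrix(dist(Y))              # pairwise Euclidean distances |Y_i - Y_j|
  up  <- upper.tri(A)                  # index pairs with i < j
  T1  <- sum((A * B)[up]) / choose(n, 2)
  T2x <- sum(A[up]) / choose(n, 2)
  T2y <- sum(B[up]) / choose(n, 2)
  T3  <- 0
  for (i in 1:(n - 2)) for (j in (i + 1):(n - 1)) for (k in (j + 1):n) {
    T3 <- T3 + A[i, j] * B[i, k] + A[i, k] * B[i, j]
  }
  T3 <- T3 / choose(n, 3)
  T1 + T2x * T2y - T3
}

# Example: an independent pair should give a value near zero
set.seed(1)
dcov_In(rnorm(200), rnorm(200))
```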

An Alternative Measure

- I_n(X, Y) depends on the marginal distributions.
- Apply the probability integral transformation (PIT): with marginal CDFs F_{x_i}, F_{y_j} : R → [0, 1], define U_i = F_{x_i}(X_i) and V_j = F_{y_j}(Y_j); then I(U, V) = 0 iff X and Y are independent.
- The F are unknown:
  - Use estimates F̂, e.g. marginal ranks (a sketch follows below).
  - Let Û_i = F̂_{x_i}(X_i) and V̂_j = F̂_{y_j}(Y_j).

Lemma. I_n(Û, V̂) → I(U, V) almost surely, and under H_0 : X ⊥⊥ Y, n I_n(Û, V̂) converges in distribution to a random variable.
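A rank-based version of the transformation is easy to sketch in R; the helper name pit_ranks and the convention of dividing ranks by n are illustrative assumptions, not the authors' exact choice.

```r
# Illustrative rank-based PIT: map each coordinate to (approximately) uniform
# values, then apply the empirical measure from the previous sketch.
pit_ranks <- function(X) {
  X <- as.matrix(X)
  apply(X, 2, function(x) rank(x, ties.method = "average") / length(x))
}

set.seed(2)
X <- rexp(300); Y <- X^2 + rnorm(300)     # a dependent pair
dcov_In(pit_ranks(X), pit_ranks(Y))       # noticeably larger than under independence
```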

A Simultaneous Test for Multivariate Serial Dependence

- Assuming y_t ∈ R^d is strictly stationary and E|y_t| < ∞, I(y_t, y_{t−k}) measures lag-k multivariate serial dependence.
- Let Y_{t−k} = {y_{t−1}, …, y_{t−k}}; then I(y_t, Y_{t−k}) jointly measures multivariate serial dependence up to lag k.
- Joint hypothesis for multivariate serial dependence:

    H_0 : φ_{y_t, Y_{t−k}} = φ_{y_t} φ_{Y_{t−k}}
        ⟺ (by stationarity) φ_{y_t, y_{t−1}, …, y_{t−k}} = φ_{y_t} φ_{y_{t−1}} ⋯ φ_{y_{t−k}}
        ⟺ (by the PIT) φ_{u_t, u_{t−1}, …, u_{t−k}} = φ_{u_t} φ_{u_{t−1}} ⋯ φ_{u_{t−k}}

- We define our test statistic as

    Q_d(Y, k) = (n − k) I_n(û_t, Û_{t−k})

- Approximate p-values via permutation (a sketch follows below).
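Under the setup above, one way to prototype Q_d(Y, k) with a permutation p-value is sketched below. The use of embed() for the lagged block, the wrapper name Qd_perm, and the particular permutation scheme (shuffling whole rows of the current block against the lagged block) are illustrative assumptions, not necessarily the authors' procedure.

```r
# Illustrative sketch of Q_d(Y, k) with a permutation p-value.
# Assumes dcov_In() and pit_ranks() from the earlier sketches.
Qd_perm <- function(Y, k, N = 199) {
  U <- pit_ranks(Y)                       # coordinate-wise PIT
  d <- ncol(U); n <- nrow(U)
  E <- embed(U, k + 1)                    # row t: (u_t, u_{t-1}, ..., u_{t-k})
  u_t   <- E[, 1:d, drop = FALSE]         # current values
  U_lag <- E[, -(1:d), drop = FALSE]      # lagged block
  stat <- (n - k) * dcov_In(u_t, U_lag)
  perm <- replicate(N, {
    idx <- sample(nrow(u_t))
    (n - k) * dcov_In(u_t[idx, , drop = FALSE], U_lag)
  })
  list(statistic = stat, p.value = mean(c(perm, stat) >= stat))
}
```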

Asymptotic Distribution

Let ζ_k(a, b) denote a mean-zero complex Gaussian process with covariance function

    R_k(c, c₀) = ( φ_u(a − a₀) − φ_u(a) φ_u(a₀) ) ( φ_{u_k}(b − b₀) − φ_{u_k}(b) φ_{u_k}(b₀) )

for c = (a, b), c₀ = (a₀, b₀) ∈ R × R^k.

- φ_u is the characteristic function of u_t, and φ_{u_k} = φ_u ⋯ φ_u.
- Under H_0,

    Q_d(Y, k) → ||ζ_k(a, b)||²_ω  in distribution.

- For univariate time series (d = 1) this is an asymptotically distribution-free test.

Seasonally Adjusted Monthly Unemployment Rates (%)
CA, FL, IL, MI, OH, & WI, from January 1976 through August 2010

[Figure: time series plot of the monthly unemployment rate (%) for each of the six states versus year, 1976-2010.]

Standardized Change in Monthly Unemployment Rate (%)
First difference series, scaled by monthly standard deviations

[Figure: time series plot of the standardized monthly changes for each of the six states versus year, 1976-2010.]

Testing for Serial Dependence

- Transform to a stationary series y_t: the first difference series, scaled by monthly standard deviations.
- Q_6(Y, k = 12) = 30.92 with p-value ≈ 0.
- Fit a vector autoregression of order three via OLS (a sketch follows below):

    y_t = β_0 + β_1 y_{t−1} + β_2 y_{t−2} + β_3 y_{t−3} + e_t

- Calculate the residuals ê_t.
- Q_6(Ê, k = 12) = 0.11 with p-value ≈ 0.1.
- ⇒ The linear model is sufficient.
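Fitting the VAR(3) by ordinary least squares is straightforward in base R; the use of embed() and lm() and the wrapper name var3_ols below are an illustrative sketch (assuming y is the stationary, standardized difference series), not the authors' code.

```r
# Illustrative VAR(3) fit by OLS for a multivariate series y (n x d matrix),
# returning the residual matrix used in the serial-dependence test.
var3_ols <- function(y) {
  y <- as.matrix(y)
  Z <- embed(y, 4)                  # row t: (y_t, y_{t-1}, y_{t-2}, y_{t-3})
  d <- ncol(y)
  resp <- Z[, 1:d]                  # y_t
  regr <- Z[, -(1:d)]               # stacked lags y_{t-1}, y_{t-2}, y_{t-3}
  fit <- lm(resp ~ regr)            # one OLS equation per column of resp
  list(coef = coef(fit), residuals = residuals(fit))
}
# e.g. E <- var3_ols(y)$residuals; Qd_perm(E, k = 12)
```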

Residual Series, Monthly Unemployment Rate (%)
No significant serial dependence

[Figure: time series plot of the VAR(3) residuals for each of the six states versus year, 1976-2010.]

Residual Distribution: Univariate and Bivariate

[Figure: scatterplot matrix of the VAR(3) residuals with univariate margins on the diagonal; panels labeled MI, OH, CA, and WI.]

Extending the Pairwise Independence Measure

For a random vector X ∈ R^d:
- Let X_1, …, X_c denote a partition of X into 2 ≤ c ≤ d components.
- Let X_{k+} = (X_{k+1}, …, X_c), t = (t_1, …, t_c), and t_{k+} = (t_{k+1}, …, t_c).
- Define

    I(X) = Σ_{k=1}^{c−1} I(X_k, X_{k+})

Theorem. For any random vector X ∈ R^d with disjoint components X_1, …, X_c, if E|X| < ∞, then I(X) ∈ [0, ∞), with I(X) = 0 if and only if X_1, …, X_c are mutually independent.

See Matteson (2014), and note that for every t ∈ R^d

    |φ_x(t) − φ_{x_1}(t_1) ⋯ φ_{x_c}(t_c)| ≤ Σ_{k=1}^{c−1} |φ_{x_k, x_{k+}}(t_k, t_{k+}) − φ_{x_k}(t_k) φ_{x_{k+}}(t_{k+})|
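The empirical counterpart simply sums the pairwise statistic over the nested splits. The sketch below treats each column of a matrix S as one (univariate) component; that choice, and the function name mutual_In, are illustrative assumptions.

```r
# Illustrative empirical mutual-independence measure for a matrix S whose d
# columns are treated as the components X_1, ..., X_d.
mutual_In <- function(S) {
  S <- as.matrix(S)
  d <- ncol(S)
  total <- 0
  for (k in 1:(d - 1)) {
    total <- total + dcov_In(S[, k], S[, (k + 1):d, drop = FALSE])
  }
  total
}
```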

Independent Component Analysis

For iid vector observations y_t, assume independent components (ICs) s_t exist, such that

    y_t = M s_t

- M denotes the mixing matrix.
- The validity of this assumption is tested later.

For simplicity:
- O, an uncorrelating matrix
- z_t = O y_t, uncorrelated observations

Then

    s_t = M^{−1} y_t = M^{−1} O^{−1} z_t ≡ W z_t,

in which W = M^{−1} O^{−1} is referred to as the separating matrix.
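The slides do not say which uncorrelating (whitening) matrix is used; a common choice, shown here purely as an assumption, is the symmetric inverse square root of the sample covariance.

```r
# Illustrative whitening step: center y and map it to uncorrelated coordinates z,
# using the symmetric inverse square root of the sample covariance as O.
whiten <- function(y) {
  y <- scale(as.matrix(y), center = TRUE, scale = FALSE)
  eig <- eigen(cov(y), symmetric = TRUE)
  O <- eig$vectors %*% diag(1 / sqrt(eig$values)) %*% t(eig$vectors)
  list(z = y %*% t(O), O = O)       # Cov(z) is (approximately) the identity
}
```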

Assumptions

- y_t = (y_{1t}, …, y_{dt})ᵀ is a d-dimensional random vector.
- y_t has a continuous distribution function.
- y_t are iid from F_y.
- E|y_t|² < ∞ and E(y_t) = 0.
- s_t = (s_{1t}, …, s_{dt})ᵀ is a random vector of ICs.
- E(s_{it}) = 0 and Var(s_{it}) = 1, for all i.
- The separating matrix W is orthogonal:

    I = Cov(s_t) = W Cov(z_t) Wᵀ = W Wᵀ

- W is parameterized by a vector θ of p = d(d − 1)/2 rotation angles, written W_θ (a sketch follows below).
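One concrete way to realize W_θ is as a product of Givens (plane) rotations, one angle per coordinate pair. This parameterization is a standard choice and is shown here as an assumption, since the slides state only the number of angles.

```r
# Illustrative Givens-rotation parameterization of an orthogonal d x d matrix
# from p = d(d-1)/2 angles theta, one angle per coordinate pair (i, j).
givens_W <- function(theta, d) {
  W <- diag(d)
  idx <- 1
  for (i in 1:(d - 1)) for (j in (i + 1):d) {
    G <- diag(d)
    G[i, i] <-  cos(theta[idx]); G[j, j] <- cos(theta[idx])
    G[i, j] <- -sin(theta[idx]); G[j, i] <- sin(theta[idx])
    W <- G %*% W
    idx <- idx + 1
  }
  W
}
# crossprod(givens_W(runif(3), 3)) is (numerically) the 3 x 3 identity
```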

dCovICA Estimator

- Define s_t(θ) = W_θ z_t, and S(θ) = [s_t(θ)], an n × d matrix.
- Let k+ = {k + 1, …, d}.
- Define the dCovICA objective function as

    I_n(θ; Z) = I_n(S(θ)) = Σ_{k=1}^{d−1} I_n(S_k(θ), S_{k+}(θ))

- The dCovICA estimator is θ̂_n = argmin_θ I_n(θ; Z).
- Define Ŵ_n = W_{θ̂_n} and ŝ_t = s_t(θ̂_n) = Ŵ_n z_t.
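Putting the earlier sketches together, the estimator can be prototyped with a generic optimizer over the rotation angles. The assembled function dcov_ica and the use of optim() with a derivative-free method are illustrative assumptions, not the authors' algorithm.

```r
# Illustrative end-to-end prototype of a dCovICA-style estimator, reusing the
# whiten(), givens_W(), and mutual_In() sketches above.
dcov_ica <- function(y) {
  w <- whiten(y)
  d <- ncol(w$z)
  p <- d * (d - 1) / 2
  obj <- function(theta) mutual_In(w$z %*% t(givens_W(theta, d)))
  opt <- optim(rep(0, p), obj)               # Nelder-Mead by default
  W <- givens_W(opt$par, d)
  list(theta = opt$par, W = W,
       M = solve(W %*% w$O),                 # estimated mixing matrix O^{-1} W^{-1}
       S = w$z %*% t(W),                     # estimated independent components
       value = opt$value)                    # objective value I_n at the minimizer
}
```

The objective typically has several local minima over θ, so multiple starting values (or a coarse grid) would be sensible in practice.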

An Alternative Estimator

- The empirical measure depends on the marginal distributions.
- For continuous random variables, apply the probability integral transformation (PIT): with marginals F_{s_k} : R → [0, 1], define u_{k,t} = F_{s_k}(s_{k,t}) for each s_k.
- The F_{s_k} are unknown:
  - The empirical counterpart would use marginal ranks.
  - However, the resulting objective function is not continuous.
  - Estimators must depend on the locations {s_{k,t}}_{t=1}^n, not just their relative locations.
- Apply kernel smoothing to approximate F_{s_k} with a continuous function. Let

    F̂_{s_k, n, h_n}(s) = (1/n) Σ_{t=1}^n G( (s − s_{k,t}) / h_n ),

  where G is the integral of a density kernel and h_n is a random, data-dependent bandwidth.

PITdCovICA Estimator

Let s_t(θ) = W_θ z_t, û_{k,t}(θ) = F̂_{s_k(θ), n, h_n}(s_{k,t}(θ)), and Û(θ) = [û_t(θ)].

- Objective function:

    J_n(θ; Z) = Σ_{k=1}^{d−1} I_n(Û_k(θ), Û_{k+}(θ))

- Estimator: θ̃_n = argmin_θ J_n(θ).

Let u_k(θ) = F_{s_k(θ)}(s_k(θ)) and

    J(θ; Z) = Σ_{k=1}^{d−1} I(u_k(θ), u_{k+}(θ))    (1)

Assumption. The random bandwidth h_n is a measurable function of {y_{k,t}}_{t=1}^n such that h_n → 0 almost surely; further, the kernel function G is Lipschitz continuous.

Theorem. If there exists a minimizer θ_0 of Equation (1), then θ̃_n → θ_0 almost surely.

PITdCovICA Estimator (continued)

Let x_1, …, x_n be a sample from an (unknown) distribution F in the class of all continuous distribution functions. Let

    F_n(x) = (1/n) Σ_{i=1}^n 1_{(−∞, x]}(x_i)

denote the empirical cumulative distribution function (ECDF). Let x_{1:n} ≤ … ≤ x_{n:n} denote the order statistics from the sample x_1, …, x_n, and for some b < n let

    h_{n,b} = min{ x_{j:n} − x_{(j−b):n} : j = b + 1, …, n }    (2)

denote the minimum b-th order spacing among the order statistics. Define the kernel estimator

    F̂_{n,b}(x) = (1/n) Σ_{i=1}^n G( (x − x_i) / h_{n,b} ),

in which we assume G(x) = 0 for x ≤ −1/2, G(x) = 1 for x ≥ 1/2, G(0) = 1/2, and G(x) is continuous and nondecreasing on (−1/2, 1/2).
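A direct R sketch of this smoothed ECDF may help. It uses the kernel G(x) = Φ(logit(x + 1/2)) and b = ⌊√n⌋ mentioned on the following slide; the function name smooth_ecdf is illustrative.

```r
# Illustrative smoothed ECDF F_hat_{n,b} with G(x) = pnorm(qlogis(x + 1/2)) and
# the minimum b-th order spacing of the sample as the bandwidth.
smooth_ecdf <- function(x, b = floor(sqrt(length(x)))) {
  n <- length(x)
  xs <- sort(x)
  h <- min(xs[(b + 1):n] - xs[1:(n - b)])                # minimum b-th order spacing
  G <- function(u) pnorm(qlogis(pmin(pmax(u + 0.5, 0), 1)))
  function(s) sapply(s, function(si) mean(G((si - x) / h)))
}

# Example: transform a sample to approximately uniform margins
set.seed(3)
x <- rgamma(500, shape = 2)
u <- smooth_ecdf(x)(x)
```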

PITdCovICA Estimator (continued)

Then, for i = 1, …, n, we note that |F̂_{n,b}(x_{i:n}) − F_n(x_{i:n})| = b/(2n), and

    sup_{x ∈ R} |F̂_{n,b}(x) − F(x)| = o_P(n^{−1}).

This Lemma extends the result of Zielinski (2007), and its proof follows from similar arguments. Examples for G(x) include Φ(logit(x + 1/2)) and Φ(tan(xπ)), in which Φ(x) denotes the standard Gaussian CDF. We use the former in our simulations and applications. We found that b = ⌊√n⌋ worked well in our simulations, with little sensitivity to this choice.

If the assumptions of the previous Theorem and Lemma hold, if F_{y_k}(y) is twice continuously differentiable for all k, with derivatives f_{y_k}(y) and f′_{y_k}(y), respectively, if E|f′_{y_k}(y_k)|² < ∞ and E|f_{y_k}(y_k)|² < ∞ for all k, and if E[ ∂J_n(θ)/∂θ |_{θ=θ_0} ] = 0, then |θ̃_n − θ_0| = O_P(n^{−1/2}).


Statistical Inference

Although the minimizers θ̂_n and θ̃_n always exist, an important question for all ICA methods is: do the independent components exist or not?

To evaluate this issue statistically, we construct a test of the null hypothesis

    H_0 : Y = S Mᵀ,

in which M is non-singular and S_1, …, S_d are mutually independent.

- Recall that n I_n(S) converges in distribution under mutual independence.
- M is unknown in practice, hence S is not directly observed.
- The limiting distribution of n I_n(Ŝ) is different from that of n I_n(S).

Inference Based on Resampling

Define M̂_n = Û_n^{−1} Ŵ_{θ̂_n}^{−1} as the estimated mixing matrix:
- Û_n is the estimated uncorrelating matrix.
- θ̂_n is either the dCovICA or the PITdCovICA estimator.

The proposed resampling scheme consists of the following steps (a sketch follows below):
1. For each k = 1, …, d, jointly sample the entire sequence S*_k = (s*_{1,k}, …, s*_{n,k})ᵀ by randomly permuting the n elements of Ŝ_k.
2. Let Y* = S* M̂_nᵀ.
3. Replace the sample Y with Y*.
4. Given Y*, estimate M̂* via the same procedure used to estimate M̂_n.
5. Define Ŝ* = Y* (M̂*)^{−T}.
6. Calculate I*_n(Ŝ) = I_n(Ŝ*).
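Given the earlier pieces, one replicate of this scheme is a few lines of R. The wrapper name resample_stat, and reusing dcov_ica() as the re-estimation step in Step 4, are illustrative assumptions.

```r
# Illustrative single replicate of the resampling scheme: permute each estimated
# component, remix with the estimated mixing matrix, re-estimate, and recompute
# the mutual-independence statistic on the re-estimated components.
resample_stat <- function(S_hat, M_hat) {
  n <- nrow(S_hat)
  S_star <- apply(S_hat, 2, function(s) s[sample(n)])   # Step 1: permute each column
  Y_star <- S_star %*% t(M_hat)                         # Step 2: Y* = S* M^T
  fit <- dcov_ica(Y_star)                               # Steps 3-5: re-estimate on Y*
  mutual_In(fit$S)                                      # Step 6: I_n of the re-estimated ICs
}
```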

A Test for the Existence of Independent Components

The Y* are generated according to H_0:
- The components of Y* M̂_n^{−T} are genuine independent components.

Under H_0, given the sample Y, n I*_n(Ŝ) and n I_n(Ŝ) are equal in distribution.

Repeat the resampling scheme N times:
- Reject H_0 if n I_n(Ŝ) is greater than the (Nα)-th largest value of the n I*_n(Ŝ), in which α ∈ (0, 1) is the size of the test (a sketch follows below).
- This accounts for uncertainty in estimating the independent components given Z_n, and for uncertainty in estimating Z_n.

The procedure is independent of the estimation method and may be used with any ICA method.
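The decision rule translates into a short wrapper; the Monte Carlo p-value convention used here, and the function name ic_exist_test, are assumptions for illustration.

```r
# Illustrative test for the existence of ICs, combining dcov_ica() and
# resample_stat() from the sketches above.
ic_exist_test <- function(y, N = 199, alpha = 0.05) {
  n <- nrow(as.matrix(y))
  fit <- dcov_ica(y)
  stat <- n * fit$value                                  # observed n * I_n(S_hat)
  null_stats <- n * replicate(N, resample_stat(fit$S, fit$M))
  p <- mean(c(null_stats, stat) >= stat)                 # Monte Carlo p-value
  list(statistic = stat, p.value = p, reject = (p <= alpha))
}
```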

PCA vs. ICA: Filtered Unemployment Rate e_t

Test statistic and approximate p-value for the joint test of ICs:

    nI_n(·)           y_t      e_t      z_t      s_t
    Test statistic    39.70    5.27     0.41    −0.42
    Approx. p-value   0        0        0        0.91

PCA vs. ICA: Probability Integral Transformation

[Figure: scatterplots of the leading PCA components (Z1, Z2) and ICA components (S1, S2), on the original scale and after the probability integral transformation, i.e. (F(Z1), F(Z2)) and (F(S1), F(S2)).]

Interpretation of ICs

    CA : e_1 = −0.89 s_1 − 0.11 s_2 + 0.36 s_4 + 0.23 s_6
    FL : e_2 = −0.24 s_1 − 0.10 s_2 − 0.83 s_5 + 0.48 s_6
    IL : e_3 = −0.32 s_2 − 0.87 s_3 + 0.27 s_4 + 0.24 s_6
    MI : e_4 = −0.33 s_1 − 0.85 s_2 − 0.12 s_3 − 0.16 s_5 − 0.34 s_6
    OH : e_5 = −0.11 s_1 − 0.65 s_2 − 0.19 s_4 + 0.45 s_5 + 0.56 s_6
    WI : e_6 = −0.48 s_2 + 0.32 s_3 + 0.81 s_4

- s_2 is related to each state.
- s_2 has a positive relationship with seasonally adjusted GDP.
- This supports the hypothesis that −s_2 is a national component of the unemployment rate.

Simulation

- Compare dCovICA and PITdCovICA with SymR-est, AsyR-est, FastICA, and KDICA.
- 18 source distributions: Student-t, uniform, exponential, mixtures, ...
- n = 1,000; random mixing matrix M_0; pre-whitened via PCA; 1,000 replications.
- Error metric (Ilmonen et al. 2010):

    D(M_0, M̂) = (1/√(d − 1)) inf_{C ∈ C} || C M̂^{−1} M_0 − I_d ||_F,

  where ||·||_F denotes the Frobenius norm.
- The infimum is taken such that D is invariant to the three ambiguities associated with ICA, by defining

    C = { C ∈ M : C = P_± B for some P_± and B }

  - M: the set of d × d non-singular matrices
  - P_±: a signed permutation matrix
  - B: a diagonal matrix with positive elements
- Computed with the R package JADE (Nordhausen et al. 2011).

Simulation: d = 2

[Figure: for each of the 18 source distributions (a-r), the logarithm of the mean error from the true M (± standard error), comparing dCovICA, PITdCovICA, Sym R-est, Asy R-est, FastICA, and KDICA.]

Simulation: d = 4, 8, 16

Table: Mean error distance (×100), using the metric D(M_0, M̂) defined on the previous slide, approximate standard error, and mean computation time in seconds (s) for N = 1,000 simulations in R^4, R^8, and R^16 with sample size n = 1,000, randomly selecting with replacement from the 18 distributions.

                            Joint estimation                        Asymmetric
                    dCovICA   PITdCovICA   FastICA    KDICA         R-est ICA
    R^4   Mean error      8.075     8.036    19.381     9.951          17.664
          Std. error      0.150     0.132     0.512     0.362           0.415
          Mean time (s)   8.96     23.77      0.02      0.18            1.67
    R^8   Mean error      8.600     8.628    32.070    20.439           N/A
          Std. error      0.040     0.039     0.476     0.479           N/A
          Mean time (s)  26.77     64.80      0.07      5.54            N/A
    R^16  Mean error      8.884     8.878    48.396    39.930           N/A
          Std. error      0.019     0.019     0.364     0.473           N/A
          Mean time (s)  66.56    124.86      0.12    156.61            N/A

Interest Rate Yields

Daily Treasury Rates (%), 9/1998 - 8/2008

[Figure: time series plot of daily Treasury rates (%) versus year, 1998-2008.]

Yield Curve

[Figure: yield curve.]

Volatility Modeling

    Σ_t = M Cov{s_t | F_{t−1}} M′ = M diag{σ²_{it}} M′

[Figure: six time series panels over roughly 2,500 observations.]

ICA of fMRI, ADHD-200 Global, & Acknowledgments

[Figure: brain maps for the TYP and ADHD groups and their difference (DIFF).]

Benjamin B. Risk and David Ruppert, Cornell University

ADHD-200 Global Competition, winning team (Johns Hopkins University): Brian Caffo, Ciprian Crainiceanu, Ani Eloyan, Fang Han, Han Liu, John Muschelli, Mary Beth Nebel, Tuo Zhao

The Neuro Bureau: neurobureau.projects.nitrc.org/ADHD200

Conclusions

- A joint test for multivariate serial dependence.
- A measure of mutual multivariate independence.
- A statistical framework for independent component analysis.
- A statistical test for checking the existence of independent components.
- We combine a nonparametric probability integral transformation with a generalized nonparametric whitening method.
- Limiting properties of the proposed estimator are established under weak conditions.

Future Work

- Generalize to dependent data.
- Extend to high-dimensional data.
- Derive asymptotic critical values for general test statistics.
- Explore new applications.

Bibliography

Bach, F., and Jordan, M. (2003), "Kernel Independent Component Analysis," The Journal of Machine Learning Research, 3, 1-48.
Chen, A. (2006), "Fast Kernel Density Independent Component Analysis," in Proceedings of the 6th International Conference on Independent Component Analysis and Blind Signal Separation, Springer-Verlag, pp. 24-31.
Chen, A., and Bickel, P. (2005), "Consistent Independent Component Analysis and Prewhitening," IEEE Transactions on Signal Processing, 53(10), 3625-3632.
Eriksson, J., and Koivunen, V. (2003), "Characteristic-Function-Based Independent Component Analysis," Signal Processing, 83, 2195-2208.
Hallin, M., and Mehta, C. (2013), "R-Estimation for Asymmetric Independent Component Analysis," arXiv preprint arXiv:1304.3073.
Hastie, T., and Tibshirani, R. (2003), "Independent Components Analysis Through Product Density Estimation," Advances in Neural Information Processing Systems, 15, 665-672.
Hastie, T., and Tibshirani, R. (2010), ProDenICA: Product Density Estimation for ICA using Tilted Gaussian Density Estimates, R package version 1.0.
Hyvarinen, A., and Oja, E. (1997), "A Fast Fixed-Point Algorithm for Independent Component Analysis," Neural Computation, 9(7), 1483-1492.
Ilmonen, P., Nordhausen, K., Oja, H., and Ollila, E. (2010), "A New Performance Index for ICA: Properties, Computation and Asymptotic Analysis," Latent Variable Analysis and Signal Separation, pp. 229-236.
Ilmonen, P., and Paindaveine, D. (2011), "Semiparametrically Efficient Inference Based on Signed Ranks in Symmetric Independent Component Models," Annals of Statistics, 39(5), 2448-2476.
Matteson, D. S., and Tsay, R. S. (2011), "Dynamic Orthogonal Components for Multivariate Time Series," Journal of the American Statistical Association, 106(496), 1450-1463.
Matteson, D. S., and Tsay, R. S. (2012), "Independent Component Analysis via U-Statistics," under review.
Nordhausen, K., Cardoso, J.-F., Oja, H., and Ollila, E. (2011), JADE: JADE and ICA Performance Criteria, R package version 1.0-4.
Nordhausen, K., Oja, H., and Paindaveine, D. (2009), "Signed-Rank Tests for Location in the Symmetric Independent Component Model," Journal of Multivariate Analysis, 100(5), 821-834.
R Development Core Team (2010), R: A Language and Environment for Statistical Computing, R Foundation for Statistical Computing, Vienna, Austria.
Szekely, G. J., and Rizzo, M. L. (2009), "Brownian Distance Covariance," Annals of Applied Statistics, 3(4), 1236-1265.
Szekely, G. J., Rizzo, M. L., and Bakirov, N. K. (2007), "Measuring and Testing Dependence by Correlation of Distances," Annals of Statistics, 35(6), 2769-2794.
Zielinski, R. (2007), "Kernel Estimators and the Dvoretzky-Kiefer-Wolfowitz Inequality," Applicationes Mathematicae, 34(4), 401.