sequential change-point detection based on nearest neighbors

Sequential Change-Point Detection Based on Nearest Neighbors Hao Chen Department of Statistics University of California, Davis February, 2018 *This work is partially supported by NSF-DMS 1513653.


Page 1: Sequential Change-Point Detection Based on Nearest Neighbors

Sequential Change-Point Detection Based on Nearest Neighbors

Hao Chen

Department of Statistics, University of California, Davis

February, 2018

*This work is partially supported by NSF-DMS 1513653.

Page 2: Sequential Change-Point Detection Based on Nearest Neighbors

Control chart

What about monitoring multiple streams?

What about monitoring non-Euclidean data?

Page 4: Sequential Change-Point Detection Based on Nearest Neighbors

Modern data examples

fMRI:

Social networks:

. . .

Page 5: Sequential Change-Point Detection Based on Nearest Neighbors

Outline

1 Graph-based two-sample test

2 Offline change-point detection

3 Sequential (Online) change-point detection

4 An application

Page 7: Sequential Change-Point Detection Based on Nearest Neighbors

Graph-based two-sample test

Assume we already have a similarity measure on the sample space.

Two samples from the same distribution:

# of NNs from the other sample: 27

Page 11: Sequential Change-Point Detection Based on Nearest Neighbors

Two-sample test based on nearest neighbors

Two samples from different distributions:

# of NNs from the other sample: 8

Page 14: Sequential Change-Point Detection Based on Nearest Neighbors

Two-sample test based on k-nearest neighbors

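Let $y_1, \dots, y_n$ be the pooled observations of the two samples.

$$g_i = \begin{cases} 1 & \text{if } y_i \text{ belongs to sample 1}, \\ 0 & \text{if } y_i \text{ belongs to sample 2}. \end{cases}$$

$$a^{(r)}_{ij} = \begin{cases} 1 & \text{if } y_j \text{ is the } r\text{th nearest neighbor of } y_i, \\ 0 & \text{otherwise}. \end{cases}$$

$$a^{+}_{ij} = \sum_{r=1}^{k} a^{(r)}_{ij}.$$

Number of nearest neighbors from the other sample:

$$X = \frac{1}{2} \sum_{i=1}^{n} \sum_{j=1}^{n} \left(a^{+}_{ij} + a^{+}_{ji}\right) I(g_i \neq g_j)$$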

[Schilling, 1986; Henze, 1988]
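
As a minimal sketch of the count X defined on this slide, the Python code below builds the k-NN indicator matrix a+ from a precomputed distance matrix and counts the neighbor pairs that cross the two samples. The function names, the use of NumPy, and the stable tie-breaking rule for equal distances are illustrative assumptions and are not taken from the talk.

```python
import numpy as np


def knn_indicator(dist, k):
    """Return A+ with A+[i, j] = 1 if y_j is among the k nearest neighbors of y_i."""
    d = np.array(dist, dtype=float)          # copy so the caller's matrix is untouched
    np.fill_diagonal(d, np.inf)              # a point is never its own neighbor
    # Assumption: ties in distance are broken by index order (stable sort).
    nbrs = np.argsort(d, axis=1, kind="stable")[:, :k]
    a_plus = np.zeros(d.shape, dtype=int)
    a_plus[np.arange(d.shape[0])[:, None], nbrs] = 1
    return a_plus


def cross_sample_count(a_plus, g):
    """X = (1/2) * sum_{i,j} (a+_ij + a+_ji) * I(g_i != g_j)."""
    g = np.asarray(g)
    differ = g[:, None] != g[None, :]
    return 0.5 * np.sum((a_plus + a_plus.T) * differ)


if __name__ == "__main__":
    # Two well-separated groups on a line: no nearest neighbor crosses samples.
    y = np.array([0.0, 1.0, 3.0, 100.0, 101.0, 103.0])
    labels = [1, 1, 1, 0, 0, 0]
    dist = np.abs(y[:, None] - y[None, :])
    print(cross_sample_count(knn_indicator(dist, 1), labels))
```

Because only a distance matrix is needed, the same functions apply to any sample space on which a similarity measure is available.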

Page 15: Sequential Change-Point Detection Based on Nearest Neighbors

Outline

1 Graph-based two-sample test

2 Offline change-point detection

3 Sequential (Online) change-point detection

4 An application

Page 16: Sequential Change-Point Detection Based on Nearest Neighbors

Offline change-point detection based on NNs

Observation sequence:

Page 40: Sequential Change-Point Detection Based on Nearest Neighbors

Offline Change-point detection based on NNs

# of NN from the other sample:

Page 41: Sequential Change-Point Detection Based on Nearest Neighbors

Standardize the count

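$$X(t) = \frac{1}{2} \sum_{i=1}^{n} \sum_{j=1}^{n} \left(a^{+}_{ij} + a^{+}_{ji}\right) I\big(g_i(t) \neq g_j(t)\big), \qquad g_i(t) = I(i > t).$$

Expectation and variance under the permutation null distribution:

$$E(X(t)) = \frac{2k\,t(n-t)}{n-1},$$

$$\mathrm{Var}(X(t)) = \frac{t(n-t)}{n-1} \left( h(t, n-t) \left( q_{1,n} + k - \frac{2k^2}{n-1} \right) + \big(1 - h(t, n-t)\big) \left( q_{2,n} + k - k^2 \right) \right),$$

where

$$h(t, n-t) = \frac{4(t-1)(n-t-1)}{(n-2)(n-3)}, \qquad q_{1,n} = \frac{1}{n} \sum_{i,j} a^{+}_{ij} a^{+}_{ji}, \qquad q_{2,n} = \frac{1}{n} \sum_{i \neq j;\, l} a^{+}_{il} a^{+}_{jl}.$$

Standardized count:

$$Z(t) = -\frac{X(t) - E(X(t))}{\sqrt{\mathrm{Var}(X(t))}}$$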
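
The standardization on this slide can be sketched in Python as follows. The code computes X(t), its permutation-null mean and variance, and Z(t) from a k-NN indicator matrix, and it includes a brute-force enumeration over all ways of choosing the first t observations so that the closed-form moments can be checked on a small sample. The variance uses h(t, n−t) = 4(t−1)(n−t−1)/((n−2)(n−3)), the form under which the closed form agrees with the enumeration. All function names are hypothetical, and the stable tie-breaking rule is an assumption.

```python
import itertools

import numpy as np


def knn_indicator(dist, k):
    """A+[i, j] = 1 if y_j is one of the k nearest neighbors of y_i."""
    d = np.array(dist, dtype=float)
    np.fill_diagonal(d, np.inf)                      # exclude self-matches
    nbrs = np.argsort(d, axis=1, kind="stable")[:, :k]
    a_plus = np.zeros(d.shape, dtype=int)
    a_plus[np.arange(d.shape[0])[:, None], nbrs] = 1
    return a_plus


def count_x(a_plus, in_second):
    """X for a boolean labeling (True = observation lies after the split)."""
    g = np.asarray(in_second, dtype=bool)
    differ = g[:, None] != g[None, :]
    return 0.5 * np.sum((a_plus + a_plus.T) * differ)


def null_moments(a_plus, k, t):
    """Closed-form permutation-null mean and variance of X(t)."""
    n = a_plus.shape[0]
    q1 = np.sum(a_plus * a_plus.T) / n               # (1/n) sum_{i,j} a+_ij a+_ji
    indeg = a_plus.sum(axis=0)                       # how often each point is a neighbor
    q2 = np.sum(indeg * (indeg - 1)) / n             # (1/n) sum_{i != j; l} a+_il a+_jl
    h = 4.0 * (t - 1) * (n - t - 1) / ((n - 2) * (n - 3))
    mean = 2.0 * k * t * (n - t) / (n - 1)
    var = t * (n - t) / (n - 1) * (
        h * (q1 + k - 2.0 * k**2 / (n - 1)) + (1 - h) * (q2 + k - k**2)
    )
    return mean, var


def z_stat(a_plus, k, t):
    """Z(t) = -(X(t) - E X(t)) / sqrt(Var X(t)), with g_i(t) = I(i > t)."""
    n = a_plus.shape[0]
    x = count_x(a_plus, np.arange(1, n + 1) > t)
    mean, var = null_moments(a_plus, k, t)
    return -(x - mean) / np.sqrt(var)


def brute_force_moments(a_plus, t):
    """Mean and variance of X over every choice of t 'first-sample' points."""
    n = a_plus.shape[0]
    values = []
    for first in itertools.combinations(range(n), t):
        g = np.ones(n, dtype=bool)
        g[list(first)] = False
        values.append(count_x(a_plus, g))
    return np.mean(values), np.var(values)
```

The enumeration is only feasible for very small n; it is included as a sanity check on the formulas, not as part of the method.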

Page 42: Sequential Change-Point Detection Based on Nearest Neighbors

Standardized count

Page 43: Sequential Change-Point Detection Based on Nearest Neighbors

In contrast: no change-point

Test statistic: $\max_{n_0 \le t \le n - n_0} Z(t)$

Page 45: Sequential Change-Point Detection Based on Nearest Neighbors

Outline

1 Graph-based two-sample test

2 Offline change-point detection

3 Sequential (Online) change-point detection

4 An application

Page 46: Sequential Change-Point Detection Based on Nearest Neighbors

Online change-point detection based on NNs

$N_0$ historical observations: $y_1, \dots, y_{N_0}$

Subsequent observations: $y_{N_0+1}, y_{N_0+2}, \dots, y_n, \dots$

$Z_n(t)$: standardized count for the sequence $y_1, \dots, y_n$.

$$\max_{n_0 \le t \le n - n_0} Z_n(t)$$

Page 48: Sequential Change-Point Detection Based on Nearest Neighbors

Stopping Time

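$$T_1 = \inf\left\{ n - N_0 : \max_{n_0 \le t \le n - n_0} Z_n(t) > b_1 \right\}$$

$$T_2 = \inf\left\{ n - N_0 : \max_{n - n_1 \le t \le n - n_0} Z_n(t) > b_2 \right\}$$

$$T_3 = \inf\left\{ n - N_0 : \max_{n - n_1 \le t \le n - n_0} Z_{nL}(t) > b_3 \right\},$$

$Z_{nL}(t)$: standardized count for observations $y_{n-L+1}, \dots, y_n$.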
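
A minimal sketch of the third stopping rule, assuming a precomputed distance matrix over the whole stream: at each new time n the code takes the most recent L observations, evaluates the standardized count at every split inside the window that corresponds to n − n1 ≤ t ≤ n − n0, and stops the first time the maximum exceeds b. The function names, the argument names (`n_hist` for N0, `window` for L) and the tie-breaking rule are hypothetical. The variance uses h(t, L−t) = 4(t−1)(L−t−1)/((L−2)(L−3)), and splits with non-positive variance are skipped.

```python
import numpy as np


def knn_indicator(dist, k):
    """A+[i, j] = 1 if y_j is one of the k nearest neighbors of y_i."""
    d = np.array(dist, dtype=float)
    np.fill_diagonal(d, np.inf)
    nbrs = np.argsort(d, axis=1, kind="stable")[:, :k]
    a_plus = np.zeros(d.shape, dtype=int)
    a_plus[np.arange(d.shape[0])[:, None], nbrs] = 1
    return a_plus


def max_z(dist, k, splits):
    """Largest standardized count Z(t) over the given splits of one window."""
    n = dist.shape[0]
    a = knn_indicator(dist, k)
    sym = a + a.T
    q1 = np.sum(a * a.T) / n
    indeg = a.sum(axis=0)
    q2 = np.sum(indeg * (indeg - 1)) / n
    pos = np.arange(1, n + 1)
    best = -np.inf
    for t in splits:
        g = pos > t                                   # g_i(t) = I(i > t)
        x = 0.5 * np.sum(sym * (g[:, None] != g[None, :]))
        h = 4.0 * (t - 1) * (n - t - 1) / ((n - 2) * (n - 3))
        mean = 2.0 * k * t * (n - t) / (n - 1)
        var = t * (n - t) / (n - 1) * (
            h * (q1 + k - 2.0 * k**2 / (n - 1)) + (1 - h) * (q2 + k - k**2)
        )
        if var > 0:
            best = max(best, -(x - mean) / np.sqrt(var))
    return best


def stopping_time_t3(dist, n_hist, window, k, n0, n1, b):
    """Return n - N0 at the first alarm of the windowed rule, or None."""
    if window > n_hist + 1:
        raise ValueError("the first monitoring window must already be full")
    # Absolute split t in [n - n1, n - n0] is position t - (n - L) in the window.
    splits = range(window - n1, window - n0 + 1)
    for n in range(n_hist + 1, dist.shape[0] + 1):
        lo = n - window
        if max_z(dist[lo:n, lo:n], k, splits) > b:
            return n - n_hist
    return None
```

Recomputing the neighbor graph from scratch at every step keeps the sketch short; an implementation meant for long streams would update the graph incrementally.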

Page 51: Sequential Change-Point Detection Based on Nearest Neighbors

Detection Delay

Average run length: $E_\infty(T)$.

Expected detection delay: $E_r(N - r \mid N > r)$.

Threshold $b$ selected subject to $P_\infty(T < 1{,}000) = 0.05$.

$r - N_0 = 200$.

Page 53: Sequential Change-Point Detection Based on Nearest Neighbors

Early stops (False discovery)

Threshold $b$ selected subject to $P_\infty(T < 1{,}000) = 0.05$.

False discovery rate at 200 new observations after the start of the test:

        T1       T2       T3
1-NN    0.0178   0.0205   0.0107
3-NN    0.0148   0.0183   0.0103

Page 54: Sequential Change-Point Detection Based on Nearest Neighbors

Average run length

$$T = \inf\left\{ n : \max_{n - n_1 \le t \le n - n_0} Z_{nL}(t) > b \right\},$$

$E_\infty(T_b) = 10{,}000 \Rightarrow b = {?}$

Page 56: Sequential Change-Point Detection Based on Nearest Neighbors

Average run length

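$$T = \inf\left\{ n : \max_{n - n_1 \le t \le n - n_0} Z_{nL}(t) > b \right\},$$

Theorem

Suppose $L, b, n_0, n_1 \to \infty$ in such a way that $b = c\sqrt{L}$, $n_0 = u_0 L$ and $n_1 = u_1 L$ for some fixed $0 < c < \infty$, $0 < u_0 < u_1 < 1$. When there is no change point, $T$ is asymptotically exponentially distributed with expectation

$$E_\infty(T_b) \sim \frac{\sqrt{2\pi}\, \exp(b^2/2)}{c^2 b \int_{u_0}^{u_1} h_1(u)\, h_2(u)\, \nu\!\left(c\sqrt{2 h_1(u)}\right) \nu\!\left(c\sqrt{2 h_2(u)}\right) du},$$

where

$$h_1(u) = \left[ 16u(1-u)(k + p_{k,\infty}) + 2(1-2u)^2 (q_{k,\infty} - k^2 + k) \right] / \sigma^2(u),$$

$$h_2(u) = \left[ 16u^2(1-u)^2 \left(p_{k,\infty} + q_{k,\infty} + k^2 + 2p^{(k)}_{k,\infty} - 2q^{(k)}_{k,\infty}\right) + 4u(1-u)\left(2q^{(k)}_{k,\infty} - 3q_{k,\infty} + k^2 + k\right) + 2(q_{k,\infty} - k^2 + k) \right] / \sigma^2(u),$$

$$\sigma(u) = 4u(1-u)\left( 4u(1-u)(k + p_{k,\infty}) + (1-2u)^2 (q_{k,\infty} - k^2 + k) \right).$$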

Page 57: Sequential Change-Point Detection Based on Nearest Neighbors

Mutual NN and Shared NN

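Mutual NN:

$$p_{k,\infty} = \lim_{n\to\infty} E\left( \frac{1}{n} \sum_{j} a^{+}_{n,ij}\, a^{+}_{n,ji} \right), \qquad p^{(k)}_{k,\infty} = \lim_{n\to\infty} E\left( \frac{1}{n} \sum_{j} a^{+}_{n,ij}\, a^{(k)}_{n,ji} \right)$$

Shared NN:

$$q_{k,\infty} = \lim_{n\to\infty} E\left( \frac{1}{n} \sum_{j \neq l} a^{+}_{n,ji}\, a^{+}_{n,li} \right), \qquad q^{(k)}_{k,\infty} = \lim_{n\to\infty} E\left( \frac{1}{n} \sum_{j \neq l} a^{+}_{n,ji}\, a^{(k)}_{n,li} \right)$$

For multivariate data and under Euclidean distance, $p_{k,\infty}$, $q_{k,\infty}$, $p^{(k)}_{k,\infty}$, $q^{(k)}_{k,\infty}$ can be expressed as analytic functions of the dimension of the data.

In practice, it is better to use $p_{k,L}$, $q_{k,L}$, $p^{(k)}_{k,L}$, $q^{(k)}_{k,L}$ estimated from the data.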
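
One way the data-based estimates might be computed is sketched below in Python. The sketch assumes that the finite-sample quantities average over all observations in the window, in the same way as q1,n and q2,n on the standardization slide: p counts mutual neighbor pairs, q counts pairs of distinct points that share a neighbor, and the (k) versions restrict one member of each pair to its k-th nearest neighbor. The function names, the dictionary keys and the tie-breaking rule are hypothetical.

```python
import numpy as np


def neighbor_matrices(dist, k):
    """Return (A+, A(k)): all k nearest neighbors, and the k-th neighbor only."""
    d = np.array(dist, dtype=float)
    np.fill_diagonal(d, np.inf)
    n = d.shape[0]
    # order[i, r - 1] is the index of the r-th nearest neighbor of y_i.
    order = np.argsort(d, axis=1, kind="stable")[:, :k]
    rows = np.arange(n)
    a_plus = np.zeros((n, n), dtype=int)
    a_plus[rows[:, None], order] = 1
    a_k = np.zeros((n, n), dtype=int)
    a_k[rows, order[:, k - 1]] = 1
    return a_plus, a_k


def mutual_shared_estimates(dist, k):
    """Sample versions of the mutual-NN and shared-NN quantities."""
    a_plus, a_k = neighbor_matrices(dist, k)
    n = a_plus.shape[0]
    indeg = a_plus.sum(axis=0)       # indeg[i] = sum_j a+_ji
    indeg_k = a_k.sum(axis=0)        # indeg_k[i] = sum_l a(k)_li
    p = np.sum(a_plus * a_plus.T) / n
    p_k = np.sum(a_plus * a_k.T) / n
    q = np.sum(indeg * (indeg - 1)) / n
    # sum over j != l equals the full product minus the j == l terms.
    q_k = (np.sum(indeg * indeg_k) - np.sum(a_plus * a_k)) / n
    return {"p": p, "p_k": p_k, "q": q, "q_k": q_k}
```

With k = 1 the k-th neighbor is the only neighbor, so the (k) versions coincide with p and q, which gives a quick consistency check.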

Page 59: Sequential Change-Point Detection Based on Nearest Neighbors

How does the asymptotic result work for finite L?

L = 200.

Check the threshold $b$ such that $E_\infty(T) = 10{,}000$.

Multivariate Gaussian data.

                     n0 = 3                     n0 = 10
                     Monte Carlo¹   Asymp.      Monte Carlo   Asymp.
d = 10     k = 1     4.04           4.40        4.04          4.31
           k = 3     4.14           4.34        4.14          4.23
d = 100    k = 1     3.76           4.37        3.76          4.26
           k = 3     3.78           4.33        3.78          4.20

¹ 10,000 simulation runs.

Page 60: Sequential Change-Point Detection Based on Nearest Neighbors

Skewness correction

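$$E_\infty(T_3) \sim \frac{\sqrt{2\pi}\, \exp(b^2/2)}{c^2 b \int_{u_0}^{u_1} S(u)\, h_1(u)\, h_2(u)\, \nu\!\left(c\sqrt{2 h_1(u)}\right) \nu\!\left(c\sqrt{2 h_2(u)}\right) du}$$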

S(u) depends on the probabilities of the following events:

Page 62: Sequential Change-Point Detection Based on Nearest Neighbors

Skewness Correction

Check the threshold $b$ such that $E_\infty(T) = 10{,}000$.

                     n0 = 3                                  n0 = 10
                     Monte Carlo   Skewness     Asymp.       Monte Carlo   Skewness     Asymp.
                                   Corrected                               Corrected
d = 10     k = 1     4.04          4.07         4.40         4.04          4.07         4.31
           k = 3     4.14          4.14         4.34         4.14          4.14         4.23
d = 100    k = 1     3.76          3.79         4.37         3.76          3.79         4.26
           k = 3     3.78          3.79         4.33         3.78          3.79         4.20

Page 63: Sequential Change-Point Detection Based on Nearest Neighbors

Power assessment

Fraction of trials (out of 1,000) in which the method successfully detects the change-point.

“Successful detection”: the change-point is detected within 100 observations after it occurs.

                  Normal data              Lognormal data
                  d = 10      d = 100      d = 10      d = 100
                  ∆ = 0.7     ∆ = 1.8      ∆ = 1.5     ∆ = 2
1-NN              0.02        0.21         0.48        0.08
3-NN              0.07        0.55         0.87        0.48
5-NN              0.15        0.81         0.95        0.77
Hotelling’s T²    0.69        0.63         0.34        0.02

∆: change in the mean parameter.

Page 64: Sequential Change-Point Detection Based on Nearest Neighbors

Outline

1 Graph-based two-sample test

2 Offline change-point detection

3 Sequential (Online) change-point detection

4 An application

Page 65: Sequential Change-Point Detection Based on Nearest Neighbors

Is there a change in phone call pattern?

Mobile phone data collected by the MIT Media Lab

87 students and faculty

7/20/2004 – 6/14/2005

$M_t$: adjacency matrix for day $t$; element $[i, j]$ is 1 if subject $i$ called subject $j$ on day $t$.

We consider two distances:

The number of different entries: $\|M_{t_1} - M_{t_2}\|_F^2$.

The number of different entries, normalized by the geometric mean of the total edges in each day: $\dfrac{\|M_{t_1} - M_{t_2}\|_F^2}{\|M_{t_1}\|_F \, \|M_{t_2}\|_F}$.
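
The two distances can be sketched in Python for 0/1 adjacency matrices as follows. For binary matrices the squared Frobenius norm of the difference equals the number of differing entries, and the squared Frobenius norm of a single matrix equals its number of edges, so the product of the two norms is the geometric mean of the edge counts. The function names are hypothetical, and the handling of days with no calls (zero denominator) is an assumption that the slide does not address.

```python
import numpy as np


def distance_count(m1, m2):
    """||M1 - M2||_F^2: number of differing entries for 0/1 matrices."""
    diff = np.asarray(m1, dtype=float) - np.asarray(m2, dtype=float)
    return float(np.sum(diff**2))


def distance_normalized(m1, m2):
    """||M1 - M2||_F^2 / (||M1||_F * ||M2||_F)."""
    m1 = np.asarray(m1, dtype=float)
    m2 = np.asarray(m2, dtype=float)
    denom = np.linalg.norm(m1) * np.linalg.norm(m2)   # Frobenius norms
    num = distance_count(m1, m2)
    if denom == 0.0:
        # Assumption: two empty days are at distance 0; an empty day is
        # infinitely far from a non-empty one.
        return 0.0 if num == 0.0 else float("inf")
    return num / denom


def pairwise_distances(mats, dist_fn):
    """Symmetric matrix of distances between all pairs of daily networks."""
    n = len(mats)
    out = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            out[i, j] = out[j, i] = dist_fn(mats[i], mats[j])
    return out
```

The resulting pairwise matrix is all that a nearest-neighbor method needs as input, which is why no vector representation of the networks is required.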

Page 68: Sequential Change-Point Detection Based on Nearest Neighbors

Phone-call network

Page 69: Sequential Change-Point Detection Based on Nearest Neighbors

Stopping times and nearby academic events

Distance 1             Distance 2             Nearby academic event*
n = 66:  2004/9/23     n = 60:  2004/9/17     9/9: First day of class for Fall term
n = 166: 2005/1/1      n = 140: 2004/12/6     12/18: Last day of class for Fall term
n = 198: 2005/2/2      n = 194: 2005/1/29     2/2: First day of class for Spring term
none                   n = 252: 2005/3/28     3/21: Spring vacation

* The dates of the academic events are from the 2015-2016 academic calendar of MIT, as the 2004-2005 academic calendar of MIT cannot be found online.

Page 70: Sequential Change-Point Detection Based on Nearest Neighbors

Summary

Sequential change-point detection based on nearest neighbors can be applied to multivariate data and non-Euclidean data as long as a similarity measure on the sample space can be well defined.

The stopping time based on the recent observations is recommended. Its asymptotic distribution is derived and shown to be quite accurate in finite-sample scenarios after skewness correction. This makes the method an easy off-the-shelf approach to real problems.

Thank You!
