Download - Exact Recovery of Low-Rank Plus Compressed Sparse Matrices

1

Exact Recovery of Low-Rank Plus Compressed Sparse Matrices

Morteza Mardani, Gonzalo Mateos and Georgios Giannakis

ECE Department, University of Minnesota

Acknowledgments: MURI (AFOSR FA9550-10-1-0567) grant

Ann Arbor, USAAugust 6, 2012

Context

2

Network cartography

0 100 200 300 400 5000

1

2

3

4x 10

7

Time index (t)

|x f,t|

Image processing

Among competing hypotheses, pick one which makes the fewest assumptions and thereby offers the simplest explanation of the effect.

Occam’s razor

33

Objective

Goal: Given data matrix and compression matrix , identify sparse

L

T F

and low rank .

4

Challenges and importance

Important special cases

R = I : matrix decomposition [Candes et al’11], [Chandrasekaran et al’11] X = 0 : compressive sampling [Candes et al’05] A = 0: (with noise) PCA [Pearson’1901]

Seriously underdetermined LT + FT >> LT

X A Y

Both rank( ) and support of generally unknown

(F ≥ L)

55

Unveiling traffic anomalies

Backbone of IP networks

Traffic anomalies: changes in origin-destination (OD) flows

Anomalies congestion limits end-user QoS provisioning

Measuring superimposed OD flows per link, identify anomalies

Failures, transient congestions, DoS attacks, intrusions, flooding

66

Model Graph G (N, L) with N nodes, L links, and F flows (F=O(N2) >> L)

(as) Single-path per OD flow zf,t

є 0,1

Anomaly

LxT LxF

Packet counts per link l and time slot t

Matrix model across T time slots

0 0.2 0.4 0.6 0.8 10

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

0.9

1

f1

f2

l

7

Low rank and sparsity

Z: traffic matrix is low-rank [Lakhina et al‘04] X: low-rank

A: anomaly matrix is sparse across both time and flows

0 100 200 300 400 5000

1

2

3

4x 10

7

Time index (t)

|x f,t|

0 200 400 600 800 10000

2

4x 10

8

Time index(t)

|af,t

|

0 50 1000

2

4x 10

8

Flow index(f)

|af,t

|

8

Criterion

(P1)

Low-rank sparse vector of SVs nuclear norm || ||* and l1 norm

Q: Can one recover sparse and low-rank exactly?

A: Yes! Under certain conditions on

9

Identifiability

Sparsity-preserving matrices RH

For and r = rank(X0), low-rank-preserving matrices RH

Y = X0+ RA0 = X0 + RH + R(A0 - H)X1 A1

,

Problematic cases but low-rank and sparse

10

Incoherence measures

Incoherence between X0 and R

ΩR

θ=cos-1(μ)

Ф Incoherence among columns of R ,

Exact recovery requires ,

Local identifiability requires

,

11

Main result

Theorem: Given and , if every row and column of has at most k non-zero entries and , then

imply Ǝ for which (P1) exactly recovers

M. Mardani, G. Mateos, and G. B. Giannakis,``Recovery of low-rank plus compressed sparse matrices with application to unveiling traffic anomalies," IEEE Trans. Info. Theory, submitted Apr. 2012.(arXiv: 1204.6537)

12

Intuition Exact recovery if

r and s are sufficiently small

Nonzero entries of A0 are “sufficiently spread out”

Columns (rows) of X0 not aligned with basis of R (canonical basis)

R behaves like a “restricted” isometry

Interestingly Amplitude of non-zero entries of A0 irrelevant No randomness assumption

Satisfiability for certain random ensembles w.h.p

13

Validating exact recovery Setup L=105, F=210, T = 420 R=URSRV’

R~ Bernoulli(1/2) X = V’

R WZ’, W, Z ~ N(0, 104/FT)

aij ϵ -1,0,1 w. prob. ρ/2, 1-ρ, ρ/2

Relative recovery error

% non-zero entries ()

rank

(XR

) (r)

0.1 2.5 4.5 6.5 8.5 10.5 12.5

10

20

30

40

50

0

0.1

0.2

0.3

0.4

0.5

0.6

0.7

1414

Real data Abilene network data

Dec. 8-28, 2008 N=11, L=41, F=121, T=504

0100

200300

400500

0

50

100

0

1

2

3

4

5

6

Time

Pf = 0.03Pd = 0.92Qe = 27%

---- True---- Estimated

Data: http://internet2.edu/observatory/archive/data-collections.html

0 0.2 0.4 0.6 0.8 10

0.2

0.4

0.6

0.8

1

False alarm probability

Det

ectio

n pr

obab

ility

[Lakhina04], rank=1[Lakhina04], rank=2[Lakhina04], rank=3Proposed method[Zhang05], rank=1[Zhang05], rank=2[Zhang05], rank=3

Flow

Anom

aly

ampl

itude

1515

Synthetic data Random network topology

N=20, L=108, F=360, T=760 Minimum hop-count routing

0 0.2 0.4 0.6 0.8 10

0.2

0.4

0.6

0.8

1

False alarm probability

Det

ectio

n pr

obab

ility

PCA-based method, r=5PCA-based method, r=7PCA-based method, r=9Proposed method, per time and flow

0 0.2 0.4 0.6 0.8 1

0

0.2

0.4

0.6

0.8

1

Pf=10-4

Pd = 0.97

---- True---- Estimated

1616

Distributed estimator

Network: undirected, connected graph

Challenges Nuclear norm is not separable Global optimization variable A

?

?

??

?

??

?

n

Key result [Recht et al’11]

M. Mardani, G. Mateos, and G. B. Giannakis, "In-network sparsity-regularized rank minimization: Algorithms and applications," IEEE Trans. Signal Process., submitted Feb. 2012.(arXiv: 1203.1570)

Centralized (P2)

17

(P3)Consensus with

neighboring nodes

Consensus and optimality

Alternating-directions method of multipliers (ADMM) solver

Highly parallelizable with simple recursions

Claim: Upon convergence attains the global optimum of (P2)

Low overhead for message exchanges n

18

Online estimator

Claim: Convergence to stationary point set of the batch estimator

M. Mardani, G. Mateos, and G. B. Giannakis, "Dynamic anomalography: Tracking network anomalies via sparsity and low rank," IEEE Journal of Selected Topics in Signal Processing, submitted Jul. 2012.

(P4)

---- estimated---- real

o---- estimated ---- real

0

2

4CHIN--ATLA

0

20

40

Ano

mal

y am

plitu

de

WASH--STTL

0 1000 2000 3000 4000 5000 60000

10

20

30

Time index (t)

WASH--WASH

0

5ATLA--HSTN

0

10

20

Lin

k tr

affic

leve

l

DNVR--KSCY

0

10

20

Time index (t)

HSTN--ATLA

Streaming data:

Download - Exact Recovery of Low-Rank Plus Compressed Sparse Matrices

Top Related