exact recovery of low-rank plus compressed sparse matrices

18
1 Exact Recovery of Low-Rank Plus Compressed Sparse Matrices Morteza Mardani, Gonzalo Mateos and Georgios Giannakis ECE Department, University of Minnesota Acknowledgments: MURI (AFOSR FA9550-10-1-0567) grant Ann Arbor, USA August 6, 2012

Upload: briar

Post on 07-Jan-2016

31 views

Category:

Documents


0 download

DESCRIPTION

Exact Recovery of Low-Rank Plus Compressed Sparse Matrices. Morteza Mardani , Gonzalo Mateos and Georgios Giannakis ECE Department, University of Minnesota Acknowledgments : MURI (AFOSR FA9550-10-1-0567) grant. Ann Arbor, USA August 6, 2012. 1. Context. Occam’s razor . - PowerPoint PPT Presentation

TRANSCRIPT

Page 1: Exact  Recovery of Low-Rank Plus Compressed  Sparse Matrices

1

Exact Recovery of Low-Rank Plus Compressed Sparse Matrices

Morteza Mardani, Gonzalo Mateos and Georgios Giannakis

ECE Department, University of Minnesota

Acknowledgments: MURI (AFOSR FA9550-10-1-0567) grant

Ann Arbor, USAAugust 6, 2012

Page 2: Exact  Recovery of Low-Rank Plus Compressed  Sparse Matrices

Context

2

Network cartography

0 100 200 300 400 5000

1

2

3

4x 10

7

Time index (t)

|xf,

t|

Image processing

Among competing hypotheses, pick one which makes the fewest assumptions and thereby offers the simplest explanation of the effect.

Occam’s razor

Page 3: Exact  Recovery of Low-Rank Plus Compressed  Sparse Matrices

33

Objective

Goal: Given data matrix and compression matrix , identify sparse

L

T F

and low rank .

Page 4: Exact  Recovery of Low-Rank Plus Compressed  Sparse Matrices

4

Challenges and importance

Important special cases

R = I : matrix decomposition [Candes et al’11], [Chandrasekaran et al’11]

X = 0 : compressive sampling [Candes et al’05]

A = 0: (with noise) PCA [Pearson’1901]

Seriously underdetermined LT + FT >> LT

X A Y

Both rank( ) and support of generally unknown

(F ≥ L)

Page 5: Exact  Recovery of Low-Rank Plus Compressed  Sparse Matrices

55

Unveiling traffic anomalies

Backbone of IP networks

Traffic anomalies: changes in origin-destination

(OD) flows

Anomalies congestion limits end-user QoS provisioning

Measuring superimposed OD flows per link, identify anomalies

Failures, transient congestions, DoS attacks, intrusions, flooding

Page 6: Exact  Recovery of Low-Rank Plus Compressed  Sparse Matrices

66

Model Graph G (N, L) with N nodes, L links, and F flows (F=O(N2) >> L)

(as) Single-path per OD flow zf,t

є 0,1

Anomaly

LxT LxF

Packet counts per link l and time slot t

Matrix model across T time slots

0 0.2 0.4 0.6 0.8 10

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

0.9

1

f1

f2

l

Page 7: Exact  Recovery of Low-Rank Plus Compressed  Sparse Matrices

7

Low rank and sparsity

Z: traffic matrix is low-rank [Lakhina et al‘04] X: low-rank

A: anomaly matrix is sparse across both time and flows

0 100 200 300 400 5000

1

2

3

4x 10

7

Time index (t)

|xf,

t|

0 200 400 600 800 10000

2

4x 10

8

Time index(t)

|af,

t|

0 50 1000

2

4x 10

8

Flow index(f)

|af,

t|

Page 8: Exact  Recovery of Low-Rank Plus Compressed  Sparse Matrices

8

Criterion

(P1)

Low-rank sparse vector of SVs nuclear norm || ||* and l1 norm

Q: Can one recover sparse and low-rank exactly?

A: Yes! Under certain conditions on

Page 9: Exact  Recovery of Low-Rank Plus Compressed  Sparse Matrices

9

Identifiability

Sparsity-preserving matrices RH

For and r = rank(X0), low-rank-preserving matrices RH

Y = X0+ RA0 = X0 + RH + R(A0 - H)

X1 A1

,

Problematic cases but low-rank and sparse

Page 10: Exact  Recovery of Low-Rank Plus Compressed  Sparse Matrices

10

Incoherence measures

Incoherence between X0 and R

ΩR

θ=cos-1(μ)

Ф

Incoherence among columns of R ,

Exact recovery requires ,

Local identifiability requires

,

Page 11: Exact  Recovery of Low-Rank Plus Compressed  Sparse Matrices

11

Main result

Theorem: Given and , if every row and column of has at most k non-zero entries and , then

imply Ǝ for which (P1) exactly recovers

M. Mardani, G. Mateos, and G. B. Giannakis,``Recovery of low-rank plus compressed sparse matrices with application to unveiling traffic anomalies," IEEE Trans. Info. Theory, submitted Apr. 2012.(arXiv: 1204.6537)

Page 12: Exact  Recovery of Low-Rank Plus Compressed  Sparse Matrices

12

Intuition

Exact recovery if

r and s are sufficiently small

Nonzero entries of A0 are “sufficiently spread out”

Columns (rows) of X0 not aligned with basis of R (canonical basis)

R behaves like a “restricted” isometry

Interestingly Amplitude of non-zero entries of A0 irrelevant

No randomness assumption

Satisfiability for certain random ensembles w.h.p

Page 13: Exact  Recovery of Low-Rank Plus Compressed  Sparse Matrices

13

Validating exact recovery

Setup

L=105, F=210, T = 420

R=URSRV’R~ Bernoulli(1/2)

X = V’R WZ’, W, Z ~ N(0, 104/FT)

aij ϵ -1,0,1 w. prob. ρ/2, 1-ρ, ρ/2

Relative recovery error

% non-zero entries ()

rank

(XR

) (r

)

0.1 2.5 4.5 6.5 8.5 10.5 12.5

10

20

30

40

50

0

0.1

0.2

0.3

0.4

0.5

0.6

0.7

Page 14: Exact  Recovery of Low-Rank Plus Compressed  Sparse Matrices

1414

Real data Abilene network data

Dec. 8-28, 2008 N=11, L=41, F=121, T=504

0100

200300

400500

0

50

100

0

1

2

3

4

5

6

Time

Pf = 0.03Pd = 0.92Qe = 27%

---- True---- Estimated

Data: http://internet2.edu/observatory/archive/data-collections.html

0 0.2 0.4 0.6 0.8 10

0.2

0.4

0.6

0.8

1

False alarm probability

Det

ecti

on p

rob

abili

ty

[Lakhina04], rank=1[Lakhina04], rank=2[Lakhina04], rank=3Proposed method[Zhang05], rank=1[Zhang05], rank=2[Zhang05], rank=3

Flow

Anom

aly

am

plit

ude

Page 15: Exact  Recovery of Low-Rank Plus Compressed  Sparse Matrices

1515

Synthetic data Random network topology

N=20, L=108, F=360, T=760 Minimum hop-count routing

0 0.2 0.4 0.6 0.8 10

0.2

0.4

0.6

0.8

1

False alarm probability

Det

ecti

on p

roba

bili

ty

PCA-based method, r=5PCA-based method, r=7PCA-based method, r=9Proposed method, per time and flow

0 0.2 0.4 0.6 0.8 1

0

0.2

0.4

0.6

0.8

1

Pf=10-4

Pd = 0.97

---- True---- Estimated

Page 16: Exact  Recovery of Low-Rank Plus Compressed  Sparse Matrices

1616

Distributed estimator

Network: undirected, connected graph

Challenges Nuclear norm is not separable Global optimization variable A

?

?

??

?

?

?

?

n

Key result [Recht et al’11]

M. Mardani, G. Mateos, and G. B. Giannakis, "In-network sparsity-regularized rank minimization: Algorithms and applications," IEEE Trans. Signal Process., submitted Feb. 2012.(arXiv: 1203.1570)

Centralized (P2)

Page 17: Exact  Recovery of Low-Rank Plus Compressed  Sparse Matrices

17

(P3)Consensus with

neighboring nodes

Consensus and optimality

Alternating-directions method of multipliers (ADMM) solver

Highly parallelizable with simple recursions

Claim: Upon convergence attains the global optimum of (P2)

Low overhead for message exchanges n

Page 18: Exact  Recovery of Low-Rank Plus Compressed  Sparse Matrices

18

Online estimator

Claim: Convergence to stationary point set of the batch estimator

M. Mardani, G. Mateos, and G. B. Giannakis, "Dynamic anomalography: Tracking network anomalies via sparsity and low rank," IEEE Journal of Selected Topics in Signal Processing, submitted Jul. 2012.

(P4)

---- estimated---- real

o---- estimated ---- real

0

2

4CHIN--ATLA

0

20

40

An

om

aly

am

pli

tud

e

WASH--STTL

0 1000 2000 3000 4000 5000 60000

10

20

30

Time index (t)

WASH--WASH

0

5ATLA--HSTN

0

10

20

Lin

k t

ra

ffic

lev

el

DNVR--KSCY

0

10

20

Time index (t)

HSTN--ATLA

Streaming data: