measuring and exploiting data-model fit with sheaves · 5/24/2018  · michael robinson key points...

Post on 19-Jul-2020

1 Views

Category:

Documents

0 Downloads

Preview:

Click to see full reader

TRANSCRIPT

Michael Robinson

Measuring and Exploiting Data-Model Fit

with Sheaves

  Michael Robinson

Acknowledgements● Collaborators:

– Cliff Joslyn, Katy Nowak, Brenda Praggastis, Emilie Purvine (PNNL)– Chris Capraro, Grant Clarke, Janelle Henrich (SRC)– Jason Summers, Charlie Gaumond (ARiA)– J Smart, Dave Bridgeland (Georgetown)

● Students:– Eyerusalem Abebe– Olivia Chen– Philip Coyle– Brian DiZio

● Recent funding: DARPA, ONR, AFRL

– Jen Dumiak– Samara Fantie– Sean Fennell– Robby Green

– Noah Kim– Fangfei Lan– Metin Toksoz-Exley– Jackson Williams– Greg Young

  Michael Robinson

The middle way

Statistics

Topology

Geometry

  Michael Robinson

Key points● Systems can be encoded as sheaves● Datasets are assignments to a sheaf model of a system● Consistency radius measures compatibility between

system and dataset● Statistical inference minimizes consistency radius

– Regression (parameter estimation, learning, model selection) varies the sheaf

– Data fusion (prediction) varies the assignment● Topological properties of the minimization problem

may aid model selection and/or data cleaning

  Michael Robinson

Process flow diagramSystem model

Sheaf

Observations

Assignment

Consistency radiusPaired (Sheaf,Assignment)

Regression

Data fusion

“Syntax”

“Semantics”

  Michael Robinson

CategoriallySystem model

Shv

Observations

Pseud

Consistency radiusShvFPA

Regression

Data fusion

“Syntax”

“Semantics”

(Faithful)

(Faithful)

(Forgetful)

(Forgetful)

  Michael Robinson

Geometrically/topologicallySystem model

Shv

Observations

Pseud

ℝShvFPA

Regression

Data fusion

“Syntax”

“Semantics”

(Faithful)

(Faithful)

(Continuous)

(Continuous)

(Continuous)

(Note: all of this within an isomorphism class of sheaves)

Michael Robinson

What is a sheaf?

A sheaf of _____________ on a ______________(data type) (topological space)

Michael Robinson

What is a sheaf?

A sheaf of _____________ on a ______________(data type) (topological space)

pseudometric spaces partial order

  Michael Robinson

Topologizing a partial order

Open sets are unionsof up-sets

  Michael Robinson

Topologizing a partial order

Intersectionsof up-sets are alsoup-sets

  Michael Robinson

( )( )

A sheaf on a poset is...

A set assigned to each element, calleda stalk, and …

ℝ2 ℝ2

ℝ3

0 1 11 0 1

1 0 10 1 1

(1 0)(0 1)

ℝ ℝ2

ℝ2

(1 -1) ( )0 11 0

( )-3 3-4 4

This is a sheaf of vector spaces on a partial order

ℝ3

( )0 1 11 0 1

(-2 1)

(The stalk on an element in the posetis better thought of beingassociated to the up-set)

231

2 -23 -31 -1

  Michael Robinson

( )( )

A sheaf on a poset is...

… restriction functions between stalks, following the order relation…

ℝ2 ℝ2

ℝ3

0 1 11 0 1

1 0 10 1 1

(1 0)(0 1)

ℝ ℝ2

ℝ2

(1 -1)

231

( )0 11 0

( )-3 3-4 4

2 -23 -31 -1

This is a sheaf of vector spaces on a partial order

ℝ3

( )0 1 11 0 1

(-2 1)

(“Restriction” because it goes frombigger up-sets to smaller ones)

  Michael Robinson

( )( )

A sheaf on a poset is...ℝ

ℝ2 ℝ2

ℝ3

0 1 11 0 1

1 0 10 1 1

(1 0)(0 1)

ℝ ℝ2

ℝ2

(1 -1) ( )0 11 0

( )-3 3-4 4

This is a sheaf of vector spaces on a partial order

ℝ3

( )0 1 11 0 1

(-2 1)

(1 -1) =

( )0 1 11 0 1(1 0) = (0 1) ( )1 0 1

0 1 1

( )0 11 0( )-3 3

-4 4( )1 0 10 1 1

2 -23 -31 -1

=

… so that the diagramcommutes!

231

231

2 -23 -31 -1

2 -23 -31 -1

  Michael Robinson

( )( )

An assignment is...

… the selection of avalue on some open sets

0 1 11 0 1

1 0 10 1 1

(1 0)(0 1)

(1 -1)

231

( )0 11 0

( )-3 3-4 4

2 -23 -31 -1 ( )0 1 1

1 0 1

(-2 1)

(-1)

( )23

( )-4-3

( )-3-4

( )32

(-4)

(-4)

230

-2-3-1

The term serration is more common, but perhaps more opaque.

  Michael Robinson

( )( )

A global section is...

… an assignmentthat is consistent with the restrictions

0 1 11 0 1

1 0 10 1 1

(1 0)(0 1)

(1 -1)

231

( )0 11 0

( )-3 3-4 4

2 -23 -31 -1 ( )0 1 1

1 0 1

(-2 1)

(-1)

( )23

( )-4-3

( )-3-4

( )32

(-4)

(-4)

230

-2-3-1

  Michael Robinson

( )( )

Some assignments aren’t consistent

… but they mightbe partially consistent

0 1 11 0 1

1 0 10 1 1

(1 0)(0 1)

(1 -1)

231

( )0 11 0

( )-3 3-4 4

2 -23 -31 -1 ( )0 1 1

1 0 1

(-2 1)

(+1)

( )23

( )-4-3

( )-3-4

( )32

(-4)

(-4)

231

-2-3-1

  Michael Robinson

( )( )

Consistency radius is...… the maximum (or some other norm)distance between the value in a stalk and the values propagated along the restrictions

0 1 11 0 1

1 0 10 1 1

(1 0)(0 1)

(1 -1)

231

( )0 11 0

( )-3 3-4 4

2 -23 -31 -1 ( )0 1 1

1 0 1

(-2 1)

(+1)

( )23

( )-4-3

( )-3-4

( )32

(-4)

(-4)

231

-2-3-1

231( )0 1 1

1 0 1 ( )32- = 2

( )23(1 -1) - 1 = 2

(+1) - 231

-2-3 = 2 14-1 MAX ≥ 2 14

Note: lots more restrictions to check!

  Michael Robinson

( )( )

Consistency radius is monotonic

Lemma: Consistency radius on an open set U is computed by only considering open sets V1 ⊆ V2 ⊆ U

0 1 11 0 1

1 0 10 1 1

(1 0)(0 1)

(1 -1)

231

( )0 11 0

( )-3 3-4 4

2 -23 -31 -1 ( )0 1 1

1 0 1

(-2 1)

(+1)

( )23

( )-4-3

( )-3-4

( )32

(-4)

(-4)

231

-2-3-1

Note: lots more restrictions to check!

ConsistencyRadius = 0

  Michael Robinson

( )( )

Consistency radius is monotonicProposition: If U ⊆ V then c(U) ≤ c(V)

0 1 11 0 1

1 0 10 1 1

(1 0)(0 1)

(1 -1)

231

( )0 11 0

( )-3 3-4 4

2 -23 -31 -1 ( )0 1 1

1 0 1

(-2 1)

(+1)

( )23

( )-4-3

( )-3-4

( )32

(-4)

(-4)

231

-2-3-1

Note: lots more restrictions to check!

ConsistencyRadius = 2 14

Consistencyradius restricted toan open set U

  Michael Robinson

Consistency radius optimization

ℝ ℝ

ℝ(2)

(1)

(1)

(1)

(½)

This is a sheaf on a small poset

A B

C

  Michael Robinson

Consistency radius optimization

0 1

?

?(2)

(1)

(1)

(1)

(½)

Here is an assignment supported on part of it

A B

C

  Michael Robinson

Consistency radius optimization

0 1

?(2)

(1)

(1)

(1)

(½)½

Minimizing the consistency radius when extending globally

A B

C

  Michael Robinson

Consistency radius optimization

?(2)

(1)

(1)

(1)

(½)⅓

Here is the closest global section (everything can be changed)

⅔A B

C

  Michael Robinson

Consistency radius is not a measure

ℝ ℝ

ℝ(2)

(1)

(1)

(1)

(½)

This is the full sheaf diagram including all open sets in theAlexandrov topology, not just the base

{A,C} {B,C}

{C}

{A,B,C}

  Michael Robinson

Consistency radius is not a measure

0 1

?

?(2)

(1)

(1)

(1)

(½)

{A,C} {B,C}

{C}

{A,B,C}

  Michael Robinson

Consistency radius is not a measure

0 1

½

⅓(2)

(1)

(1)

(1)

(½)

Minimizing the consistency radius when extendingThe value on the intersection is no longer unique!

This value can be anything between ⅓ and ⅔

{A,C} {B,C}

{C}

{A,B,C}

  Michael Robinson

Consistency radius is not a measure

0 1

½

⅓(2)

(1)

(1)

(1)

(½)

Consistency radius of this open set = 0

Lemma: Consistency radius on an open set U is computed by only considering open sets V1 ⊆ V2 ⊆ U

{A,C} {B,C}

{C}

{A,B,C}

  Michael Robinson

Consistency radius is not a measure

0 1

½

⅓(2)

(1)

(1)

(1)

(½)

c(U) = ½

Lemma: Consistency radius on an open set U is computed by only considering open sets V1 ⊆ V2 ⊆ U

  Michael Robinson

Consistency radius is not a measure

0 1

½

⅓(2)

(1)

(1)

(1)

(½)

c(U) = ½ c(V) = ½

Lemma: Consistency radius on an open set U is computed by only considering open sets V1 ⊆ V2 ⊆ U

  Michael Robinson

Consistency radius is not a measure

0 1

½

⅓(2)

(1)

(1)

(1)

(½)

c(U) = ½ c(V) = ½

c(U ∩ V) = 0

c(U ∪ V) = ⅔ ≠ c(U) + c(V) – c(U ∩ V)

  Michael Robinson

The consistency filtration

1

(1)

(1)

Consistencythreshold 0 ½ ⅔

0 1

½

⅓(2)

(1)

(1)

(1)

(½)

0 1

½

⅓(2)

(1)

(1)

(1)

(½)

0 1

½

⅓(2)

(1)

(1)

(1)

(½)

Consistency radius = ⅔

● … assigns the set of open sets with consistency less than a given threshold

● Lemma: consistency filtration is a sheaf of collections of open sets on (ℝ,≤). Restrictions in this sheaf are coarsenings.

coarsen coarsen

  Michael Robinson

The consistency filtration

1

(1)

(1)

0 ½ ⅔

0 1

½

⅓(2)

(1)

(1)

(1)

(½)

0 1

½

⅓(2)

(1)

(1)

(1)

(½)

0 1

½

⅓(2)

(1)

(1)

(1)

(½)

Consistency radius = ⅔

● Theorem: Consistency filtration is a functor

CF : ShvFPA→ Shv(ℝ,≤)● As a practical matter, CF transforms an assignment into

persistent Čech cohomology

Consistencythreshold

coarsen coarsen

  Michael Robinson

The consistency filtration

1

(1)

(1)

0 ½ ⅔

0 1

½

⅓(2)

(1)

(1)

(1)

(½)

0 1

½

⅓(2)

(1)

(1)

(1)

(½)

0 1

½

⅓(2)

(1)

(1)

(1)

(½)

Consistency radius = ⅔

● Theorem: Consistency filtration is continuous under an appropriate interleaving distance, so– Generators of PH0(CF) with are highly consistent subsets– Generators of PHk(CF) for k > 0 are robust obstructions to

consistency

Consistencythreshold

coarsen coarsen

  Michael Robinson

Geometrically/topologicallySystem model

Shv

Observations

Pseud

ℝShvFPA

Regression

Data fusion

“Syntax”

“Semantics”

(Faithful)

(Faithful)

(Continuous)

(Continuous)

(Continuous)

(Note: all of this within an isomorphism class of sheaves)

  Michael Robinson

Inference and consistencyStatistical inference minimizes various risks:

● Intrinsic risk– Noise in the data

● Parameter risk– Errors in estimation

● Model risk– Using the wrong

model

Minimize consistency radius by:

Varying the assignment

Varying the sheaf isomorphism class

Varying the sheaf within an isomorphism class

Michael Robinson

Chris Capraro, Janelle Henrich

Separating sinusoids from noise● Consider a signal formed from N sinusoids

– Each sinusoid has a (real) frequency ω– Each sinusoid has a (complex) amplitude a

● Task: Recover these parameters from M samples f is an arbitrary

known functionPossibilities:● Magnitude● Phase● Identity function● Quantizer output● Signal dispersion

Gaussian noiseSample time

Sinusoid frequency

Sinusoid amplitude

Michael Robinson

Chris Capraro, Janelle Henrich

Separating sinusoids from noise● Consider a signal formed from N sinusoids

– Each sinusoid has a (real) frequency ω– Each sinusoid has a (complex) amplitude a

● Model the situation as a sheaf over a poset... 

ℂ ℂ ℂ…

Parameter spaces become stalks

Signal models become restrictions

ℝN×ℂN

Michael Robinson

Chris Capraro, Janelle Henrich

Separating sinusoids from noise● Consider a signal formed from N sinusoids

– Each sinusoid has a (real) frequency ω– Each sinusoid has a (complex) amplitude a

● The samples become an assignment to part of the sheaf  

ℝN×ℂN

x1 ∈ ℂ x2 ∈ ℂ xM ∈ ℂ Observed…

Michael Robinson

Chris Capraro, Janelle Henrich

Separating sinusoids from noise● Consider a signal formed from N sinusoids

– Each sinusoid has a (real) frequency ω– Each sinusoid has a (complex) amplitude a

● Find the unknown parameters by minimizing consistency radius

(ω1,…,ωN,a1,…,aN) ∈ ℝN×ℂN

x1 ∈ ℂ x2 ∈ ℂ xM ∈ ℂ Observed

Inferred

 

Michael Robinson

Chris Capraro, Janelle Henrich

Sheaves deliver excellent performance

8x improvement over state-of-the-art in heavy noise!

Sheaf result

  Michael Robinson

The future● Preprint: https://arxiv.org/abs/1805.08927● Computational sheaf theory

– Small examples can be put together ad hoc– Larger ones require a software library

● PySheaf: a software library for sheaves– https://github.com/kb1dds/pysheaf– Includes several examples you can play with!

● Connections to statistical models need to be explored– Is consistency radius related to likelihood functions?

● Extensive testing on various datasets and scenarios

top related