The future shape of neuroimaging with Persistent Homology, 2017-06-19


  • The future shape of neuroimaging with Persistent Homology

    Taking Connectivity to a Skeptical Future: Challenges, Tools and Techniques

    OHBM 2017, Educational Course

    Ben Cassidy

    BIOSTATISTICS

  • Motivations for network Topological Data Analysis

    • Local features: nodes, links

    • Mesoscopic features: clusters, ? ? ? ?

    • Global features: whole network

    Ben Cassidy | b.cassidy@columbia.edu | BIOSTATISTICS

  • Motivations for Topological Data Analysis

    • Consistency and stability - We can make consistent conclusions when observing across scales

    • Inference - Escape the problem of selecting an appropriate link strength threshold for weighted functional networks

    • Signal to Noise Ratio - We can separate interesting from non-interesting patterns

    • Feature Engineering

        • Extract principled, unsupervised features from networks

        • Separate basic and complicated patterns within whole-brain networks, more than just nodes and links

        • Extract distributed and possibly overlapping patterns across a network

    • Basic science - Find the 'shape' of brain activity


  • TDA

    • Persistent Homology

    • Mapper algorithm

    • Other related areas

    • Graph Signal Processing

    • Clustering

    • Geometry

    • Manifold learning

    • Morphological Signal Processing

    • …


  • Topology basics


    • Topology is the study of

    • shape properties that are preserved under continuous deformations

    • qualitative features, e.g. Homology describes the holes at each dimension: 0 = connected components, 1 = loops, 2 = voids, …

    • donuts and coffee mugs


  • Topology basics

    • Throw away any idea of distances, what do we have left?

    • The underlying shape regardless of (reasonable) sampling choices

    • Features that cannot be trivialised to an arbitrarily small part of the shape

  • Shape in terms of sampling

    • We can calculate topological invariants

    • e.g. Euler characteristic, χ = # nodes (vertices) - # links (edges) + # faces - …

    • Here is the brain activity:

    • No matter how we sample the shape of brain activity, χ = 2

    Building blocks: 0-simplex (vertex), 1-simplex (edge), 2-simplex (triangle)

                          Tetrahedron  Cube  Octahedron  Dodecahedron  Icosahedron
    Vertices                        4     8           6            20           12
    Edges                           6    12          12            30           30
    Faces                           4     6           8            12           20
    Euler Characteristic            2     2           2             2            2
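The alternating sum on this slide can be checked directly. The sketch below is only an illustration: it takes the vertex, edge and face counts listed above and labels each column with the Platonic solid that has those counts (the labels and the function name are assumptions added here, not part of the slides).

```python
# Vertex, edge and face counts as listed on the slide.
# The solid names are labels added for this sketch.
solids = {
    "tetrahedron": (4, 6, 4),
    "cube": (8, 12, 6),
    "octahedron": (6, 12, 8),
    "dodecahedron": (20, 30, 12),
    "icosahedron": (12, 30, 20),
}


def euler_characteristic(vertices, edges, faces):
    """Alternating sum chi = V - E + F for a 2-dimensional surface mesh."""
    return vertices - edges + faces


chi = {name: euler_characteristic(*counts) for name, counts in solids.items()}
for name, value in chi.items():
    print(f"{name:13s} chi = {value}")
```

Every sampling gives the same value because all five solids are different triangulations (or polygonal meshes) of the same underlying shape, a sphere.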

  • Shape in terms of invariants

    • For more complicated shapes, it is more informative to describe shape in terms of invariants

    • EC can also be defined using homology invariants

    • χ = Betti 0 - Betti 1 + Betti 2 - …

    • Betti 0 = # connected components

    • Betti 1 = # loops

    • Betti 2 = # voids

    • Computational Homology gives us Betti numbers (and a whole lot more)

    • This explains why topological features are ideal for comparing across scales.

    Figure labels: χ = 2, χ = 0, χ = −2, χ = −4
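For a network with no filled-in faces, the first two Betti numbers can be computed by hand, which makes the identity between the two definitions of the Euler characteristic easy to verify. The following is a minimal sketch under that assumption (a plain graph treated as a 1-dimensional complex); the function name and the toy network are hypothetical.

```python
def betti_numbers(num_nodes, edges):
    """Return (Betti 0, Betti 1) of an undirected graph with no filled faces."""
    parent = list(range(num_nodes))

    def find(i):
        # Follow parent pointers to the component representative,
        # shortening the path along the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    components = num_nodes
    for u, v in edges:
        root_u, root_v = find(u), find(v)
        if root_u != root_v:
            parent[root_u] = root_v
            components -= 1

    betti0 = components
    # Each edge that does not merge two components closes an independent loop.
    betti1 = len(edges) - num_nodes + betti0
    return betti0, betti1


# Hypothetical toy network: a square (one loop) plus one separate link.
edges = [(0, 1), (1, 2), (2, 3), (3, 0), (4, 5)]
b0, b1 = betti_numbers(6, edges)
chi = b0 - b1
print(b0, b1, chi)
```

Here Betti 0 - Betti 1 equals # nodes - # links, so both routes to the Euler characteristic agree. Higher Betti numbers need a full simplicial complex and a homology library, which this sketch does not attempt.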

  • Homology of a network

    • Which link strength threshold do we pick?

    • Some holes are more important than others
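The threshold problem can be made concrete with a small experiment. The sketch below uses a hypothetical weighted network (the link strengths are invented for illustration) and counts connected components and loops after keeping only links at or above each threshold. It treats the thresholded network as a plain graph with no filled triangles, which is an assumption of this sketch.

```python
def count_components_and_loops(num_nodes, edges):
    """Return (# connected components, # independent loops) of a graph."""
    parent = list(range(num_nodes))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    components = num_nodes
    for u, v in edges:
        root_u, root_v = find(u), find(v)
        if root_u != root_v:
            parent[root_u] = root_v
            components -= 1
    loops = len(edges) - num_nodes + components
    return components, loops


# Hypothetical weighted functional network: (node, node, link strength).
weighted_links = [
    (0, 1, 0.9), (1, 2, 0.8), (2, 3, 0.7), (3, 0, 0.6),
    (0, 2, 0.3), (3, 4, 0.2),
]
num_nodes = 5

summary = {}
for threshold in (0.85, 0.65, 0.5, 0.25, 0.1):
    kept = [(u, v) for u, v, w in weighted_links if w >= threshold]
    summary[threshold] = count_components_and_loops(num_nodes, kept)
    print(threshold, summary[threshold])
```

The counts change with every threshold, so any single choice gives only one slice of the picture.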

    Excerpt from IEEE Signal Processing Magazine, March 2016, p. 101:

    nonwheeze signal appears as chaotic, and its topology is also nicely captured at a minimal computational cost.

    Sensor networks

    The application of algebraic topology in sensor networks is very illustrative of its utility. Owing to the difficulties of obtaining location information of the sensors in the field and to the need for additional hardware to compute precise distances between pairs of sensors, it may be prohibitively expensive to obtain geometric information. Interestingly, problems like verifying coverage are purely topological in nature, and, as discussed previously, computational topology provides a coordinate-free solution to quantifying the coverage status as topological information.

    Consider a set of sensors randomly deployed in a region to be monitored. Two problems are often of interest:

    ■ verifying if the region being monitored is actually fully covered and accounted for

    ■ discovering uncovered regions and identifying their surrounding nodes.

    FIGURE 9. The wheeze-detection process. In (a), the top row corresponds to normal signals, while the bottom two rows correspond to wheeze signals. (a) Delay embeddings. (b) Various samplings of a wheeze. (c) A triangulation of a wheeze. (d) A triangulation with large edges removed. Axes of the delay embeddings: X(t) against X(t + τ).

    FIGURE 10. A way to approximate the coverage area, shown in (b), in sensor networks using a complex constructed from the communication graph. If the communication radius is twice the individual sensor coverage radius, the Rips shadow is a good topological approximation to the coverage area, as shown in (d). (a) Sensors in a plane. (b) Sensor coverage. (c) The Čech complex. (d) The Rips shadow.


  • Persistent homology


    PERSISTENT TOPOLOGY OF DATA 65

    Figure 3. A sequence of Rips complexes for a point cloud data set representing an annulus. Upon increasing ϵ, holes appear and disappear. Which holes are real and which are noise?

    assume a rudimentary knowledge of homology, as is to be found in, say, Chapter 2 of [15].

    Despite being both computable and insightful, the homology of a complex associated to a point cloud at a particular ϵ is insufficient: it is a mistake to ask which value of ϵ is optimal. Nor does it suffice to know a simple ‘count’ of the number and types of holes appearing at each parameter value ϵ. Betti numbers are not enough. One requires a means of declaring which holes are essential and which can be safely ignored. The standard topological constructs of homology and homotopy offer no such slack in their strident rigidity: a hole is a hole no matter how fragile or fine.

    2.1. Persistence. Persistence, as introduced by Edelsbrunner, Letscher, and Zomorodian [12] and refined by Carlsson and Zomorodian [22], is a rigorous response to this problem. Given a parameterized family of spaces, those topological features which persist over a significant parameter range are to be considered as signal with short-lived features as noise
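The idea of tracking features across the whole parameter range can be illustrated in its simplest case, connected components (dimension 0). The sketch below is not the algorithm of the cited papers. It is a minimal union-find version that assumes every node is born at filtration value 0 and that links enter in order of increasing value (for example, a distance). The function name, the toy data and the cutoff of 1.0 are all hypothetical.

```python
def zero_dim_persistence(num_nodes, weighted_edges):
    """Return (birth, death) pairs for connected components.

    Assumes all nodes are born at 0.0 and links are added in order of
    increasing filtration value. A component dies when a link merges it
    into another one.
    """
    parent = list(range(num_nodes))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    pairs = []
    for u, v, value in sorted(weighted_edges, key=lambda edge: edge[2]):
        root_u, root_v = find(u), find(v)
        if root_u != root_v:
            parent[root_u] = root_v
            pairs.append((0.0, value))  # one component dies at this value

    # Components that are never merged persist over the whole range.
    survivors = num_nodes - len(pairs)
    pairs.extend((0.0, float("inf")) for _ in range(survivors))
    return pairs


# Hypothetical data: two tight groups of nodes joined by one long link.
links = [(0, 1, 0.1), (1, 2, 0.15), (3, 4, 0.1), (2, 3, 2.0), (0, 2, 0.2)]
pairs = zero_dim_persistence(5, links)

# Features that persist over a long range are read as signal.
cutoff = 1.0  # hypothetical choice, for illustration only
long_lived = [(b, d) for b, d in pairs if d - b > cutoff]
print(pairs)
print(long_lived)
```

In this toy example two features outlive the cutoff, matching the two groups of nodes, while the short-lived pairs reflect merges inside each group.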