
Page 1: Graph Data Structures and Graph Neural Networks in High Energy Physics

TRACKING WITH GRAPHS

XIANGYANG JU, DANIEL MURNANE

ON BEHALF OF THE EXA.TRKX COLLABORATION

Page 2

OVERVIEW

1. Why ML/GNNs for High Energy Physics?
2. GNN applications in HEP / Exa.TrkX
3. ML Tracking Pipeline
4. Metric Learning
   Code Walkthrough (PyTorch)
5. Doublet & Triplet GNN
6. DBSCAN for TrackML Score
   Code Walkthrough (TensorFlow)

Page 3

WHY MACHINE LEARNING FOR TRACKING?

The high-luminosity scaling problem means we need something to complement traditional tracking algorithms. But why graphs?

In other words…

[Plot: computing power vs. time, energy, and number of collisions. Predicted capacity grows far more slowly than the cost of traditional methods, which scale quadratically. HL-LHC, 14 TeV, 2024–2026: 6 billion events/second.]

Page 4

WHY GRAPHS SPECIFICALLY?

Graphs can capture the inherent sparsity of much physics data.

[Figure: hits to graphs]

Page 5

WHY GRAPHS?

Graphs can capture the inherent sparsity of much physics data.

Graphs can capture the manifold and relational structure of much physics data.

Conversion to and from graphs allows manipulation of dimensionality.

Graph Neural Networks are booming (i.e. we wouldn't be talking about graphs if there weren't a wealth of classic algorithms and NN models for graph data).

Industry research and investment mean a good outlook for software and hardware optimised for graphs.

Page 6

APPLICATIONS

TrackML dataset ~ HL-LHC silicon: https://indico.cern.ch/event/831165/contributions/3717124/

High Granularity Calorimeter data: https://arxiv.org/abs/2003.11603

LArTPC data ~ DUNE experiment: https://indico.cern.ch/event/852553/contributions/4059542/

Quantum GNN for Particle Track Reconstruction: https://indico.cern.ch/event/852553/contributions/4057625/

GNNs on FPGAs for Level-1 Trigger: https://indico.cern.ch/event/831165/contributions/3758961/


Page 8

THE EXATRKX PROJECT

MISSION

Optimization, performance and validation studies of ML approaches to the Exascale tracking problem, to enable production-level tracking on next-generation detector systems.

PEOPLE

• Caltech: Joosep Pata, Maria Spiropulu, Jean-Roch Vlimant, Alexander Zlokapa
• Cincinnati: Adam Aurisano, Jeremy Hewes
• FNAL: Giuseppe Cerati, Lindsey Gray, Thomas Klijnsma, Jim Kowalkowski, Gabriel Perdue, Panagiotis Spentzouris
• LBNL: Paolo Calafiura (PI), Nicholas Choma, Sean Conlon, Steve Farrell, Xiangyang Ju, Daniel Murnane, Prabhat
• ORNL: Aristeidis Tsaris
• Princeton: Isobel Ojalvo, Savannah Thais
• SLAC: Pierre Cote De Soux, Francois Drielsma, Kazuhiro Terao, Tracy Usher

https://exatrkx.github.io/

Page 9

THE PHYSICAL PROBLEM

• "TrackML Kaggle Competition" dataset
• Generated by an HL-LHC-like tracking (ACTS) simulation
• 9000 events to train on
• Each event has up to 100,000 layer hits from around 10,000 particles
• Layers can be hit multiple times by the same particle ("duplicates")
• Non-particle hits present ("noise")


Page 11

THE PHYSICAL PROBLEM

• Need to construct hit data into graph data, i.e. nodes and edges
• Can use geometric heuristics (used in the past: ~45% efficiency, 5% purity)
• To improve performance, use learned embedding construction
• The ideal final result is a "TrackML score" $S \in [0, 1]$
• All hits belonging to the same track labelled with the same unique label ⇒ $S = 1$

A scoring sketch follows below.
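For concreteness, a minimal sketch of computing this score with the TrackML utility library released for the Kaggle competition (assuming its score_event helper and the competition's hit_id / particle_id / track_id / weight column conventions):

```python
import pandas as pd
from trackml.score import score_event  # Kaggle TrackML utility library

# truth: one row per hit, with the true particle assignment and a per-hit weight
truth = pd.DataFrame({
    "hit_id":      [1, 2, 3, 4],
    "particle_id": [7, 7, 7, 9],
    "weight":      [0.25, 0.25, 0.25, 0.25],
})
# submission: our reconstructed labels; all hits of a track share one unique label
submission = pd.DataFrame({
    "hit_id":   [1, 2, 3, 4],
    "track_id": [1, 1, 1, 2],
})

score = score_event(truth, submission)  # S in [0, 1]; perfect labelling gives S = 1
print(score)
```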

Page 12

THE PHYSICAL PROBLEM

• We will follow a particular event, let's say event #4692
• We will follow a particular particle, let's say particle #1058360480961134592
• It's a mouthful, so let's call her Diane

[Event display: Diane's hits in the detector, with x, y, z axes]

Page 13

TRACKING PIPELINE

1. Metric Learning
2. Doublet GNN
3. (Optional) Triplet GNN
4. DBSCAN → TrackML score

Pipeline: raw hit data embedded → filter likely, adjacent doublets → train/classify doublets in GNN → filter, convert to triplets → train/classify triplets in GNN → apply cut for seeds → DBSCAN for track labels. (A DBSCAN sketch follows below.)
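As a hedged illustration of the final step, a minimal scikit-learn DBSCAN sketch that turns embedded hit positions into track labels (the eps and min_samples values are placeholders, not the pipeline's tuned settings):

```python
import numpy as np
from sklearn.cluster import DBSCAN

# embedded: (num_hits, N) array of hit positions in the learned embedding space
embedded = np.random.rand(1000, 8).astype(np.float32)  # stand-in data

# Hits whose embeddings sit within eps of each other chain into one cluster;
# each cluster becomes one track candidate. Label -1 marks unclustered noise hits.
clustering = DBSCAN(eps=0.25, min_samples=3).fit(embedded)
track_labels = clustering.labels_  # one integer track_id per hit
```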

Page 14

DATASET
OUR PREVIOUS STATE-OF-THE-ART

[Figure: detector layout in r–z, with pseudorapidity labels $\eta = -3 \dots 3$]

Page 15

DATASET
OUR PREVIOUS STATE-OF-THE-ART

Barrel only.

Page 16

DATASET
OUR PREVIOUS STATE-OF-THE-ART

Barrel only. Adjacency condition.

Adjacency allows a clear ordering of hits in a track (layers 1, 2, 3, …), and therefore a ground truth. It can also be used to prune false positives in the pipeline.

Page 17

DATASET
OUR PREVIOUS STATE-OF-THE-ART

But what about skipped layers?

Page 18

DATASET
OUR PREVIOUS STATE-OF-THE-ART

But what about skipped layers? And what about the endcaps?

Page 19

DATASET
OUR PREVIOUS STATE-OF-THE-ART

Let's define a clear ground truth and see if an embedding space can be learned without knowledge of detector geometry.

Page 20

DATASET
GEOMETRY-FREE TRUTH GRAPH

1. For each particle, order hits by increasing distance from the creation vertex, $R = \sqrt{x^2 + y^2 + z^2}$
2. Group by shared layers
3. Connect all combinations from layer $L_i$ to $L_{i+1}$, where $R_{i-1} < R_i < R_{i+1}$

(A construction sketch follows below.)

[Diagram: a particle's hits ordered by distance R from the creation vertex]
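A minimal pandas sketch of this construction (the column names hit_id, particle_id, layer, x, y, z are assumptions modelled on the TrackML file format, and the creation vertex is taken as the origin for brevity):

```python
import numpy as np
import pandas as pd

def truth_edges(hits: pd.DataFrame) -> list[tuple[int, int]]:
    """Geometry-free truth edges: order each particle's hits by R, group hits
    that share a layer, and connect all combinations of consecutive groups."""
    hits = hits.copy()
    hits["R"] = np.sqrt(hits.x**2 + hits.y**2 + hits.z**2)  # distance from vertex
    edges = []
    for _, track in hits.groupby("particle_id"):
        track = track.sort_values("R")
        # layer groups appear in order of first occurrence, i.e. increasing R
        groups = [g.hit_id.to_list() for _, g in track.groupby("layer", sort=False)]
        for a, b in zip(groups[:-1], groups[1:]):
            edges += [(i, j) for i in a for j in b]  # all combinations L_i -> L_{i+1}
    return edges
```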

Page 21

DATASET
GEOMETRY-FREE TRUTH GRAPH

The real question is: can such a specific definition of truth be learned in an embedded space?

Page 22

METRIC LEARNING
OUR PREVIOUS STATE OF THE ART

1. For all hits in the barrel, embed features (coordinates, cell direction data, etc.) into an N-dimensional space

Page 23

METRIC LEARNING
OUR PREVIOUS STATE OF THE ART

2. Associate hits from the same tracks as close in N-dimensional distance (close = within Euclidean distance r)


Page 25

METRIC LEARNING
OUR PREVIOUS STATE OF THE ART

3. Score each "target" hit within the embedding neighbourhood against the "seed" hit at the centre, using Euclidean distance

Page 26

METRIC LEARNING IN EMBEDDING SPACE
OUR PREVIOUS STATE OF THE ART

Architecture specifics: a "comparative" hinge loss.

• Negatives are punished for being inside the margin radius
• Positives are punished for being outside the margin radius (Δ)
• Margin = training radius

Training with random pairs on only (r, φ, z) gives 0.3–0.5% purity @ 96% efficiency. (A hinge-loss sketch follows below.)
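A minimal PyTorch sketch of such a pairwise hinge loss (the specific margin value and the summed reduction are assumptions; the slide only fixes margin = training radius):

```python
import torch

def pairwise_hinge_loss(emb_a, emb_b, is_positive, margin=1.0):
    """emb_a, emb_b: (num_pairs, N) embeddings of the two hits in each pair.
    is_positive: (num_pairs,) bool, True if both hits come from the same track."""
    d = torch.norm(emb_a - emb_b, dim=1)
    # positives punished for sitting outside the margin radius
    pos_loss = torch.relu(d - margin)[is_positive].sum()
    # negatives punished for intruding inside the margin radius
    neg_loss = torch.relu(margin - d)[~is_positive].sum()
    return pos_loss + neg_loss
```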

Page 27

METRIC LEARNING IN EMBEDDING SPACE
TOWARDS REALISTIC TRACKING

Where do we lose efficiency/purity?

• Most random-pair negatives are easy: the Curse of Easy Negatives
• Cell shape gives directional information to each hit
• Battle against GPU memory: we can only train on a subset of pairs

Page 28

METRIC LEARNING IN EMBEDDING SPACE
TOWARDS REALISTIC TRACKING

Where do we lose efficiency/purity? Most negatives are easy.

Solution: Hard Negative Mining (HNM)

1. Run the event through the embedding
2. Build a radius graph
3. Train the loss on all hits within radius = margin = 1 of each hit
4. For speed, use a sparse custom-built technique plus Facebook's GPU-powered FAISS KNN library

(A neighbourhood-search sketch follows below.)
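A minimal sketch of step 2 with FAISS (an exact flat L2 index and its range_search call; using squared distances as the radius threshold follows FAISS's convention):

```python
import faiss
import numpy as np

emb = np.random.rand(100_000, 8).astype(np.float32)  # stand-in embedded hits

index = faiss.IndexFlatL2(emb.shape[1])  # exact L2 search; GPU variants exist
index.add(emb)

# All neighbours within radius = margin = 1 of each hit (distances are squared L2).
# lims[i]:lims[i+1] slices out the neighbours of hit i in ids/dists.
lims, dists, ids = index.range_search(emb, 1.0**2)

for i in range(5):  # the first few hits' neighbourhoods
    print(i, ids[lims[i]:lims[i + 1]])
```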

Page 29

METRIC LEARNING IN EMBEDDING SPACE
EFFECTIVE TRAINING

Performance improves with:

• Random Pairs (RP)
• Hard Negative Mining (HNM)
• Positive Weighting (PW)
• Cell Information (CI), i.e. shape
• Warm-up (WU)

Altogether: HPRCW3

Page 30

FILTERING
OUR PREVIOUS STATE OF THE ART

The pipeline so far. We want to push purity (1.3%) up further, so let's filter these huge graphs.

[Architecture diagram: normalised (r, φ, z) and cell inputs pass through 512-unit layers into an 8-dimensional embedding trained with a hinge loss; a radius graph is built, and a further stack of 512-unit layers filters its edges with a cross-entropy loss]

Page 31

FILTERING
TOWARDS REALISTIC TRACKING

Just as in embedding, training the filter on random pairs gives poor performance, since the selection mostly yields easy negatives.

Again, the solution is HNM: run the event through the model (without tracking gradients), get the hard negatives, then run these through the model (now tracking gradients), training only on them.

[Plots: distribution in z of all edges out of metric learning vs. distribution of true edges]


Page 33

FILTERING
TOWARDS REALISTIC TRACKING

Regime                      | Phys. Truth purity @ 99% eff. | PID Truth purity @ 99% eff.
Vanilla                     | 6.3%                          | 7.8%
Cell info                   | 8.3%                          | 12.8%
Cell info, layer+batch norm | 14.0%                         | 17.4%
Graph size                  | O(1 million edges)            | O(3 million edges)

Remember:
Physical Truth: truth is defined as edges between the closest hits in a track, on different layers.
PID Truth: truth is defined as any edge connecting hits with the same Particle ID (PID).

Page 34

FILTERING
TOWARDS REALISTIC TRACKING

Does it work? Let's check Diane. Pretty good!

[Event display: true positives and false positives marked; no false negatives]

Page 35

FILTERING
TOWARDS REALISTIC TRACKING

Does it work? Let's check Diane. Not quite as good…

[Event display: true positives, false positives, and false negatives marked]

This is where the GNN comes in.


Page 37

FILTERING
ROBUSTNESS

Noise level is a proxy for low-pT hits.

Robustness of the embedding to noise, not re-trained; that is, the out-of-the-box penalty is around 20%.

(In collaboration with the University of Washington: Aditi Chauhan, Alex Schuy, Ami Oka, David Ho, Shih-Chieh Hsu)

Page 38

METRIC LEARNING IN EMBEDDED SPACE
CODE WALKTHROUGH

A sketch of the embedding model follows below.
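The walkthrough itself is external, but as a hedged stand-in, here is a minimal PyTorch embedding network of the kind described in this section (the input width, 512-unit hidden layers, and 8-dimensional output echo the diagram on Page 30; the final normalisation is an assumption):

```python
import torch
import torch.nn as nn

class HitEmbedding(nn.Module):
    """MLP that maps per-hit features (coordinates plus cell information)
    into an N-dimensional metric space."""
    def __init__(self, in_dim=12, hidden=512, emb_dim=8):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, emb_dim),
        )

    def forward(self, x):
        # normalise so neighbourhoods are compared at a fixed scale (one common choice)
        return nn.functional.normalize(self.net(x), dim=-1)

emb = HitEmbedding()(torch.randn(100, 12))  # (100 hits, 8-dim embeddings)
```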

Page 39

GRAPH NEURAL NETWORKS
FOR HIGH ENERGY PHYSICS

GNNs can approximate the geometry of the physics problem.

GNNs are a generalisation of many other machine learning techniques: e.g. message-passing convolution generalises the CNN from flat to arbitrary geometry.

GNNs can learn node (i.e. hit/spacepoint) features and embeddings, as well as edge (i.e. relational) features and embeddings.

In practice, for an LHC-like detector environment: join hits into a graph, and iterate through message passing of hidden features.

Page 40

GRAPH NEURAL NETWORKS
THE AIM

[Pipeline diagram, as on Page 13]

• Doublet GNN architecture
• Performance
• Triplet construction + performance

Page 41

GRAPH NEURAL NETWORKS
ARCHITECTURES

MESSAGE PASSING

$v_0^{k+1} = \phi(e_{0j}^k, v_j^k, v_0^k)$

where $v_i^k$ are node features and $e_{ij}^k$ are edge features at iteration $k$.

[Diagram: central node $v_0^k$ with neighbours $v_1^k \dots v_4^k$ connected by edges $e_{01}^k \dots e_{04}^k$]
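A minimal sketch of one such message-passing update in plain PyTorch tensor ops (summed aggregation over neighbours and an MLP standing in for φ are assumptions; libraries like PyTorch Geometric package the same pattern):

```python
import torch
import torch.nn as nn

def message_passing_step(v, e, edge_index, phi: nn.Module):
    """One update v_0^{k+1} = phi(e_0j^k, v_j^k, v_0^k), summed over neighbours j.
    v: (num_nodes, F) node features; e: (num_edges, G) edge features;
    edge_index: (2, num_edges) with rows (source j, target 0)."""
    src, dst = edge_index
    # per-edge message built from edge features, sender and receiver features
    msg = phi(torch.cat([e, v[src], v[dst]], dim=-1))
    # sum each node's incoming messages
    out = torch.zeros(v.size(0), msg.size(1))
    return out.index_add_(0, dst, msg)

phi = nn.Sequential(nn.Linear(3 + 8 + 8, 8), nn.ReLU(), nn.Linear(8, 8))
v_next = message_passing_step(torch.randn(5, 8), torch.randn(4, 3),
                              torch.tensor([[1, 2, 3, 4], [0, 0, 0, 0]]), phi)
```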

Page 42

GRAPH NEURAL NETWORKS
ARCHITECTURES

ATTENTION GNN

$v_0^{k+1} = \mathrm{MLP}(\Sigma_j\, e_{0j}^k v_j^k,\ v_0^k)$, with edge scores $e_{0j}^k = \mathrm{MLP}([v_0^k, v_j^k])$

where $v_i^k$ are node features and $e_{ij}^k$ is an edge score in $[0, 1]$ at iteration $k$.

Veličković, Petar, et al. "Graph attention networks." arXiv preprint arXiv:1710.10903 (2017).
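A corresponding sketch of one attention-GNN update (the sigmoid that squashes edge scores into [0, 1] and the two-layer MLPs are assumptions):

```python
import torch
import torch.nn as nn

class AttentionStep(nn.Module):
    """v_0^{k+1} = MLP(sum_j e_0j^k * v_j^k, v_0^k), e_0j^k = MLP([v_0^k, v_j^k])."""
    def __init__(self, dim=8):
        super().__init__()
        self.edge_mlp = nn.Sequential(nn.Linear(2 * dim, dim), nn.ReLU(),
                                      nn.Linear(dim, 1), nn.Sigmoid())
        self.node_mlp = nn.Sequential(nn.Linear(2 * dim, dim), nn.ReLU(),
                                      nn.Linear(dim, dim))

    def forward(self, v, edge_index):
        src, dst = edge_index
        score = self.edge_mlp(torch.cat([v[dst], v[src]], dim=-1))  # in [0, 1]
        # sum score-weighted neighbour features into each receiving node
        agg = torch.zeros_like(v).index_add_(0, dst, score * v[src])
        return self.node_mlp(torch.cat([agg, v], dim=-1))

step = AttentionStep()
v = step(torch.randn(5, 8), torch.tensor([[1, 2, 3, 4], [0, 0, 0, 0]]))
```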

Page 43

GRAPH NEURAL NETWORKS
ARCHITECTURES

INTERACTION NETWORK

$e_{0j}^{k+1} = \phi_e(v_0^k, v_j^k, e_{0j}^k)$, then $v_0^{k+1} = \phi_v(v_0^k, \Sigma_j\, e_{0j}^{k+1})$

where $v_i^k$ are node features and $e_{ij}^k$ are edge features at iteration $k$.

Battaglia, Peter, et al. "Interaction networks for learning about objects, relations and physics." Advances in Neural Information Processing Systems. 2016.
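And a matching interaction-network sketch (separate MLPs stand in for the two φ's; widths are placeholders):

```python
import torch
import torch.nn as nn

class InteractionStep(nn.Module):
    """e_0j^{k+1} = phi_e(v_0, v_j, e_0j); v_0^{k+1} = phi_v(v_0, sum_j e_0j^{k+1})."""
    def __init__(self, dim=8):
        super().__init__()
        self.phi_e = nn.Sequential(nn.Linear(3 * dim, dim), nn.ReLU(),
                                   nn.Linear(dim, dim))
        self.phi_v = nn.Sequential(nn.Linear(2 * dim, dim), nn.ReLU(),
                                   nn.Linear(dim, dim))

    def forward(self, v, e, edge_index):
        src, dst = edge_index
        e_new = self.phi_e(torch.cat([v[dst], v[src], e], dim=-1))
        agg = torch.zeros_like(v).index_add_(0, dst, e_new)
        v_new = self.phi_v(torch.cat([v, agg], dim=-1))
        return v_new, e_new  # both node and edge features update, unlike attention

step = InteractionStep()
v, e = step(torch.randn(5, 8), torch.randn(4, 8),
            torch.tensor([[1, 2, 3, 4], [0, 0, 0, 0]]))
```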

Page 44

GRAPH NEURAL NETWORKS
PERFORMANCE

• We convert from a doublet graph to a triplet graph: triplet edges have direct access to curvature information, so we hypothesise the accuracy should be even better
• Doublets are associated to nodes, triplets are associated to edges

[Diagram: doublet edge scores (e.g. 0.99, 0.87, 0.84) carried onto the nodes of the triplet graph]

(A conversion sketch follows below.)
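A minimal sketch of the doublet-to-triplet conversion (the rule that two doublets sharing a middle hit form a triplet edge is the natural reading of the slide, but its exact form here is an assumption):

```python
from collections import defaultdict

def doublets_to_triplets(doublets: list[tuple[int, int]]) -> list[tuple[int, int]]:
    """Each doublet (hit a, hit b) becomes a node of the triplet graph; doublets
    d1 = (a, b) and d2 = (b, c) that share hit b are joined by a triplet edge."""
    by_first_hit = defaultdict(list)
    for idx, (a, b) in enumerate(doublets):
        by_first_hit[a].append(idx)
    triplet_edges = []
    for idx, (a, b) in enumerate(doublets):
        for jdx in by_first_hit[b]:  # doublets starting where this one ends
            triplet_edges.append((idx, jdx))
    return triplet_edges

print(doublets_to_triplets([(0, 1), (1, 2), (1, 3), (2, 4)]))
# [(0, 1), (0, 2), (1, 3)] -- e.g. doublets (0,1) and (1,2) form triplet 0-1-2
```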


Page 46

GRAPH NEURAL NETWORKS
PERFORMANCE

Barrel-only, adjacent-layers.

Page 47

GRAPH NEURAL NETWORKS
TRACK SEEDING PERFORMANCE

Purity: 99.1% ± 0.07%

Inference time: ~5 seconds per event per GPU, split between:
• ~3 ± 1 seconds for embedding construction
• ~2 ± 1 seconds for the two GNN steps and processing

[Plot: seed efficiency (PU = 200); total efficiency rising from ~0.82 to ~0.94 vs. pT from 0 to 4 GeV]

Page 48

GRAPH NEURAL NETWORKS
TRACK LABELLING PERFORMANCE

• 0.84 TrackML score in the barrel, no noise
• The upper-limit score out of metric learning was 0.90, with only the barrel, adjacent layers, etc.
• Now able to run the full detector, with geometry-free embedding
• The upper-limit score out of metric learning is now at least 0.935
• Can keep pushing this value at the expense of purity = GPU memory

[Plot: TrackML score; barrel-only, adjacent-layers]

Page 49

GRAPH NEURAL NETWORKS
CHALLENGES

Going to the full detector, does the GNN understand our more epistemologically motivated truth?

Going to the full detector, can we fit a whole model on a GPU?

Page 50

GRAPH NEURAL NETWORKS
CHALLENGES

Going to the full detector, does the GNN understand our more epistemologically motivated truth? We will see performance results in the code walkthrough.

Going to the full detector, can we fit a whole model on a GPU? No. We need some memory hacks: checkpointing, a larger GPU, mixed precision.

Page 51

GRAPH NEURAL NETWORK
MEMORY MANAGEMENT

Half precision:
• Performance unaffected
• No speed improvements yet, pending a PyTorch (PyTorch Geometric) GitHub pull request

[Chart: peak GPU usage (GB, up to ~16) in full precision (FP) vs. mixed precision (MP), and the MP/FP ratio (~0.45–0.50), across configurations of (hidden features, message-passing iterations, training batch size): (32, 8, 1), (32, 8, 2), (64, 6, 1), (128, 8, 1)]

(A mixed-precision sketch follows below.)
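A minimal sketch of half-precision training with PyTorch's automatic mixed precision utilities (the model and data are placeholders):

```python
import torch
from torch.cuda.amp import GradScaler, autocast

model = torch.nn.Linear(8, 1).cuda()          # placeholder for the GNN
opt = torch.optim.Adam(model.parameters())
scaler = GradScaler()                          # rescales grads to avoid fp16 underflow

for x, y in [(torch.randn(64, 8).cuda(), torch.randn(64, 1).cuda())]:
    opt.zero_grad()
    with autocast():                           # forward pass runs in half precision
        loss = torch.nn.functional.mse_loss(model(x), y)
    scaler.scale(loss).backward()              # backward on the scaled loss
    scaler.step(opt)
    scaler.update()
```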

Page 52

GRAPH NEURAL NETWORK
MEMORY MANAGEMENT

Checkpointing (PyTorch):
• Memory improvement
• Minimal speed penalty (~1.2x longer training)
• We checkpoint each iteration (partial checkpointing), so there is still a scaling with iterations
• Can checkpoint all iterations (maximal checkpointing)

[Plot: peak memory, no checkpointing vs. maximal checkpointing]

(A checkpointing sketch follows below.)
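A minimal sketch of per-iteration gradient checkpointing with torch.utils.checkpoint (the linear step module is a placeholder for one message-passing iteration):

```python
import torch
from torch.utils.checkpoint import checkpoint

step = torch.nn.Linear(8, 8)       # placeholder for one message-passing iteration
v = torch.randn(100, 8, requires_grad=True)

for _ in range(8):
    # activations inside `step` are not stored; they are recomputed on backward,
    # trading ~1.2x training time for a large memory saving
    v = checkpoint(step, v)

v.sum().backward()
```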

Page 53

GRAPH NEURAL NETWORK
CODE WALKTHROUGH

Page 54

GRAPH NEURAL NETWORK
PERFORMANCE

Full detector, geometry-free construction.

Page 55

METRIC LEARNING & GRAPH NEURAL NETWORK PIPELINE
SUMMARY

We handle the full detector, noise, geometry-free inference, and distributed training, with care.

We can learn an embedding space without layer information, provided we equip training with hard negative mining, cell information, and warm-up.

We can run the GNN on a full event, provided we equip training with gradient checkpointing and mixed precision.

We can include noise without re-training, at a small (~20%) penalty to purity.

Page 56

BACKUP

Page 57

OUTLOOK

Converging on better architectures: attention, gated RNNs, and generalising dense, flat methods to sparse, graph structure (not that the two are mutually exclusive: there is increasing interest in sparse CNN techniques, for example).

Dwivedi, Vijay Prakash, et al. "Benchmarking graph neural networks." arXiv preprint arXiv:2003.00982 (2020).

Page 58

OUTLOOK

Converging on better methods: sparse operations, triplet graph structure, fast clustering, approximate nearest neighbours, and piggy-backing off big-tech methods, e.g. Facebook FAISS.

[Figures: Google Trends of "Graph Neural Networks"; doublet graph to higher-order classification]

Page 59

OUTLOOK

Converging on better hardware: mixed-precision handling on new GPUs/TPUs, sparse handling in IPUs, and compilability of graph-structure ML libraries for FPGA ports, e.g. the IEEE HPEC GraphChallenge.