greg hogan – to petascale and beyond- apache flink in the clouds
TRANSCRIPT
![Page 2: Greg Hogan – To Petascale and Beyond- Apache Flink in the Clouds](https://reader031.vdocuments.mx/reader031/viewer/2022030317/587138441a28abf0568b6305/html5/thumbnails/2.jpg)
My Project Background• Apache Flink Committer since 2016
• Progressing from Hadoop MapReduce
enticed by tuples, fluent API, and scalability
enduring for community and growth
• Commits for performance and scalability
• Contributed Gelly graph generators and native algorithms
![Page 3: Greg Hogan – To Petascale and Beyond- Apache Flink in the Clouds](https://reader031.vdocuments.mx/reader031/viewer/2022030317/587138441a28abf0568b6305/html5/thumbnails/3.jpg)
AgendaGraphs with Gelly
Native algorithms
Configuring at scale
Sort benchmark
Profiling and benchmarking
In development
![Page 4: Greg Hogan – To Petascale and Beyond- Apache Flink in the Clouds](https://reader031.vdocuments.mx/reader031/viewer/2022030317/587138441a28abf0568b6305/html5/thumbnails/4.jpg)
Why batch? Why Gelly?• Batch
strong similarities between batch and streaming
benchmarking is in essence batch
• Gelly
diverse, scalable algorithms
that stress distributed architectures
![Page 5: Greg Hogan – To Petascale and Beyond- Apache Flink in the Clouds](https://reader031.vdocuments.mx/reader031/viewer/2022030317/587138441a28abf0568b6305/html5/thumbnails/5.jpg)
What is Gelly?Graph API:
Graph, Vertex, Edge, … BipartiteGraph, BipartiteEdge (FLINK-2254)
from*, getTriplets, groupReduceOn*, joinWith*, map*, reduceOn*
public class Graph<K, VV, EV> { private final ExecutionEnvironment context; private final DataSet<Vertex<K, VV>> vertices; private final DataSet<Edge<K, EV>> edges;
public class Vertex<K, V> extends Tuple2<K, V> {
public class Edge<K, V> extends Tuple3<K, K, V>{
![Page 6: Greg Hogan – To Petascale and Beyond- Apache Flink in the Clouds](https://reader031.vdocuments.mx/reader031/viewer/2022030317/587138441a28abf0568b6305/html5/thumbnails/6.jpg)
What is Gelly?Iterative graph algorithm models:
vertex-centric (pregel)
scatter-gather (spargel)
gather-sum-apply
public abstract void compute(Vertex<K, VV> vertex, MessageIterator<Message> messages);
public abstract void sendMessages(Vertex<K, VV> vertex) throws Exception; public abstract void updateVertex(Vertex<K, VV> vertex, MessageIterator<Message> inMessages);
public abstract M gather(Neighbor<VV, EV> neighbor); public abstract M sum(M arg0, M arg1); public abstract void apply(M newValue, VV currentValue);
![Page 7: Greg Hogan – To Petascale and Beyond- Apache Flink in the Clouds](https://reader031.vdocuments.mx/reader031/viewer/2022030317/587138441a28abf0568b6305/html5/thumbnails/7.jpg)
What is Gelly?• Generate scale-free graphs independent of parallelism
Complete, Cycle, Grid, Hypercube, Path, RMat, Star
• Composable building blocks for directed and undirected graphs
degree annotate, simplify, transform, …
• Algorithms (DataSet) and analytics (accumulators)
• Checksum validation
![Page 8: Greg Hogan – To Petascale and Beyond- Apache Flink in the Clouds](https://reader031.vdocuments.mx/reader031/viewer/2022030317/587138441a28abf0568b6305/html5/thumbnails/8.jpg)
R-MAT: Recursive Matrix• CMU: Chakrabarti, Zhan, Faloutsos
power-law degree, “community” structure, small diameter
parsimony, realism, generation speed
• vertex count: 2scale
• edge count : 2scale * edge factor
• bit probabilities: a + b + c + d = 1
• random noise, clip-and-flip, bipartite, …
![Page 9: Greg Hogan – To Petascale and Beyond- Apache Flink in the Clouds](https://reader031.vdocuments.mx/reader031/viewer/2022030317/587138441a28abf0568b6305/html5/thumbnails/9.jpg)
GlossaryV SVertex Edge
Open Triplet Closed Triplet / Triangle
T
U
V W
U
V W
![Page 10: Greg Hogan – To Petascale and Beyond- Apache Flink in the Clouds](https://reader031.vdocuments.mx/reader031/viewer/2022030317/587138441a28abf0568b6305/html5/thumbnails/10.jpg)
Triads
![Page 11: Greg Hogan – To Petascale and Beyond- Apache Flink in the Clouds](https://reader031.vdocuments.mx/reader031/viewer/2022030317/587138441a28abf0568b6305/html5/thumbnails/11.jpg)
Graph500 MetricsScale Vertices Edges Max Degree Triplets Triplets
16 46,826 955,507 9,608 33,319,057 619,718,882
20 646,203 16,084,983 64,779 1,225,406,293 35,034,305,809
24 8,869,117 263,432,942 406,382 41,744,066,881 1,782,644,764,135
28 121,233,960 4,259,425,690 2,457,580 1,357,806,114,141 84,905,324,934,756
32 1,652,616,155 68,460,935,436 14,455,055 43,031,369,386,717 3,870,783,341,130,190
![Page 12: Greg Hogan – To Petascale and Beyond- Apache Flink in the Clouds](https://reader031.vdocuments.mx/reader031/viewer/2022030317/587138441a28abf0568b6305/html5/thumbnails/12.jpg)
AgendaGraphs with Gelly
Native algorithms
Configuring at scale
Sort benchmark
Profiling and benchmarking
In development
![Page 13: Greg Hogan – To Petascale and Beyond- Apache Flink in the Clouds](https://reader031.vdocuments.mx/reader031/viewer/2022030317/587138441a28abf0568b6305/html5/thumbnails/13.jpg)
• Measures connectedness of a vertex’s neighborhood
• Ranges from 0.0 to 1.0:
0.0: no edges connecting neighbors (star graph or tree)
1.0: each pair of neighbors is connected (clique)
• Each connection between neighbors forms a triangle
Clustering Coefficient
![Page 14: Greg Hogan – To Petascale and Beyond- Apache Flink in the Clouds](https://reader031.vdocuments.mx/reader031/viewer/2022030317/587138441a28abf0568b6305/html5/thumbnails/14.jpg)
Clustering Coefficient• Local Clustering Coefficient is fraction of edges between neighbors
• Global Clustering Coefficient is fraction of connected triplets
![Page 15: Greg Hogan – To Petascale and Beyond- Apache Flink in the Clouds](https://reader031.vdocuments.mx/reader031/viewer/2022030317/587138441a28abf0568b6305/html5/thumbnails/15.jpg)
• Thomas Schank: Algorithmic Aspects of Triangle-Based Network Analysis
• Join all edges with all (open) triplets
• Naive algorithm discovers each triangle at all three vertices;
instead look for triangles at the vertex with lowest degree
Enumerating Triangles
V W
U
V W
![Page 16: Greg Hogan – To Petascale and Beyond- Apache Flink in the Clouds](https://reader031.vdocuments.mx/reader031/viewer/2022030317/587138441a28abf0568b6305/html5/thumbnails/16.jpg)
Implicit Object Reuse
![Page 17: Greg Hogan – To Petascale and Beyond- Apache Flink in the Clouds](https://reader031.vdocuments.mx/reader031/viewer/2022030317/587138441a28abf0568b6305/html5/thumbnails/17.jpg)
Implicit Operator ReuseGlobal Clustering Coefficient
![Page 18: Greg Hogan – To Petascale and Beyond- Apache Flink in the Clouds](https://reader031.vdocuments.mx/reader031/viewer/2022030317/587138441a28abf0568b6305/html5/thumbnails/18.jpg)
Implicit Operator Reuse
Local Clustering Coefficient
![Page 19: Greg Hogan – To Petascale and Beyond- Apache Flink in the Clouds](https://reader031.vdocuments.mx/reader031/viewer/2022030317/587138441a28abf0568b6305/html5/thumbnails/19.jpg)
Implicit Operator Reuse
Local Clustering Coefficient
Global Clustering Coefficient
![Page 20: Greg Hogan – To Petascale and Beyond- Apache Flink in the Clouds](https://reader031.vdocuments.mx/reader031/viewer/2022030317/587138441a28abf0568b6305/html5/thumbnails/20.jpg)
Implicit Operator Reuse
Local Clustering Coefficient
Global Clustering Coefficient
gcc = graph .run(new GlobalClusteringCoefficient<LongValue, NullValue, NullValue>()); lcc = graph .run(new LocalClusteringCoefficient<LongValue, NullValue, NullValue>());
Graph
![Page 21: Greg Hogan – To Petascale and Beyond- Apache Flink in the Clouds](https://reader031.vdocuments.mx/reader031/viewer/2022030317/587138441a28abf0568b6305/html5/thumbnails/21.jpg)
Implicit Operator Reuse• Three generations of operator reuse
1. explicit reuse via get/set on intermediate DataSets
2. Delegate using code generation
3. NoOpOperator as placeholder in Flink 1.2
• Algorithms extend GraphAlgorithmWrappingDataSet/Graph
and define and implement a mergeable configuration
![Page 22: Greg Hogan – To Petascale and Beyond- Apache Flink in the Clouds](https://reader031.vdocuments.mx/reader031/viewer/2022030317/587138441a28abf0568b6305/html5/thumbnails/22.jpg)
GraphAnalytic• Terminal; results retrieved via accumulators
• Defers execution to the user to allow composing multiple analytics and algorithms into a single program
public interface GraphAnalytic<K, VV, EV, T> { T getResult();
T execute() throws Exception;
T execute(String jobName) throws Exception; GraphAnalytic<K, VV, EV, T> run(Graph<K, VV, EV> input); }
![Page 23: Greg Hogan – To Petascale and Beyond- Apache Flink in the Clouds](https://reader031.vdocuments.mx/reader031/viewer/2022030317/587138441a28abf0568b6305/html5/thumbnails/23.jpg)
Similarity Measures• Common Neighbors
• Jaccard Index is common neighbors / distinct neighbors
• Adamic-Adar is sum over weighted-degree of common neighbors
![Page 24: Greg Hogan – To Petascale and Beyond- Apache Flink in the Clouds](https://reader031.vdocuments.mx/reader031/viewer/2022030317/587138441a28abf0568b6305/html5/thumbnails/24.jpg)
Common Neighborsgraph.getEdges()
.groupBy(0).crossGroup(…)
.groupBy(0, 1).sum(2)
C D
B
A
CNB,C=1
CNC,D=1
CNB,D=1
GroupCross Input GroupCross Input
<A,B>, <A,C>, <A,D> <B,C>, <B,D>, <C,D>
<B,A>
<C,A>
<D,A>
![Page 25: Greg Hogan – To Petascale and Beyond- Apache Flink in the Clouds](https://reader031.vdocuments.mx/reader031/viewer/2022030317/587138441a28abf0568b6305/html5/thumbnails/25.jpg)
GroupedCross• FLINK-1267: pair-wise compare or combine all elements of a group;
currently can
groupReduce: consume and store the complete iterator in memory and build all pairs
self-Join: the engine builds all pairs of the full symmetric Cartesian product. u
v
![Page 26: Greg Hogan – To Petascale and Beyond- Apache Flink in the Clouds](https://reader031.vdocuments.mx/reader031/viewer/2022030317/587138441a28abf0568b6305/html5/thumbnails/26.jpg)
GroupedCross• Hashes evenly partition linear outputs, but here quadratic
uv
![Page 27: Greg Hogan – To Petascale and Beyond- Apache Flink in the Clouds](https://reader031.vdocuments.mx/reader031/viewer/2022030317/587138441a28abf0568b6305/html5/thumbnails/27.jpg)
GroupedCross• Hashes evenly partition linear outputs, but here quadratic
• As linear operators:
• process fixed width slices
uu } 64
} 64} 64} 64
![Page 28: Greg Hogan – To Petascale and Beyond- Apache Flink in the Clouds](https://reader031.vdocuments.mx/reader031/viewer/2022030317/587138441a28abf0568b6305/html5/thumbnails/28.jpg)
GroupedCross• Hashes evenly partition linear outputs, but here quadratic
• As linear operators:
• process fixed width slices
… generating slices is still quadratic!
uu } 64
} 64} 64} 64
![Page 29: Greg Hogan – To Petascale and Beyond- Apache Flink in the Clouds](https://reader031.vdocuments.mx/reader031/viewer/2022030317/587138441a28abf0568b6305/html5/thumbnails/29.jpg)
GroupedCross• Hashes evenly partition linear outputs, but here quadratic
• As linear operators:
3. process fixed width slices
uu } 64
} 64} 64} 64
![Page 30: Greg Hogan – To Petascale and Beyond- Apache Flink in the Clouds](https://reader031.vdocuments.mx/reader031/viewer/2022030317/587138441a28abf0568b6305/html5/thumbnails/30.jpg)
GroupedCross• Hashes evenly partition linear outputs, but here quadratic
• As linear operators:
1. emit # slices for each element
3. process fixed width slices
uu } 64
} 64} 64} 64
![Page 31: Greg Hogan – To Petascale and Beyond- Apache Flink in the Clouds](https://reader031.vdocuments.mx/reader031/viewer/2022030317/587138441a28abf0568b6305/html5/thumbnails/31.jpg)
GroupedCross• Hashes evenly partition linear outputs, but here quadratic
• As linear operators:
1. emit # slices for each element
2. emit 1..# slice for each element
3. process fixed width slices
uu } 64
} 64} 64} 64
![Page 32: Greg Hogan – To Petascale and Beyond- Apache Flink in the Clouds](https://reader031.vdocuments.mx/reader031/viewer/2022030317/587138441a28abf0568b6305/html5/thumbnails/32.jpg)
Jaccard Index / Adamic-Adar
• Jaccard Index edges annotated with target degree: <s, t, d(t)>
• Adamic-Adar edges annotated with inverse logarithm of source degree: <s, t, 1/log(d(s))>
• groupBy, crossGroup, groupBy, sum scores as Common Neighbors
![Page 33: Greg Hogan – To Petascale and Beyond- Apache Flink in the Clouds](https://reader031.vdocuments.mx/reader031/viewer/2022030317/587138441a28abf0568b6305/html5/thumbnails/33.jpg)
Flink Serialization• “Think like an element”
• Flink aggressively serializes even when data fits in memory
MapDriver:
• Complex types are antithetical to performance
don’t stream ArrayList, HashSet, …
do use Flink Value types and ValueArray types (FLINK-3695)
while (this.running && ((record = input.next(record)) != null)) { numRecordsIn.inc(); output.collect(function.map(record));}
![Page 34: Greg Hogan – To Petascale and Beyond- Apache Flink in the Clouds](https://reader031.vdocuments.mx/reader031/viewer/2022030317/587138441a28abf0568b6305/html5/thumbnails/34.jpg)
Modern Memory Model• Processor registers – the fastest possible access (usually 1 CPU cycle). A few thousand bytes in size
• Cache
• Level 0 (L0) Micro operations cache – 6 KiB in size
• Level 1 (L1) Instruction cache – 128 KiB in size
• Level 1 (L1) Data cache – 128 KiB in size. Best access speed is around 700 GiB/second
• Level 2 (L2) Instruction and data (shared) – 1 MiB in size. Best access speed is around 200 GiB/second
• Level 3 (L3) Shared cache – 6 MiB in size. Best access speed is around 100 GB/second
• Level 4 (L4) Shared cache – 128 MiB in size. Best access speed is around 40 GB/second
• Main memory (Primary storage) – Gigabytes in size. Best access speed is around 10 GB/second.
• Disk storage (Secondary storage) – Terabytes in size. As of 2013, best access speed is from a solid state drive is about 600 MB/second
• Nearline storage (Tertiary storage) – Up to exabytes in size. As of 2013, best access speed is about 160 MB/second
• Offline storage
https://en.wikipedia.org/wiki/Memory_hierarchy
![Page 35: Greg Hogan – To Petascale and Beyond- Apache Flink in the Clouds](https://reader031.vdocuments.mx/reader031/viewer/2022030317/587138441a28abf0568b6305/html5/thumbnails/35.jpg)
Object Reuse• Performance boost when serializing/deserializing simple types
• Inputs may be modified between function calls;
if storing inputs must make a copy
• Output may be modified pre-serialization by chained operators
• Configured globally; open discussion for using annotations
// Set up the execution environmentfinal ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();env.getConfig().enableObjectReuse();
![Page 36: Greg Hogan – To Petascale and Beyond- Apache Flink in the Clouds](https://reader031.vdocuments.mx/reader031/viewer/2022030317/587138441a28abf0568b6305/html5/thumbnails/36.jpg)
CopyableValue• Implemented by Flink’s value types (LongValue, StringValue, …)
• Create new or copy to list of cached objects
if (visitedCount == visited.size()) { visited.add(edge.f1.copy());} else { edge.f1.copyTo(visited.get(visitedCount));}
public interface CopyableValue<T> extends Value {
int getBinaryLength(); void copyTo(T target);
T copy();
void copy(DataInputView source, DataOutputView target) throws IOException;}
![Page 37: Greg Hogan – To Petascale and Beyond- Apache Flink in the Clouds](https://reader031.vdocuments.mx/reader031/viewer/2022030317/587138441a28abf0568b6305/html5/thumbnails/37.jpg)
Hints
• CombineHint.[HASH|SORT|NONE]
• JoinHint.[BROADCAST/REPARTITION_HASH_FIRST/SECOND
|REPARTITION_SORT_MERGE]
• @ForwardedFields[|First|Second]
![Page 38: Greg Hogan – To Petascale and Beyond- Apache Flink in the Clouds](https://reader031.vdocuments.mx/reader031/viewer/2022030317/587138441a28abf0568b6305/html5/thumbnails/38.jpg)
AgendaGraphs with Gelly
Native algorithms
Configuring at scale
Sort benchmark
Profiling and benchmarking
In development
![Page 39: Greg Hogan – To Petascale and Beyond- Apache Flink in the Clouds](https://reader031.vdocuments.mx/reader031/viewer/2022030317/587138441a28abf0568b6305/html5/thumbnails/39.jpg)
Amazon Web Services• c4.8xlarge:
• 36 vcores
• 60 GiB memory
• 2 NUMA cores (FLINK-3163)
• 5 x 200 GiB EBS volumes; kernel config: xen_blkfront.max=256
• Hourly Spot pricing ~$0.30 - $0.36 plus $0.14/hr EBS
• 800 MB/s EBS
• 10 Gbps networking + 4 Gbps dedicated EBS
![Page 40: Greg Hogan – To Petascale and Beyond- Apache Flink in the Clouds](https://reader031.vdocuments.mx/reader031/viewer/2022030317/587138441a28abf0568b6305/html5/thumbnails/40.jpg)
Memory Allocation
• User defined functions
• TaskManager managed memory
• Network buffers
• Readhead buffers for spilled files in merge-sort
![Page 41: Greg Hogan – To Petascale and Beyond- Apache Flink in the Clouds](https://reader031.vdocuments.mx/reader031/viewer/2022030317/587138441a28abf0568b6305/html5/thumbnails/41.jpg)
My flink-conf.yamljobmanager.rpc.address: localhost jobmanager.rpc.port: 6123 jobmanager.heap.mb: 1024 taskmanager.heap.mb: 18000 taskmanager.numberOfTaskSlots: 18 taskmanager.memory.preallocate: true parallelism.default: 36 jobmanager.web.port: 8081 webclient.port: 8080 taskmanager.network.numberOfBuffers: 25920 taskmanager.tmp.dirs: /volumes/xvdb/tmp:/volumes/xvdc/tmp:/volumes/xvdd/tmp:…
akka.ask.timeout: 1000 s akka.lookup.timeout: 100 s jobmanager.web.history: 25 taskmanager.memory.fraction: 0.9 taskmanager.memory.off-heap: true taskmanager.runtime.hashjoin-bloom-filters: true taskmanager.runtime.max-fan: 8192
![Page 42: Greg Hogan – To Petascale and Beyond- Apache Flink in the Clouds](https://reader031.vdocuments.mx/reader031/viewer/2022030317/587138441a28abf0568b6305/html5/thumbnails/42.jpg)
Configuring Network Buffers• Ignore heuristic as recommended in documentation
“#slots-per-TM^2 * #TMs * 4”
• Instead introspect with statsd reporter
nc -l -u -k 0.0.0.0 <port> | grep “AvailableMemorySegments”
AvailableMemorySegments, TotalMemorySegments in Flink 1.2
• May show up in the web UI
![Page 43: Greg Hogan – To Petascale and Beyond- Apache Flink in the Clouds](https://reader031.vdocuments.mx/reader031/viewer/2022030317/587138441a28abf0568b6305/html5/thumbnails/43.jpg)
Python statsd server#!/usr/bin/env python
import socket import sys import time
UDP_IP = '' UDP_PORT = int(sys.argv[1])
sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM) sock.bind((UDP_IP, UDP_PORT))
while True: data, addr = sock.recvfrom(4096) print('{:.6f} {}'.format(time.time(), data))
![Page 44: Greg Hogan – To Petascale and Beyond- Apache Flink in the Clouds](https://reader031.vdocuments.mx/reader031/viewer/2022030317/587138441a28abf0568b6305/html5/thumbnails/44.jpg)
“Little” Parallelism• Number of buffers per TM increases linearly with size of cluster so
can run out of buffers or consume too much memory
18 * (36 * 64) * 32 KiB = 1.36 GB for a single shuffle
• Typically have small tasks early and late and large tasks in middle
• Flink runs slowly when parallelism mismatched to size of data
• Must be paired with a scheduler which evenly allocates slots among TaskManagers
![Page 45: Greg Hogan – To Petascale and Beyond- Apache Flink in the Clouds](https://reader031.vdocuments.mx/reader031/viewer/2022030317/587138441a28abf0568b6305/html5/thumbnails/45.jpg)
AgendaGraphs with Gelly
Native algorithms
Configuring at scale
Sort benchmark
Profiling and benchmarking
In development
![Page 46: Greg Hogan – To Petascale and Beyond- Apache Flink in the Clouds](https://reader031.vdocuments.mx/reader031/viewer/2022030317/587138441a28abf0568b6305/html5/thumbnails/46.jpg)
Cloudsort• sortbenchmark.org: lowest cost to sort 100 TB in a public cloud
an “I/O” benchmark, no compression permitted
• Input and output on persistent storage
replicated storage outside the cluster
• No discounts!
• Amazon has better compute, Google has better networking
![Page 47: Greg Hogan – To Petascale and Beyond- Apache Flink in the Clouds](https://reader031.vdocuments.mx/reader031/viewer/2022030317/587138441a28abf0568b6305/html5/thumbnails/47.jpg)
SchnellSort• Everything at https://github.com/greghogan/flink-cloudsort
• Custom IndyRecord type
• 1 x c4.4xlarge master, 128 x c4.4xlarge workers
• 3 x 255 GiB EBS for tmp directories
• 1 GB input files in S3
• 256 MB output files in S3
![Page 48: Greg Hogan – To Petascale and Beyond- Apache Flink in the Clouds](https://reader031.vdocuments.mx/reader031/viewer/2022030317/587138441a28abf0568b6305/html5/thumbnails/48.jpg)
Sort Cost
![Page 49: Greg Hogan – To Petascale and Beyond- Apache Flink in the Clouds](https://reader031.vdocuments.mx/reader031/viewer/2022030317/587138441a28abf0568b6305/html5/thumbnails/49.jpg)
Java SSL Performance• Cannot choose SSLEngine for HDFS stack
solved with intrinsics in jdk8-b112 and Skylake’s Intel SHA Extensions
• Galois/Counter Mode (GCM) can be disabled
https://docs.oracle.com/javase/8/docs/technotes/guides/security/PolicyFiles.html
• SHA-1 or MD5 are integral to SSL and cannot be disabled
compare with `openssl speed md5 sha`
used AWS CLI
![Page 50: Greg Hogan – To Petascale and Beyond- Apache Flink in the Clouds](https://reader031.vdocuments.mx/reader031/viewer/2022030317/587138441a28abf0568b6305/html5/thumbnails/50.jpg)
Netty Configuration • Linux TCP/IP default buffers
• Netty configuration in flink-conf.yaml
$ cat /proc/sys/net/ipv4/tcp_[rw]mem 4096 87380 3813280 4096 20480 3813280
taskmanager.net.num-arenas: 16 taskmanager.net.server.numThreads: 16 taskmanager.net.client.numThreads: 16 taskmanager.net.sendReceiveBufferSize: 67108864
![Page 51: Greg Hogan – To Petascale and Beyond- Apache Flink in the Clouds](https://reader031.vdocuments.mx/reader031/viewer/2022030317/587138441a28abf0568b6305/html5/thumbnails/51.jpg)
Slow uploads• “Counting Triangles and the Curse of the Last Reducer”
kill and retry slow downloads and uploads
write files to /dev/shm and upload using “thread-pool”
• Configurable timeout
120 seconds for input
60 seconds for output
![Page 52: Greg Hogan – To Petascale and Beyond- Apache Flink in the Clouds](https://reader031.vdocuments.mx/reader031/viewer/2022030317/587138441a28abf0568b6305/html5/thumbnails/52.jpg)
AgendaGraphs with Gelly
Native algorithms
Configuring at scale
Sort benchmark
Profiling and benchmarking
In development
![Page 53: Greg Hogan – To Petascale and Beyond- Apache Flink in the Clouds](https://reader031.vdocuments.mx/reader031/viewer/2022030317/587138441a28abf0568b6305/html5/thumbnails/53.jpg)
Java Flight Recorder• Must edit flink-daemon.sh in order to rotate log file
• Can concatenate files from multiple TaskManagers
• Open jfr output files in JDK-provided jmc
jfr=“${FLINK_LOG_DIR}/flink-${FLINK_IDENT_STRING}-${DAEMON}-${id}-${HOSTNAME}.jfr" JFR_ARGS=“-XX:+UnlockCommercialFeatures -XX:+FlightRecorder \ -XX:+UnlockDiagnosticVMOptions -XX:+DebugNonSafepoints \ -XX:FlightRecorderOptions=defaultrecording=true,maxage=0,disk=true,dumponexit=true,dumponexitpath=${jfr}”
… rotateLogFile $jfr
… $JAVA_RUN $JFR_ARGS …
java oracle.jrockit.jfr.tools.ConCatRepository . -o <filename>.jfr
![Page 54: Greg Hogan – To Petascale and Beyond- Apache Flink in the Clouds](https://reader031.vdocuments.mx/reader031/viewer/2022030317/587138441a28abf0568b6305/html5/thumbnails/54.jpg)
![Page 55: Greg Hogan – To Petascale and Beyond- Apache Flink in the Clouds](https://reader031.vdocuments.mx/reader031/viewer/2022030317/587138441a28abf0568b6305/html5/thumbnails/55.jpg)
Macro Benchmarks• Performance test new features and stress test Flink releases
https://github.com/greghogan/flink-benchmark
multiple time samples on increasing scale; time-weighted
• Profiling drives benchmarking: chasing the “1%”
• What metrics can be collected and analyzed?
{“algorithm”:”AdamicAdar","idType":"STRING","scale":12,"runtime_ms":634,"accumulators":{...} {“algorithm":"TriangleListingUndirected","idType":"INT","scale":16,"runtime_ms":813,"accumulators":{...} {“algorithm":"AdamicAdar","idType":"LONG","scale":12,"runtime_ms":410,"accumulators":{...} {“algorithm":"HITS","idType":"STRING","scale":16,"runtime_ms":1693,"accumulators":{...} {“algorithm”:”JaccardIndex”,”idType":"LONG","scale":12,"runtime_ms":431,"accumulators":{...}
![Page 56: Greg Hogan – To Petascale and Beyond- Apache Flink in the Clouds](https://reader031.vdocuments.mx/reader031/viewer/2022030317/587138441a28abf0568b6305/html5/thumbnails/56.jpg)
AgendaGraphs with Gelly
Native algorithms
Configuring at scale
Sort benchmark
Profiling and benchmarking
In development
![Page 57: Greg Hogan – To Petascale and Beyond- Apache Flink in the Clouds](https://reader031.vdocuments.mx/reader031/viewer/2022030317/587138441a28abf0568b6305/html5/thumbnails/57.jpg)
Custom Types• FLINK-3042: Define a way to let types create their own TypeInformation
• @TypeInfo
-> TypeInfoFactory
-> TypeInformation
-> serializer/comparator and more
• Efficient serialization of arrays in FLINK-3695: ValueArray types
![Page 58: Greg Hogan – To Petascale and Beyond- Apache Flink in the Clouds](https://reader031.vdocuments.mx/reader031/viewer/2022030317/587138441a28abf0568b6305/html5/thumbnails/58.jpg)
ValueArray<T>@TypeInfo(ValueArrayTypeInfoFactory.class) public interface ValueArray<T> …
public class ValueArrayTypeInfoFactory<T> extends TypeInfoFactory<ValueArray<T>> { @Override public TypeInformation<ValueArray<T>> createTypeInfo(…) { return new ValueArrayTypeInfo(genericParameters.get("T")); } }
![Page 59: Greg Hogan – To Petascale and Beyond- Apache Flink in the Clouds](https://reader031.vdocuments.mx/reader031/viewer/2022030317/587138441a28abf0568b6305/html5/thumbnails/59.jpg)
ValueArrayTypeInfo<T>public class ValueArrayTypeInfo<T extends ValueArray> extends TypeInformation<T> implements AtomicType<T> { private final Class<T> typeClass; @Override @SuppressWarnings("unchecked") public TypeSerializer<T> createSerializer(ExecutionConfig executionConfig) { if (IntValue.class.isAssignableFrom(typeClass)) { return (TypeSerializer<T>) new CopyableValueSerializer(IntValueArray.class); } else… @SuppressWarnings({ "unchecked", "rawtypes" }) @Override public TypeComparator<T> createComparator(boolean sortOrderAscending, ExecutionConfig executionConfig) { if (IntValue.class.isAssignableFrom(typeClass)) { return (TypeComparator<T>) new CopyableValueComparator(sortOrderAscending, IntValueArray.class); } else …
![Page 60: Greg Hogan – To Petascale and Beyond- Apache Flink in the Clouds](https://reader031.vdocuments.mx/reader031/viewer/2022030317/587138441a28abf0568b6305/html5/thumbnails/60.jpg)
ArrayableValue• LongValueArray, StringValueArray, …
• Initial use cases:
maximum value or top-K for pairwise similarity scores
hybrid adjacency-list model:u v
u w
u x
y z
u v, w, x
y z
u v, w
u x
y z
Edge List Adjacency List Hybrid
![Page 61: Greg Hogan – To Petascale and Beyond- Apache Flink in the Clouds](https://reader031.vdocuments.mx/reader031/viewer/2022030317/587138441a28abf0568b6305/html5/thumbnails/61.jpg)
Iteration Intermediate Output• For many iterative algorithms the output is ∅
algorithm output is the output of intermediate operator
breadth-first search, K-Core decomposition, K-Truss, …
• Delta iteration solution set is blocking, doesn’t scale
Iteration Head
Iteration Tail
Discarding Output Format
Operator
![Page 62: Greg Hogan – To Petascale and Beyond- Apache Flink in the Clouds](https://reader031.vdocuments.mx/reader031/viewer/2022030317/587138441a28abf0568b6305/html5/thumbnails/62.jpg)
Coding to Interfaces• Vertex/Edge as interface or abstract class
subclasses define value fields
eliminate “value packing”
• Requires code generation; must track fields from input to output
• Make algorithms truly modular
• Optimizer can prune unused fields
![Page 63: Greg Hogan – To Petascale and Beyond- Apache Flink in the Clouds](https://reader031.vdocuments.mx/reader031/viewer/2022030317/587138441a28abf0568b6305/html5/thumbnails/63.jpg)
Why batch? Why Gelly?
• A framework for building great algorithms
• Scaling to 100s of algorithms
• As an out-of-the-box tool for graph analysis
![Page 64: Greg Hogan – To Petascale and Beyond- Apache Flink in the Clouds](https://reader031.vdocuments.mx/reader031/viewer/2022030317/587138441a28abf0568b6305/html5/thumbnails/64.jpg)
?
![Page 65: Greg Hogan – To Petascale and Beyond- Apache Flink in the Clouds](https://reader031.vdocuments.mx/reader031/viewer/2022030317/587138441a28abf0568b6305/html5/thumbnails/65.jpg)
Images• Squirrel: https://flink.apache.org/img/logo/png/1000/flink_squirrel_1000.png
• Clouds: http://www.publicdomainpictures.net/pictures/120000/velka/large-cumulus-clouds-1.jpg
• Tree: https://motivatedbynature.com/wp-content/uploads/2016/04/tree_PNG2517.png
• RMat: http://www.cs.cmu.edu/~christos/PUBLICATIONS/siam04.pdf
• Triads: http://www.educa.fmf.uni-lj.si/datana/pub/networks/pajek/triade.pdf
![Page 66: Greg Hogan – To Petascale and Beyond- Apache Flink in the Clouds](https://reader031.vdocuments.mx/reader031/viewer/2022030317/587138441a28abf0568b6305/html5/thumbnails/66.jpg)
Agenda Images• Fernsehturm Berlin: http://readysettrek.com/wp-content/uploads/2015/09/berlin-
attractions.jpg
• Reichstag: https://upload.wikimedia.org/wikipedia/commons/c/c7/Reichstag_building_Berlin_view_from_west_before_sunset.jpg
• Brandenburg Gate: https://images2.alphacoders.com/479/479507.jpg
• Schloss Charlottenburg : https://upload.wikimedia.org/wikipedia/commons/e/ec/Le_château_de_Charlottenburg_(Berlin)_(6340508573).jpg
• Memorial: https://s-media-cache-ak0.pinimg.com/736x/45/fa/85/45fa85018367cd88d4ffc0f62416c804.jpg
• Olympic Stadium: http://sumfinity.com/wp-content/uploads/2014/03/Olympic-Stadium-Berlin.jpg