data-intensive information processing applications ― session #5
TRANSCRIPT
![Page 1: Data-Intensive Information Processing Applications ― Session #5](https://reader034.vdocuments.mx/reader034/viewer/2022052319/577d38df1a28ab3a6b98a964/html5/thumbnails/1.jpg)
8/14/2019 Data-Intensive Information Processing Applications ― Session #5
http://slidepdf.com/reader/full/data-intensive-information-processing-applications-session-5 1/48
![Page 2: Data-Intensive Information Processing Applications ― Session #5](https://reader034.vdocuments.mx/reader034/viewer/2022052319/577d38df1a28ab3a6b98a964/html5/thumbnails/2.jpg)
8/14/2019 Data-Intensive Information Processing Applications ― Session #5
http://slidepdf.com/reader/full/data-intensive-information-processing-applications-session-5 2/48
![Page 3: Data-Intensive Information Processing Applications ― Session #5](https://reader034.vdocuments.mx/reader034/viewer/2022052319/577d38df1a28ab3a6b98a964/html5/thumbnails/3.jpg)
8/14/2019 Data-Intensive Information Processing Applications ― Session #5
http://slidepdf.com/reader/full/data-intensive-information-processing-applications-session-5 3/48
Today’s Agenda
Graph problems and representations
Parallel breadth-first search
PageRank
![Page 4: Data-Intensive Information Processing Applications ― Session #5](https://reader034.vdocuments.mx/reader034/viewer/2022052319/577d38df1a28ab3a6b98a964/html5/thumbnails/4.jpg)
8/14/2019 Data-Intensive Information Processing Applications ― Session #5
http://slidepdf.com/reader/full/data-intensive-information-processing-applications-session-5 4/48
What ’s a graph?
G = (V,E), where
V represents the set of vertices (nodes)
E represents the set of edges (links)
Both vertices and edges may contain additional information
Directed vs. undirected edges
Presence or absence of cycles
Graphs are everywhere:
Hyperlink structure of the Web
Physical structure of computers on the Internet Interstate highway system
Social networks
![Page 5: Data-Intensive Information Processing Applications ― Session #5](https://reader034.vdocuments.mx/reader034/viewer/2022052319/577d38df1a28ab3a6b98a964/html5/thumbnails/5.jpg)
8/14/2019 Data-Intensive Information Processing Applications ― Session #5
http://slidepdf.com/reader/full/data-intensive-information-processing-applications-session-5 5/48
Source: Wikipedia (Königsberg)
![Page 6: Data-Intensive Information Processing Applications ― Session #5](https://reader034.vdocuments.mx/reader034/viewer/2022052319/577d38df1a28ab3a6b98a964/html5/thumbnails/6.jpg)
8/14/2019 Data-Intensive Information Processing Applications ― Session #5
http://slidepdf.com/reader/full/data-intensive-information-processing-applications-session-5 6/48
Som e Graph Problem s
Finding shortest paths
Routing Internet traffic and UPS trucks
Finding minimum spanning trees Telco laying down fiber
Finding Max Flow
Airline scheduling
Identify “special” nodes and communities
Breaking up terrorist cells, spread of avian flu
par e ma c ng Monster.com, Match.com
...
![Page 7: Data-Intensive Information Processing Applications ― Session #5](https://reader034.vdocuments.mx/reader034/viewer/2022052319/577d38df1a28ab3a6b98a964/html5/thumbnails/7.jpg)
8/14/2019 Data-Intensive Information Processing Applications ― Session #5
http://slidepdf.com/reader/full/data-intensive-information-processing-applications-session-5 7/48
Graphs and MapReduc e
Graph algorithms typically involve:
Performing computations at each node: based on node features,edge features, and local link structure
Propagating computations: “traversing” the graph
How do you represent graph data in MapReduce?
How do you traverse a graph in MapReduce?
![Page 8: Data-Intensive Information Processing Applications ― Session #5](https://reader034.vdocuments.mx/reader034/viewer/2022052319/577d38df1a28ab3a6b98a964/html5/thumbnails/8.jpg)
8/14/2019 Data-Intensive Information Processing Applications ― Session #5
http://slidepdf.com/reader/full/data-intensive-information-processing-applications-session-5 8/48
Represent ing Graphs
G = (V, E)
Two common re resentations
Adjacency matrix Adjacency list
![Page 9: Data-Intensive Information Processing Applications ― Session #5](https://reader034.vdocuments.mx/reader034/viewer/2022052319/577d38df1a28ab3a6b98a964/html5/thumbnails/9.jpg)
8/14/2019 Data-Intensive Information Processing Applications ― Session #5
http://slidepdf.com/reader/full/data-intensive-information-processing-applications-session-5 9/48
Adjacenc y Mat r i c es
Represent a graph as an n x n square matrix M
n = |V|
M ij
= 1 means a link from node i to j
1 2 3 4
1 0 1 0 1 1
2 1 0 1 13
4 1 0 1 0 4
![Page 10: Data-Intensive Information Processing Applications ― Session #5](https://reader034.vdocuments.mx/reader034/viewer/2022052319/577d38df1a28ab3a6b98a964/html5/thumbnails/10.jpg)
8/14/2019 Data-Intensive Information Processing Applications ― Session #5
http://slidepdf.com/reader/full/data-intensive-information-processing-applications-session-5 10/48
Adjac enc y Mat r ic es : Cr i t ique
Advantages:
Amenable to mathematical manipulation
Iteration over rows and columns corresponds to computations onoutlinks and inlinks
Lots of zeros for sparse matrices
Lots of wasted space
![Page 11: Data-Intensive Information Processing Applications ― Session #5](https://reader034.vdocuments.mx/reader034/viewer/2022052319/577d38df1a28ab3a6b98a964/html5/thumbnails/11.jpg)
8/14/2019 Data-Intensive Information Processing Applications ― Session #5
http://slidepdf.com/reader/full/data-intensive-information-processing-applications-session-5 11/48
Adjacenc y List s
Take adjacency matrices… and throw away all the zeros
1: 2 4
1 2 3 4
1 0 1 0 12: 1, 3, 43: 1
2 1 0 1 1
4: 1, 34 1 0 1 0
![Page 12: Data-Intensive Information Processing Applications ― Session #5](https://reader034.vdocuments.mx/reader034/viewer/2022052319/577d38df1a28ab3a6b98a964/html5/thumbnails/12.jpg)
8/14/2019 Data-Intensive Information Processing Applications ― Session #5
http://slidepdf.com/reader/full/data-intensive-information-processing-applications-session-5 12/48
Adjac enc y L is t s : Cr i t ique
Advantages:
Much more compact representation
Easy to compute over outlinks
Disadvantages:
Much more difficult to compute over inlinks
![Page 13: Data-Intensive Information Processing Applications ― Session #5](https://reader034.vdocuments.mx/reader034/viewer/2022052319/577d38df1a28ab3a6b98a964/html5/thumbnails/13.jpg)
8/14/2019 Data-Intensive Information Processing Applications ― Session #5
http://slidepdf.com/reader/full/data-intensive-information-processing-applications-session-5 13/48
Single Sourc e Shor t est Pat h
Problem: find shortest path from a source node to one ormore target nodes
Shortest might also mean lowest weight or cost
First, a refresher: Dijkstra’s Algorithm
![Page 14: Data-Intensive Information Processing Applications ― Session #5](https://reader034.vdocuments.mx/reader034/viewer/2022052319/577d38df1a28ab3a6b98a964/html5/thumbnails/14.jpg)
8/14/2019 Data-Intensive Information Processing Applications ― Session #5
http://slidepdf.com/reader/full/data-intensive-information-processing-applications-session-5 14/48
Di jk s t ra ’s A lgor i t hm Ex am ple
∞ ∞
10
1
0 2 3 9 4 6
5 7
∞ ∞2
Example from CLR
![Page 15: Data-Intensive Information Processing Applications ― Session #5](https://reader034.vdocuments.mx/reader034/viewer/2022052319/577d38df1a28ab3a6b98a964/html5/thumbnails/15.jpg)
8/14/2019 Data-Intensive Information Processing Applications ― Session #5
http://slidepdf.com/reader/full/data-intensive-information-processing-applications-session-5 15/48
Di jk s t ra ’s A lgor i t hm Ex am ple
10∞
10
1
0 2 3 9 4 6
5 7
5 ∞2
Example from CLR
![Page 16: Data-Intensive Information Processing Applications ― Session #5](https://reader034.vdocuments.mx/reader034/viewer/2022052319/577d38df1a28ab3a6b98a964/html5/thumbnails/16.jpg)
8/14/2019 Data-Intensive Information Processing Applications ― Session #5
http://slidepdf.com/reader/full/data-intensive-information-processing-applications-session-5 16/48
Di jk s t ra ’s A lgor i t hm Ex am ple
8 14
10
1
0 2 39
4 6
5 7
5 7
2
Example from CLR
![Page 17: Data-Intensive Information Processing Applications ― Session #5](https://reader034.vdocuments.mx/reader034/viewer/2022052319/577d38df1a28ab3a6b98a964/html5/thumbnails/17.jpg)
8/14/2019 Data-Intensive Information Processing Applications ― Session #5
http://slidepdf.com/reader/full/data-intensive-information-processing-applications-session-5 17/48
![Page 18: Data-Intensive Information Processing Applications ― Session #5](https://reader034.vdocuments.mx/reader034/viewer/2022052319/577d38df1a28ab3a6b98a964/html5/thumbnails/18.jpg)
8/14/2019 Data-Intensive Information Processing Applications ― Session #5
http://slidepdf.com/reader/full/data-intensive-information-processing-applications-session-5 18/48
Di jk s t ra ’s A lgor i t hm Ex am ple
8 9
1
10
1
0 2 39
4 6
5 7
5 7
2
Example from CLR
![Page 19: Data-Intensive Information Processing Applications ― Session #5](https://reader034.vdocuments.mx/reader034/viewer/2022052319/577d38df1a28ab3a6b98a964/html5/thumbnails/19.jpg)
8/14/2019 Data-Intensive Information Processing Applications ― Session #5
http://slidepdf.com/reader/full/data-intensive-information-processing-applications-session-5 19/48
Di jk s t ra ’s A lgor i t hm Ex am ple
8 9
10
1
0 2 39
4 6
5 7
5 7
2
Example from CLR
![Page 20: Data-Intensive Information Processing Applications ― Session #5](https://reader034.vdocuments.mx/reader034/viewer/2022052319/577d38df1a28ab3a6b98a964/html5/thumbnails/20.jpg)
8/14/2019 Data-Intensive Information Processing Applications ― Session #5
http://slidepdf.com/reader/full/data-intensive-information-processing-applications-session-5 20/48
Single Sourc e Shor t est Pat h
Problem: find shortest path from a source node to one ormore target nodes
Shortest might also mean lowest weight or cost
Single processor machine: Dijkstra’s Algorithm
MapReduce: parallel Breadth-First Search (BFS)
![Page 21: Data-Intensive Information Processing Applications ― Session #5](https://reader034.vdocuments.mx/reader034/viewer/2022052319/577d38df1a28ab3a6b98a964/html5/thumbnails/21.jpg)
8/14/2019 Data-Intensive Information Processing Applications ― Session #5
http://slidepdf.com/reader/full/data-intensive-information-processing-applications-session-5 21/48
Find ing t he Shor t es t Pat h
Consider simple case of equal edge weights
Solution to the roblem can be defined inductivel
Here’s the intuition:
Define: b is reachable from a if b is on ad acenc list of a
DISTANCETO(s ) = 0
For all nodes p reachable from s ,
For all nodes n reachable from some other set of nodes M ,DISTANCETO(n ) = 1 + min(DISTANCETO(m ), m ∈ M )
s m 2
m 1
n
…
…
1
d 2
m 3
… d 3
![Page 22: Data-Intensive Information Processing Applications ― Session #5](https://reader034.vdocuments.mx/reader034/viewer/2022052319/577d38df1a28ab3a6b98a964/html5/thumbnails/22.jpg)
8/14/2019 Data-Intensive Information Processing Applications ― Session #5
http://slidepdf.com/reader/full/data-intensive-information-processing-applications-session-5 22/48
![Page 23: Data-Intensive Information Processing Applications ― Session #5](https://reader034.vdocuments.mx/reader034/viewer/2022052319/577d38df1a28ab3a6b98a964/html5/thumbnails/23.jpg)
8/14/2019 Data-Intensive Information Processing Applications ― Session #5
http://slidepdf.com/reader/full/data-intensive-information-processing-applications-session-5 23/48
![Page 24: Data-Intensive Information Processing Applications ― Session #5](https://reader034.vdocuments.mx/reader034/viewer/2022052319/577d38df1a28ab3a6b98a964/html5/thumbnails/24.jpg)
8/14/2019 Data-Intensive Information Processing Applications ― Session #5
http://slidepdf.com/reader/full/data-intensive-information-processing-applications-session-5 24/48
From In t u i t ion t o Algori t hm
Data representation:
Key: node n
Value: d (distance from start), adjacency list (list of nodes
reachable from n )
Initialization: for all nodes except for start node, = ∞
Mapper:
∀m ∈ adjacency list: emit (m , d + 1)
Sort/Shuffle
Groups distances by reachable nodes
Reducer: Selects minimum distance path for each reachable node
![Page 25: Data-Intensive Information Processing Applications ― Session #5](https://reader034.vdocuments.mx/reader034/viewer/2022052319/577d38df1a28ab3a6b98a964/html5/thumbnails/25.jpg)
8/14/2019 Data-Intensive Information Processing Applications ― Session #5
http://slidepdf.com/reader/full/data-intensive-information-processing-applications-session-5 25/48
![Page 26: Data-Intensive Information Processing Applications ― Session #5](https://reader034.vdocuments.mx/reader034/viewer/2022052319/577d38df1a28ab3a6b98a964/html5/thumbnails/26.jpg)
8/14/2019 Data-Intensive Information Processing Applications ― Session #5
http://slidepdf.com/reader/full/data-intensive-information-processing-applications-session-5 26/48
BFS Pseudo-Code
![Page 27: Data-Intensive Information Processing Applications ― Session #5](https://reader034.vdocuments.mx/reader034/viewer/2022052319/577d38df1a28ab3a6b98a964/html5/thumbnails/27.jpg)
8/14/2019 Data-Intensive Information Processing Applications ― Session #5
http://slidepdf.com/reader/full/data-intensive-information-processing-applications-session-5 27/48
St opping Cr i t er ion
How many iterations are needed in parallel BFS (equaledge weight case)?
Convince yourself: when a node is first “discovered”,we’ve found the shortest path
Now answer the question...
Six degrees of separation? Practicalities of implementation in MapReduce
![Page 28: Data-Intensive Information Processing Applications ― Session #5](https://reader034.vdocuments.mx/reader034/viewer/2022052319/577d38df1a28ab3a6b98a964/html5/thumbnails/28.jpg)
8/14/2019 Data-Intensive Information Processing Applications ― Session #5
http://slidepdf.com/reader/full/data-intensive-information-processing-applications-session-5 28/48
Com par ison t o Di jk s t ra
Dijkstra’s algorithm is more efficient
At any step it only pursues edges from the minimum-cost pathinside the frontier
MapReduce explores all paths in parallel
“ o s o was e
Useful work is only done at the “frontier”
Wh can’t we do better usin Ma Reduce?
![Page 29: Data-Intensive Information Processing Applications ― Session #5](https://reader034.vdocuments.mx/reader034/viewer/2022052319/577d38df1a28ab3a6b98a964/html5/thumbnails/29.jpg)
8/14/2019 Data-Intensive Information Processing Applications ― Session #5
http://slidepdf.com/reader/full/data-intensive-information-processing-applications-session-5 29/48
Weigh t ed Edges
Now add positive weights to the edges
Why can’t edge weights be negative?
Simple change: adjacency list now includes a weight w foreach edge
In mapper, emit (m , d + w p ) instead of (m , + 1) for each node m
That’s it?
![Page 30: Data-Intensive Information Processing Applications ― Session #5](https://reader034.vdocuments.mx/reader034/viewer/2022052319/577d38df1a28ab3a6b98a964/html5/thumbnails/30.jpg)
8/14/2019 Data-Intensive Information Processing Applications ― Session #5
http://slidepdf.com/reader/full/data-intensive-information-processing-applications-session-5 30/48
St opping Cr i t er ion
How many iterations are needed in parallel BFS (positiveedge weight case)?
Convince yourself: when a node is first “discovered”,we’ve found the shortest path
![Page 31: Data-Intensive Information Processing Applications ― Session #5](https://reader034.vdocuments.mx/reader034/viewer/2022052319/577d38df1a28ab3a6b98a964/html5/thumbnails/31.jpg)
8/14/2019 Data-Intensive Information Processing Applications ― Session #5
http://slidepdf.com/reader/full/data-intensive-information-processing-applications-session-5 31/48
Addi t ional Com plex i t ies
search frontier 1
1
r 10
n 1
n 5
n 6 n 7 n 8
n 9 1
s
p q
n 2 n 3
n 4
1
11
1
![Page 32: Data-Intensive Information Processing Applications ― Session #5](https://reader034.vdocuments.mx/reader034/viewer/2022052319/577d38df1a28ab3a6b98a964/html5/thumbnails/32.jpg)
8/14/2019 Data-Intensive Information Processing Applications ― Session #5
http://slidepdf.com/reader/full/data-intensive-information-processing-applications-session-5 32/48
St opping Cr i t er ion
How many iterations are needed in parallel BFS (positiveedge weight case)?
Practicalities of implementation in MapReduce
![Page 33: Data-Intensive Information Processing Applications ― Session #5](https://reader034.vdocuments.mx/reader034/viewer/2022052319/577d38df1a28ab3a6b98a964/html5/thumbnails/33.jpg)
8/14/2019 Data-Intensive Information Processing Applications ― Session #5
http://slidepdf.com/reader/full/data-intensive-information-processing-applications-session-5 33/48
Graphs and MapReduc e
Graph algorithms typically involve:
Performing computations at each node: based on node features,edge features, and local link structure
Propagating computations: “traversing” the graph
Represent graphs as adjacency lists
Perform local computations in mapper Pass along partial results via outlinks, keyed by destination node
Perform aggregation in reducer on inlinks to a node
Iterate until conver ence: controlled b external “driver”
Don’t forget to pass the graph structure between iterations
![Page 34: Data-Intensive Information Processing Applications ― Session #5](https://reader034.vdocuments.mx/reader034/viewer/2022052319/577d38df1a28ab3a6b98a964/html5/thumbnails/34.jpg)
8/14/2019 Data-Intensive Information Processing Applications ― Session #5
http://slidepdf.com/reader/full/data-intensive-information-processing-applications-session-5 34/48
![Page 35: Data-Intensive Information Processing Applications ― Session #5](https://reader034.vdocuments.mx/reader034/viewer/2022052319/577d38df1a28ab3a6b98a964/html5/thumbnails/35.jpg)
8/14/2019 Data-Intensive Information Processing Applications ― Session #5
http://slidepdf.com/reader/full/data-intensive-information-processing-applications-session-5 35/48
PageRank : Def ined
Given page x with inlinks t 1…t n , where
C(t) is the out-degree of t
α is probability of random jump
N is the total number of nodes in the graph
∑=
−+⎟ ⎠
⎞⎜⎝
⎛ =
n
i i
i
t C
t PR
N xPR
1 )(
)()1(
1)( α α
X
t 1
t 2
…n
![Page 36: Data-Intensive Information Processing Applications ― Session #5](https://reader034.vdocuments.mx/reader034/viewer/2022052319/577d38df1a28ab3a6b98a964/html5/thumbnails/36.jpg)
8/14/2019 Data-Intensive Information Processing Applications ― Session #5
http://slidepdf.com/reader/full/data-intensive-information-processing-applications-session-5 36/48
Com put ing PageRank
Properties of PageRank
Can be computed iteratively
Effects at each iteration are local
Sketch of algorithm:
Start with seed PR i values
Each page distributes PR i “credit” to all pages it links to
Each target page adds up “credit” from multiple in-bound links tocompute PR i+1
Iterate until values converge
![Page 37: Data-Intensive Information Processing Applications ― Session #5](https://reader034.vdocuments.mx/reader034/viewer/2022052319/577d38df1a28ab3a6b98a964/html5/thumbnails/37.jpg)
8/14/2019 Data-Intensive Information Processing Applications ― Session #5
http://slidepdf.com/reader/full/data-intensive-information-processing-applications-session-5 37/48
![Page 38: Data-Intensive Information Processing Applications ― Session #5](https://reader034.vdocuments.mx/reader034/viewer/2022052319/577d38df1a28ab3a6b98a964/html5/thumbnails/38.jpg)
8/14/2019 Data-Intensive Information Processing Applications ― Session #5
http://slidepdf.com/reader/full/data-intensive-information-processing-applications-session-5 38/48
![Page 39: Data-Intensive Information Processing Applications ― Session #5](https://reader034.vdocuments.mx/reader034/viewer/2022052319/577d38df1a28ab3a6b98a964/html5/thumbnails/39.jpg)
8/14/2019 Data-Intensive Information Processing Applications ― Session #5
http://slidepdf.com/reader/full/data-intensive-information-processing-applications-session-5 39/48
Sam ple PageRank I t e rat ion (2 )
n 2 (0.166) n 2 (0.133)Iteration 2
n 1 (0.066).
0.033
0.083.
0.1 0.10.1
n 1 (0.1)
n 4 (0.3)
n 3 (0.166)n 5 (0.3)
0.3 0.166
n 4 (0.2)
n 3 (0.183)n 5 (0.383)
![Page 40: Data-Intensive Information Processing Applications ― Session #5](https://reader034.vdocuments.mx/reader034/viewer/2022052319/577d38df1a28ab3a6b98a964/html5/thumbnails/40.jpg)
8/14/2019 Data-Intensive Information Processing Applications ― Session #5
http://slidepdf.com/reader/full/data-intensive-information-processing-applications-session-5 40/48
PageRank in MapReduc e
n 5 [n 1, n 2 , n 3 ]n 1 [n 2 , n 4 ] n 2 [n 3 , n 5 ] n 3 [n 4 ] n 4 [n 5 ]
n 2 n 4 n 3 n 5 n 1 n 2 n 3 n 4 n 5
n 2 n 4 n 3 n 5 n 1 n 2 n 3 n 4 n 5
Reduce
n 5 [n 1, n 2 , n 3 ]n 1 [n 2 , n 4 ] n 2 [n 3 , n 5 ] n 3 [n 4 ] n 4 [n 5 ]
![Page 41: Data-Intensive Information Processing Applications ― Session #5](https://reader034.vdocuments.mx/reader034/viewer/2022052319/577d38df1a28ab3a6b98a964/html5/thumbnails/41.jpg)
8/14/2019 Data-Intensive Information Processing Applications ― Session #5
http://slidepdf.com/reader/full/data-intensive-information-processing-applications-session-5 41/48
PageRank Pseudo-Code
![Page 42: Data-Intensive Information Processing Applications ― Session #5](https://reader034.vdocuments.mx/reader034/viewer/2022052319/577d38df1a28ab3a6b98a964/html5/thumbnails/42.jpg)
8/14/2019 Data-Intensive Information Processing Applications ― Session #5
http://slidepdf.com/reader/full/data-intensive-information-processing-applications-session-5 42/48
Com plet e PageRank
Two additional complexities
What is the proper treatment of dangling nodes?
How do we factor in the random jump factor?
Solution:
Second pass to redistribute “missing PageRank mass” andaccount for random jumps
⎞⎛ ⎞⎛
is Pa eRank value from before ' is u dated Pa eRank value
⎟ ⎠
⎜⎝
+−+⎟ ⎠
⎜⎝
= pGG
p )1(' α α
|G| is the number of nodes in the graph m is the missing PageRank mass
![Page 43: Data-Intensive Information Processing Applications ― Session #5](https://reader034.vdocuments.mx/reader034/viewer/2022052319/577d38df1a28ab3a6b98a964/html5/thumbnails/43.jpg)
8/14/2019 Data-Intensive Information Processing Applications ― Session #5
http://slidepdf.com/reader/full/data-intensive-information-processing-applications-session-5 43/48
PageRank Convergenc e
Alternative convergence criteria
Iterate until PageRank values don’t change
Iterate until PageRank rankings don’t change
Fixed number of iterations
![Page 44: Data-Intensive Information Processing Applications ― Session #5](https://reader034.vdocuments.mx/reader034/viewer/2022052319/577d38df1a28ab3a6b98a964/html5/thumbnails/44.jpg)
8/14/2019 Data-Intensive Information Processing Applications ― Session #5
http://slidepdf.com/reader/full/data-intensive-information-processing-applications-session-5 44/48
Beyond PageRank
Link structure is important for web search
PageRank is one of many link-based features: HITS, SALSA, etc.
One of many thousands of features used in ranking…
Adversarial nature of web search
Link spamming
Spider traps
Keyword stuffing …
![Page 45: Data-Intensive Information Processing Applications ― Session #5](https://reader034.vdocuments.mx/reader034/viewer/2022052319/577d38df1a28ab3a6b98a964/html5/thumbnails/45.jpg)
8/14/2019 Data-Intensive Information Processing Applications ― Session #5
http://slidepdf.com/reader/full/data-intensive-information-processing-applications-session-5 45/48
Ef f ic ient Graph Algor i t hm s
Sparse vs. dense graphs
Gra h to olo ies
![Page 46: Data-Intensive Information Processing Applications ― Session #5](https://reader034.vdocuments.mx/reader034/viewer/2022052319/577d38df1a28ab3a6b98a964/html5/thumbnails/46.jpg)
8/14/2019 Data-Intensive Information Processing Applications ― Session #5
http://slidepdf.com/reader/full/data-intensive-information-processing-applications-session-5 46/48
Figure from: Newman, M. E. J. (2005) “Power laws, Pareto
distributions and Zipf's law.” Contemporary Physics 46:323–351.
![Page 47: Data-Intensive Information Processing Applications ― Session #5](https://reader034.vdocuments.mx/reader034/viewer/2022052319/577d38df1a28ab3a6b98a964/html5/thumbnails/47.jpg)
8/14/2019 Data-Intensive Information Processing Applications ― Session #5
http://slidepdf.com/reader/full/data-intensive-information-processing-applications-session-5 47/48
![Page 48: Data-Intensive Information Processing Applications ― Session #5](https://reader034.vdocuments.mx/reader034/viewer/2022052319/577d38df1a28ab3a6b98a964/html5/thumbnails/48.jpg)
8/14/2019 Data-Intensive Information Processing Applications ― Session #5
http://slidepdf.com/reader/full/data-intensive-information-processing-applications-session-5 48/48
Questions?
Source: Wikipedia (Japanese rock garden)