Definition Data Structures Applications Problems
Graph Pattern Matching Partitioning Distribution
AKKA
Graph Analytics
Definition: A "graph" is a collection of
• "vertices" or "nodes"
• "edges" that connect pairs of vertices.
A graph may be undirected, meaning that there is no distinction between the two vertices associated with each edge, or its edges may be directed from one vertex to another.
A quick review of graphs and graph theory
The link structure of a website could be represented by a directed graph. The vertices are the web pages available at the website and a directed edge from page A to page B exists if and only if A contains a link to B. Mathematical PageRanks for a simple network, expressed as percentages. (Google uses a logarithmic scale.) Page C has a higher PageRank than Page E, even though there are fewer links to C; the one link to C comes from an important page and hence is of high value.
Graph theory is also used to study molecules in chemistry and physics. In condensed matter physics, it is used to study the three-dimensional structure of complicated simulated atomic structures.
Image processing, crime detection, ontology …
Applications
The data structure used depends on both the graph structure and the algorithm used for manipulating the graph. Theoretically one can distinguish between list and matrix structures, but in concrete applications the best structure is often a combination of both. List structures are often preferred for sparse graphs as they have smaller memory requirements. Matrix structures, on the other hand, provide faster access for some applications but can consume huge amounts of memory.
Graph-theoretic data structures
List: Incidence list
The edges are represented by an array containing pairs (ordered pairs if directed) of the vertices that each edge connects, and possibly a weight and other data. Vertices connected by an edge are said to be adjacent.
((a,b),(c,d),…)
List: Adjacency list
Much like the incidence list, each vertex has a list of the vertices it is adjacent to. This causes redundancy in an undirected graph: for example, if vertices A and B are adjacent, A's adjacency list contains B, while B's list contains A. Adjacency queries are faster, at the cost of extra storage space.
a adjacent to b, c
b adjacent to a, c
c adjacent to a, b
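The two list structures can be sketched in a few lines of Scala. This is only an illustration: the object and function names are made up for this example, and the graph is the triangle a, b, c used in the listing above.

```scala
object AdjacencyListSketch {
  // Incidence (edge) list for the undirected triangle a-b, a-c, b-c.
  val edges: List[(String, String)] = List(("a", "b"), ("a", "c"), ("b", "c"))

  // Build an adjacency list; each undirected edge is stored in both directions,
  // which is the redundancy mentioned in the text.
  def adjacency(es: List[(String, String)]): Map[String, Set[String]] =
    es.flatMap { case (u, v) => List(u -> v, v -> u) }
      .groupBy(_._1)
      .map { case (u, pairs) => u -> pairs.map(_._2).toSet }

  val adj: Map[String, Set[String]] = adjacency(edges)
}
```

An adjacency query such as `AdjacencyListSketch.adj("a").contains("b")` is then a single lookup, whereas the incidence list has to be scanned.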
Matrix: Incidence matrix
The graph is represented by a matrix of size |V| (number of vertices) by |E| (number of edges), where the entry [vertex, edge] contains the edge's endpoint data (simplest case: 1 = incident, 0 = not incident).
[Table: incidence matrix with rows for vertices 1 to 4 and columns for edges e1 to e4.]
Matrix: Adjacency matrix
This is an n by n matrix A, where n is the number of vertices in the graph. If there is an edge from a vertex x to a vertex y, then the element A[x, y] is 1 (or in general the number of xy edges), otherwise it is 0. In computing, this matrix makes it easy to find subgraphs, and to reverse a directed graph.
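As a small sketch of why the adjacency matrix makes reversal easy: reversing every edge of a directed graph is the same as transposing the matrix. The three-vertex graph and the names below are assumptions made for the example.

```scala
object AdjacencyMatrixSketch {
  // Hypothetical directed graph on vertices 0, 1, 2 with edges 0->1, 0->2, 1->2.
  val n = 3
  val arcs: List[(Int, Int)] = List((0, 1), (0, 2), (1, 2))

  // a(x)(y) = 1 if there is an edge from x to y, otherwise 0.
  def matrix(n: Int, arcs: List[(Int, Int)]): Vector[Vector[Int]] =
    Vector.tabulate(n, n)((x, y) => if (arcs.contains((x, y))) 1 else 0)

  // Reversing all edges corresponds to transposing the matrix.
  def reverse(a: Vector[Vector[Int]]): Vector[Vector[Int]] = a.transpose

  val a: Vector[Vector[Int]] = matrix(n, arcs)
  val reversed: Vector[Vector[Int]] = reverse(a)
}
```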
Matrix: Distance matrix
A symmetric n by n matrix D, where n is the number of vertices in the graph. The element D[x, y] is the length of a shortest path between x and y; if there is no such path, D[x, y] = infinity. It can be derived from powers of A.
a b c d e f
a 0 184 222 177 216 231
b 184 0 45 123 128 200
c 222 45 0 129 121 203
d 177 123 129 0 46 83
e 216 128 121 46 0 83
f 231 200 203 83 83 0
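For an unweighted graph, "derived from powers of A" can be read as follows: the distance from x to y is the smallest k for which the entry (x, y) of A^k is non-zero. The sketch below follows that reading; it does not apply to weighted tables like the one above, and the example graph and all names are assumptions. Entries of A^k count walks, so this is only meant for small graphs.

```scala
object DistanceMatrixSketch {
  type Matrix = Vector[Vector[Int]]

  // Plain integer matrix product.
  def multiply(a: Matrix, b: Matrix): Matrix = {
    val n = a.length
    Vector.tabulate(n, n)((i, j) => (0 until n).map(k => a(i)(k) * b(k)(j)).sum)
  }

  // Distance = smallest k such that (A^k)(x)(y) > 0; None stands for "infinity".
  def distances(a: Matrix): Vector[Vector[Option[Int]]] = {
    val n = a.length
    // powers(k - 1) holds A^k for k = 1 .. n - 1 (a shortest path has at most n - 1 edges).
    val powers = Iterator.iterate(a)(p => multiply(p, a)).take(math.max(n - 1, 0)).toVector
    Vector.tabulate(n, n) { (x, y) =>
      if (x == y) Some(0)
      else {
        val i = powers.indexWhere(p => p(x)(y) > 0)
        if (i == -1) None else Some(i + 1)
      }
    }
  }

  // Hypothetical example: path 0 - 1 - 2 plus an isolated vertex 3.
  val a: Matrix = Vector(
    Vector(0, 1, 0, 0),
    Vector(1, 0, 1, 0),
    Vector(0, 1, 0, 0),
    Vector(0, 0, 0, 0))
  val d: Vector[Vector[Option[Int]]] = distances(a)
}
```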
Matrix: Laplacian matrix (also "Kirchhoff matrix" or "Admittance matrix")
This is defined as D − A, where D is the diagonal degree matrix. It explicitly contains both adjacency information and degree information. (However, there are other, similar matrices that are also called "Laplacian matrices" of a graph.)
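A minimal sketch of D − A for a simple undirected graph (no self-loops), where the degree of a vertex is its row sum in A. The path graph and the names are assumptions for illustration.

```scala
object LaplacianSketch {
  type Matrix = Vector[Vector[Int]]

  // L = D - A: the degree on the diagonal, minus the adjacency entry elsewhere.
  // Assumes a simple graph, so a(i)(i) == 0 and the row sum is the degree.
  def laplacian(a: Matrix): Matrix = {
    val n = a.length
    Vector.tabulate(n, n)((i, j) => if (i == j) a(i).sum else -a(i)(j))
  }

  // Hypothetical example: path graph 0 - 1 - 2.
  val a: Matrix = Vector(Vector(0, 1, 0), Vector(1, 0, 1), Vector(0, 1, 0))
  val l: Matrix = laplacian(a)
}
```

Every row of the result sums to zero, which is a quick sanity check on the construction.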
1) Enumeration
2) Subgraphs, induced subgraphs, and minors
3) Graph coloring
4) Route problems
5) Network flow
6) Visibility graph problems
7) Covering problems
8) Graph classes
Problems in graph theory
Enumeration describes a class of combinatorial enumeration problems in which one must count undirected or directed graphs of certain types, typically as a function of the number of vertices of the graph.
Application: Enumeration of molecules has been studied for over a century and continues to be an active area of research. The typical approach to enumerating chemical structures has been based on constructive assembly.
The figure lists all free trees on 2, 3, and 4 labeled vertices: 1 tree with 2 vertices, 3 trees with 3 vertices, and 16 trees with 4 vertices (n^(n−2) trees on n labeled vertices).
1. Enumeration
2.1. Subgraphs:
A common problem, called the subgraph isomorphism problem, is finding a fixed graph as a subgraph in a given graph. The subgraph isomorphism problem is a computational task in which two graphs G and Q are given as input, and one must determine whether G contains a subgraph that is isomorphic to Q. Subgraph isomorphism is a generalization of both the maximum clique problem and the problem of testing whether a graph contains a Hamiltonian cycle, and is therefore NP-complete.
Clique problem: Finding the largest complete subgraph is called the clique problem. The term "clique" and the problem of algorithmically listing cliques both come from the social sciences, where complete subgraphs are used to model social cliques, groups of people who all know each other. In computer science, the clique problem refers to any of the problems related to finding particular complete subgraphs ("cliques") in a graph, i.e., sets of elements where each pair of elements is connected.
2. Subgraphs, induced subgraphs, and minors
2.2 Induced subgraphs:
Some important graph properties are hereditary with respect to induced subgraphs, which means that a graph has a property if and only if all induced subgraphs also have it. Finding maximal induced subgraphs of a certain kind is also often NP-complete. Finding the largest edgeless induced subgraph, or independent set, is called the independent set problem.
An independent set or stable set is a set of vertices in a graph, no two of which are adjacent. That is, it is a set I of vertices such that for every two vertices in I, there is no edge connecting the two. The size of an independent set is the number of vertices it contains.
The graph of the cube has six different maximal independent sets, shown as the red vertices.
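The cube example can be checked by brute force. In the sketch below the cube's vertices are encoded as the numbers 0 to 7 read as 3-bit strings, with two vertices adjacent when they differ in exactly one bit; that encoding and all names are assumptions of this example. Trying every subset is exponential and only reasonable for tiny graphs.

```scala
object IndependentSetSketch {
  // Cube graph: vertices 0..7 as 3-bit strings, adjacent when exactly one bit differs.
  val vertices: Set[Int] = (0 until 8).toSet
  def adjacent(u: Int, v: Int): Boolean = Integer.bitCount(u ^ v) == 1

  // No two vertices of s are adjacent.
  def isIndependent(s: Set[Int]): Boolean =
    s.forall(u => s.forall(v => !adjacent(u, v)))

  // Independent, and no outside vertex can be added without breaking independence.
  def isMaximal(s: Set[Int]): Boolean =
    isIndependent(s) && (vertices -- s).forall(v => !isIndependent(s + v))

  // Brute force over all 2^8 subsets.
  val maximalSets: List[Set[Int]] = vertices.subsets().filter(isMaximal).toList
}
```

`IndependentSetSketch.maximalSets.size` evaluates to 6, matching the figure caption: two sets of size four and four pairs of opposite corners.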
2. Subgraphs, induced subgraphs, and minors
2.3. Minors: The minor containment problem is to find a fixed graph as a minor of a given graph. A minor or subcontraction of a graph is any graph obtained by taking a subgraph and contracting some (or no) edges. Many graph properties are hereditary for minors, which means that a graph has a property if and only if all minors have it too.
A graph is planar if it contains as a minor neither the complete bipartite graph K3,3 (see the Three-cottage problem) nor the complete graph K5. A planar graph can be drawn in such a way that no edges cross each other. Such a drawing is called a plane graph or planar embedding of the graph.
Three-cottage problem: water, gas, and electricity, the (three) utilities problem: Suppose there are three cottages on a plane and each needs to be connected to the gas, water, and electric companies. Using a third dimension or sending any of the connections through another company or cottage is disallowed. Is there a way to make all nine connections without any of the lines crossing each other?
[Figures: the utility graph K3,3; K3,3 drawn with only one crossing.]
2. Subgraphs, induced subgraphs, and minors
Many problems have to do with various ways of coloring graphs, for example:
1) The four-color theorem: In mathematics, the four color theorem, or the four color map theorem states that, given any separation of a plane into contiguous regions, producing a figure called a map, no more than four colors are required to color the regions of the map so that no two adjacent regions have the same color.
2) The strong perfect graph theorem: In graph theory, a perfect graph is a graph in which the chromatic number of every induced subgraph equals the size of the largest clique of that subgraph. Perfect graphs are the same as the Berge graphs, graphs that have no odd-length induced cycle or induced complement of an odd cycle.
The chromatic polynomial counts the number of ways a graph can be colored using no more than a given number of colors.
3. Graph coloring
The Paley graph of order 9, colored with three colors and showing a clique of three vertices. In this graph and each of its induced subgraphs the chromatic number equals the clique number, so it is a perfect graph.
3) The total coloring conjecture (unsolved): In graph theory, total coloring is a type of coloring on the vertices and edges of a graph. When used without any qualification, a total coloring is always assumed to be proper in the sense that no adjacent vertices, no adjacent edges, and no edge and its endvertices are assigned the same color. The total chromatic number χ″(G) of a graph G is the least number of colors needed in any total coloring of G.
4) The Erdős–Faber–Lovász conjecture (unsolved)
5) The list coloring conjecture (unsolved)
6) The Hadwiger conjecture (graph theory) (unsolved)
3. Graph coloring
4.1. Hamiltonian path and cycle problems:
The Hamiltonian path problem and the Hamiltonian cycle problem are the problems of determining whether a Hamiltonian path or a Hamiltonian cycle exists in a given graph. A Hamiltonian path is a path in an undirected graph that visits each vertex exactly once. A Hamiltonian cycle (or Hamiltonian circuit) is a cycle that visits each vertex exactly once.
4.2. Minimum spanning tree: In a connected undirected graph, a spanning tree is a subgraph that is a tree and connects all the vertices together. A minimum spanning tree is a spanning tree whose total edge weight is as small as possible.
4.3. Route inspection problem: the route inspection problem is to find a shortest closed path or circuit that visits every edge of a (connected) undirected graph.
4. Route problems
4.4. Seven Bridges of Königsberg: The problem was to find a walk through the city that would cross each bridge once and only once.
4.5. Shortest path problem: the shortest path problem is the problem of finding a path between two vertices (or nodes) in a graph such that the sum of the weights of its constituent edges is minimized.
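One standard way to solve the shortest path problem with non-negative edge weights is Dijkstra's algorithm. The slides do not name a specific method, so the choice of algorithm, the example graph and all names below are assumptions of this sketch.

```scala
import scala.collection.mutable

object ShortestPathSketch {
  // Weighted directed graph: vertex -> list of (neighbour, non-negative weight).
  type Graph = Map[String, List[(String, Int)]]

  // Dijkstra's algorithm: minimum total weight from source to every reachable vertex.
  def dijkstra(g: Graph, source: String): Map[String, Int] = {
    val dist = mutable.Map(source -> 0)
    // Ordered so that the smallest tentative distance is dequeued first.
    val queue = mutable.PriorityQueue((0, source))(
      Ordering.by[(Int, String), Int](_._1).reverse)
    while (queue.nonEmpty) {
      val (d, u) = queue.dequeue()
      if (d <= dist(u)) { // skip stale queue entries
        for ((v, w) <- g.getOrElse(u, Nil)) {
          val candidate = d + w
          if (candidate < dist.getOrElse(v, Int.MaxValue)) {
            dist(v) = candidate
            queue.enqueue((candidate, v))
          }
        }
      }
    }
    dist.toMap
  }

  // Hypothetical example graph.
  val g: Graph = Map(
    "a" -> List(("b", 7), ("c", 2)),
    "c" -> List(("b", 3)),
    "b" -> List(("d", 1)))
  val fromA: Map[String, Int] = dijkstra(g, "a")
}
```

In this example the direct edge a to b costs 7, but the route through c costs 2 + 3 = 5, so `fromA("b")` is 5.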
4.6. Steiner tree: a problem in combinatorial optimization, which may be formulated in a number of settings, with the common part being that it is required to find the shortest interconnect for a given set of objects.
4.7. Three-cottage problem
4.8. Traveling salesman problem : Given a list of cities and their pairwise distances, the task is to find the shortest possible route that visits each city exactly once and returns to the origin city
4. Route problems
Given a pattern graph Q and a data graph G, graph pattern matching is to find all subgraphs of G that match Q. It is often defined in terms of subgraph isomorphism, an NP-complete problem. To lower its complexity, various extensions of graph simulation have been considered instead.
Graph pattern matching
[Figure: input images; detected features; one-shot matching (26 true); progressive matching (159 true).]
1. Isomorphism: In graph theory, an isomorphism of graphs G and Q is a bijection f between the vertex sets of G and Q, f : V(G) → V(Q), such that any two vertices u and v of G are adjacent in G if and only if f(u) and f(v) are adjacent in Q. A bijection (or bijective function, or one-to-one correspondence) is a function giving an exact pairing of the elements of two sets.
Graph pattern matching
[Figure: graph G, graph Q, and an isomorphism between them.]
f(a) = 1, f(b) = 6, f(c) = 8, f(d) = 3, f(g) = 5, f(h) = 2, f(i) = 4, f(j) = 7
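Checking whether a given mapping f is an isomorphism is easy, even though finding one is hard. The sketch below tests the definition on a tiny hypothetical pair of path graphs, since the edges of the figure's graphs are not listed in the text. All names are assumptions.

```scala
object IsomorphismSketch {
  type Edge = (String, String)

  // Undirected edges are compared as unordered pairs.
  def undirected(es: Set[Edge]): Set[Set[String]] = es.map { case (u, v) => Set(u, v) }

  // f is an isomorphism from G to Q if it is a bijection on the vertices and
  // maps the edge set of G exactly onto the edge set of Q.
  def isIsomorphism(gVertices: Set[String], gEdges: Set[Edge],
                    qVertices: Set[String], qEdges: Set[Edge],
                    f: Map[String, String]): Boolean = {
    val bijective = f.keySet == gVertices &&
      f.values.toSet == qVertices &&
      gVertices.size == qVertices.size
    bijective && undirected(gEdges).map(_.map(f)) == undirected(qEdges)
  }

  // Hypothetical example: the paths a - b - c and 1 - 2 - 3.
  val gV = Set("a", "b", "c"); val gE: Set[Edge] = Set(("a", "b"), ("b", "c"))
  val qV = Set("1", "2", "3"); val qE: Set[Edge] = Set(("1", "2"), ("2", "3"))

  val good = isIsomorphism(gV, gE, qV, qE, Map("a" -> "1", "b" -> "2", "c" -> "3"))
  val bad  = isIsomorphism(gV, gE, qV, qE, Map("a" -> "2", "b" -> "1", "c" -> "3"))
}
```

The second mapping is a bijection but sends the middle vertex b to an end vertex, so adjacency is not preserved.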
As observed, subgraph isomorphism is often too restrictive to catch sensible matches, as it requires matches to have exactly the same topology as the pattern graph. This hinders its applicability in emerging applications such as social networks and crime detection.
Simple simulation:
Graph G matches pattern Q via simple simulation, denoted by Q ≺ G, if there exists a binary relation S ⊆ VQ × V, where VQ and V are the sets of nodes in Q and G, respectively, such that
1. for each (u, v) ∈ S, u and v have the same label;
2. for each node u in Q, there exists v in G such that
a) (u, v) ∈ S,
b) for each edge (u, u') in Q, there exists an edge (v, v') in G such that (u', v') ∈ S. (same children)
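The definition can be turned into a simple fixpoint computation: start from label equality and repeatedly drop pairs that violate the child condition. This is a naive sketch, not an optimized algorithm. The Graph class, the names, and the edges of the example graphs are assumptions; the edges were chosen so that the result agrees with the TE / ST / Book match listed on the slide.

```scala
object SimulationSketch {
  // A small labelled directed graph.
  case class Graph(label: Map[Int, String], edges: Set[(Int, Int)]) {
    def nodes: Set[Int] = label.keySet
    def children(v: Int): Set[Int] = edges.filter(_._1 == v).map(_._2)
  }

  // Maximum simulation relation as a map u -> sim(u).
  // Returns an empty map when some pattern node has no match (condition 2 fails).
  def simulation(q: Graph, g: Graph): Map[Int, Set[Int]] = {
    // Condition 1: start from all pairs with equal labels.
    var sim: Map[Int, Set[Int]] =
      q.nodes.map(u => u -> g.nodes.filter(v => g.label(v) == q.label(u))).toMap
    var changed = true
    while (changed) {
      // Condition 2b: keep v in sim(u) only if every pattern edge (u, u2)
      // is matched by some edge (v, v2) with v2 in sim(u2).
      val next = sim.map { case (u, vs) =>
        u -> vs.filter(v => q.children(u).forall(u2 => (g.children(v) & sim(u2)).nonEmpty))
      }
      changed = next != sim
      sim = next
    }
    if (sim.values.exists(_.isEmpty)) Map.empty[Int, Set[Int]] else sim
  }

  // Hypothetical pattern: TE -> ST -> Book.
  val q = Graph(Map(100 -> "TE", 200 -> "ST", 300 -> "Book"), Set((100, 200), (200, 300)))
  // Hypothetical data graph: node 4 is a Book without any parent.
  val g = Graph(Map(1 -> "TE", 2 -> "ST", 3 -> "ST", 4 -> "Book", 5 -> "Book"),
    Set((1, 2), (1, 3), (2, 5), (3, 5)))
  val result: Map[Int, Set[Int]] = simulation(q, g)
}
```

Because only children are checked, the parentless Book node 4 still matches, which is the kind of match dual simulation removes.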
Graph Simulation
[Figure: pattern graph Q with nodes 100, 200, 300 (labels TE, ST, Book) and data graph G with nodes 1 to 5.]
Match result:
TE 1
ST 2, 3
Book 4, 5
Dual simulation:
Graph G matches pattern Q via dual simulation, denoted by Q ≺D G, if
1. Q ≺ G with a binary match relation S ⊆ VQ × V, and
2. for each pair (u, v) ∈ S and each edge (u2, u) in EQ, there exists an edge (v2, v) in E with (u2, v2) ∈ S. (same children and same parents)
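Dual simulation adds a parent check to the child check of simple simulation. The sketch below is a naive fixpoint version under the same assumptions as before: the Graph class, the names and the example edges are hypothetical, chosen so that the result agrees with the match listed on the slide (Book matches only node 5).

```scala
object DualSimulationSketch {
  // A small labelled directed graph.
  case class Graph(label: Map[Int, String], edges: Set[(Int, Int)]) {
    def nodes: Set[Int] = label.keySet
    def children(v: Int): Set[Int] = edges.filter(_._1 == v).map(_._2)
    def parents(v: Int): Set[Int] = edges.filter(_._2 == v).map(_._1)
  }

  // Maximum dual simulation relation as a map u -> sim(u); empty map if Q has no match.
  def dualSimulation(q: Graph, g: Graph): Map[Int, Set[Int]] = {
    var sim: Map[Int, Set[Int]] =
      q.nodes.map(u => u -> g.nodes.filter(v => g.label(v) == q.label(u))).toMap
    var changed = true
    while (changed) {
      val next = sim.map { case (u, vs) =>
        u -> vs.filter { v =>
          // same children
          q.children(u).forall(u2 => (g.children(v) & sim(u2)).nonEmpty) &&
          // same parents
          q.parents(u).forall(u2 => (g.parents(v) & sim(u2)).nonEmpty)
        }
      }
      changed = next != sim
      sim = next
    }
    if (sim.values.exists(_.isEmpty)) Map.empty[Int, Set[Int]] else sim
  }

  // Hypothetical pattern TE -> ST -> Book and data graph with a parentless Book (node 4).
  val q = Graph(Map(100 -> "TE", 200 -> "ST", 300 -> "Book"), Set((100, 200), (200, 300)))
  val g = Graph(Map(1 -> "TE", 2 -> "ST", 3 -> "ST", 4 -> "Book", 5 -> "Book"),
    Set((1, 2), (1, 3), (2, 5), (3, 5)))
  val result: Map[Int, Set[Int]] = dualSimulation(q, g)
}
```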
Graph Simulation
[Figure: pattern graph Q with nodes 100, 200, 300 (labels TE, ST, Book) and data graph G with nodes 1 to 5.]
Match result:
TE 1
ST 2, 3
Book 5
More Examples of Simple and Dual Simulation
[Figure: pattern graph Q with nodes 100 (A) and 200 (B), and data graph G with nodes 0 to 9 carrying labels A, B, D and K.]
Simple Simulation
A 0, 8, 9
B 1, 2, 3, 4, 5
Dual Simulation
A 0, 8, 9
B 1, 2, 3, 4, 5
More Examples of Simple and Dual Simulation
[Figure: pattern graph Q with nodes 10 to 60 (labels A to F) and data graph G with nodes 1 to 11.]
Simple Simulation
A 1
B 2
C 3
D 4, 5
E 7, 8, 11
F 6, 9, 10
Strong simulation:
Strong simulation is defined by enforcing two conditions on simulation: duality and locality.
Balls. For a node v in a graph G and a non-negative integer r, the ball with center v and radius r is a subgraph of G, denoted by Ĝ[v, r], such that
1. for all nodes v' in Ĝ[v, r], the shortest distance dist(v, v') ≤ r,
2. it has exactly the edges that appear in G over the same node set.
Graph G matches pattern Q via strong simulation, denoted by Q ≺DL G, if there exist a node v in G and a connected subgraph Gs of G such that
1. Q ≺D Gs, with the maximum match relation S;
2. Gs is exactly the match graph w.r.t. S;
3. Gs is contained in the ball Ĝ[v, dQ], where dQ is the diameter of Q.
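A ball can be extracted with a breadth-first search from the center. The sketch below assumes that dist(v, v') ignores edge direction, which the slide does not state explicitly; the example graph and the names are also assumptions.

```scala
object BallSketch {
  type Edge = (Int, Int)

  // Nodes within distance r of the centre, by breadth-first search
  // that treats every edge as undirected.
  def ballNodes(edges: Set[Edge], center: Int, r: Int): Set[Int] = {
    def neighbours(v: Int): Set[Int] = edges.collect {
      case (a, b) if a == v => b
      case (a, b) if b == v => a
    }
    var seen = Set(center)
    var frontier = Set(center)
    for (_ <- 1 to r) {
      frontier = frontier.flatMap(neighbours) -- seen
      seen ++= frontier
    }
    seen
  }

  // The ball keeps exactly the edges of G whose endpoints both lie in the node set.
  def ball(edges: Set[Edge], center: Int, r: Int): (Set[Int], Set[Edge]) = {
    val nodes = ballNodes(edges, center, r)
    (nodes, edges.filter { case (a, b) => nodes(a) && nodes(b) })
  }

  // Hypothetical example: the path 1 -> 2 -> 3 -> 4 -> 5.
  val path: Set[Edge] = Set((1, 2), (2, 3), (3, 4), (4, 5))
  val (nodes, ballEdges) = ball(path, center = 3, r = 1)
}
```

Strong simulation would then run dual simulation inside each such ball, with r set to the diameter of Q.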
Graph Simulation
[Figure: pattern graph Q with nodes 100 and 200 (label P), data graph G with nodes 1 to 4 (label P), and the balls built around the nodes of G.]
Match result:
P 1, 2, 3, 4
More Examples of Simple and Dual Simulation
[Figure: pattern graph Q with nodes 100 (A) and 200 (B), data graph G with nodes 0 to 6 (labels A and B), and the balls built around the nodes of G.]
Example 1:
The Bio has to be recommended by:
a) an HR person;
b) an SE, i.e., the Bio has experience working with SEs;
◦ the SE is also recommended by an HR person;
c) a data mining specialist (DM), as data mining techniques are required for the job;
◦ there is an artificial intelligence expert (AI) who recommends the DM and is recommended by a DM.
[Figure: pattern graph Q with nodes HR, SE, Bio, DM, AI, and data graph G with nodes HR1, HR2, SE1, SE2, Bio1 to Bio4, DM1, DM2, AI1, AI2, and a chain AI1, DM1, ..., AIk, DMk.]
Graph Simulation
We next present optimization techniques for algorithm Match, by means of:
• query minimization
• dual simulation filtering
• connectivity pruning
1. Query minimization: We say that two pattern graphs Q and Q’ are equivalent, denoted by Q ≡ Q’, if they return the same result on any data graph. A pattern graph Q is minimum if it has the least size |Q| (the number of nodes and the number of edges) among all equivalent pattern graphs.
Optimization Techniques
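[Figure: a pattern graph with nodes R, A, B1, B2, C1, C2, D1, D2 and an equivalent smaller pattern graph with nodes R, A, B1, C1, D1.]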
2. Dual simulation filtering. Our second optimization technique aims to avoid redundant checking of balls in the data graph. Most algorithms for graph simulation recursively refine the match relation by identifying and removing false matches. So, we compute the match relation of dual simulation first, and then project the match relation on each ball to compute strong simulation. This both reduces the initial match set sim(v) for each node v in Q and reduces the number of balls. Indeed, if a node v in G does not match any node in Q, then there is no need to consider the ball centered at v.
The removal process on a ball only needs to deal with its border nodes and their affected nodes.
Optimization Techniques
[Figure: pattern graph Q with nodes P and P', and data graph G with nodes P1, P2, P3, P4.]
3. Connectivity pruning. In a ball, only the connected component containing the ball center v needs to be considered. Hence, those nodes not reachable from v can be pruned early.
Optimization Techniques
[Figure: pattern graph Q with nodes A1, B1, A2, B2, and data graph G with nodes A1, B1, A2, B2, C.]
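import scala.collection.mutable.HashMap

// Dual simulation by fixpoint refinement.
// Graph is assumed to provide vertices, label(v), post(v) (children) and pre(v) (parents).
def hhk (g: Graph, q: Graph): HashMap[Int, Set[Int]] = {
    val sim = HashMap[Int, Set[Int]]()
    // initialise sim(u) with every vertex of g that carries the same label as u
    q.vertices.foreach ( u => {
        var lis = Set[Int]()
        g.vertices.filter( w => g.label(w) == q.label(u)).foreach ( wp => lis += wp )
        sim += u -> lis
    })
    var flag = true
    while (flag) {
        flag = false
        // child check: w must have a child that matches each child v of u
        for (u <- q.vertices; w <- sim(u); v <- q.post(u) if (g.post(w) & sim(v)).isEmpty ) {
            sim(u) -= w
            flag = true
        } //for
        // parent check: w must have a parent that matches each parent v of u
        for (u <- q.vertices; w <- sim(u); v <- q.pre(u) if (g.pre(w) & sim(v)).isEmpty ) {
            sim(u) -= w
            flag = true
        } //for
    } //while
    sim
}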
(<<u>> denotes the label of u)
for all v ∈ Q
    if post(v) = ∅ then
        sim(v) := { u ∈ G | <<u>> = <<v>> }
    else
        sim(v) := { u ∈ G | <<u>> = <<v>> and post(u) ≠ ∅ }
    remove(v) := pre(G) − pre(sim(v))
while there is v ∈ Q with remove(v) ≠ ∅
    for all u ∈ pre(v)
        for all w ∈ remove(v)
            if w ∈ sim(u)
                sim(u) := sim(u) − {w}
                for all w' ∈ pre(w)
                    if post(w') ∩ sim(u) = ∅ then remove(u) := remove(u) ∪ {w'}
    remove(v) := ∅
[Figure: pattern graph H with nodes A, B, C, D and data graph G with nodes A1, B1, C1, D1, C2, C3, D2.]
Sim(D) = {D1, D2}
Remove(v) := pre(G) − pre(sim(v))
Remove(D) = {A1, B1, C1, D1, C2, C3} − {C2, C3, A1, B1} = {C1, D1}
For u ∈ pre(D) = {C, A}
    for w ∈ remove(D) = {C1, D1}
        if w ∈ sim(C) = {C1, C2, C3, A1} => sim(C) = {C1, C2, C3} − {C1}
        for all w' ∈ pre(w) = {A1}
            if post(A1) ∩ sim(C) = {C2, C3} == ∅ (False)
Homework:
Pattern Q is looking for papers on social networks (SN) cited by papers on databases (DB), which in turn cite papers on graph theory (Graph). Find the pattern graph, and then all isomorphism, simple simulation, dual simulation and strong simulation match graphs of that pattern in the given graph G.
Graph Simulation
[Figure: data graph G with nodes DB1, DB2, DB3, SN1, SN2, SN3, SN4, Graph1, Graph2.]
The balance constraint:
◦ Balance computational load such that each processor has the same execution time
◦ Balance storage such that each processor has the same storage demands
Minimum edge cut:
◦ Minimize communication volume between subdomains, along the edges of the mesh
Goals of Partitioning
[Figure: example 3-way partition with edge-cut = 9 (a 5-cut and a 4-cut).]
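Both goals are easy to measure for a given partition: the edge cut counts the edges whose endpoints lie in different parts, and the part sizes show how balanced the partition is. The 6-cycle, its 3-way partition and the names below are assumptions made for the example; this only evaluates a partition, it does not compute one.

```scala
object EdgeCutSketch {
  // part(v) is the index of the subdomain (processor) that owns vertex v.
  def edgeCut(edges: Set[(Int, Int)], part: Map[Int, Int]): Int =
    edges.count { case (u, v) => part(u) != part(v) }

  // Number of vertices per part, to check the balance constraint.
  def partSizes(part: Map[Int, Int]): Map[Int, Int] =
    part.groupBy(_._2).map { case (p, vs) => p -> vs.size }

  // Hypothetical example: the cycle 0-1-2-3-4-5-0 split into three parts of two vertices.
  val edges: Set[(Int, Int)] = Set((0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (5, 0))
  val part: Map[Int, Int] = Map(0 -> 0, 1 -> 0, 2 -> 1, 3 -> 1, 4 -> 2, 5 -> 2)

  val cut: Int = edgeCut(edges, part)
  val sizes: Map[Int, Int] = partSizes(part)
}
```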
We now define the graph pattern matching problem in a distributed setting. Given a pattern graph Q and a fragmented graph F = (F1, . . ., Fk) of data graph G, in which each fragment Fi = (G[Vi], Bi) (i ∈ [1, k]) is placed at a separate machine Si, the distributed graph pattern matching problem is to find the maximum match in G for Q, via graph simulation.
F1 = (G[V1], {BPM1, BSA1}), V1 = {PM1, BA1},
BPM1 = {BA1 : 2}, BSA1 = {SD1 : 2},
F2 = (G[V2], ∅), V2 = {SA1, ST1},
F3 = (G[V3], {BPM2}), V3 = {PM2, BA2, UD1},
BPM2 = {SA2 : 4} and BSA2 = {SDh : 5},
F4 = (G[V4], {BSA2}), V4 = {SA2},
F5 = (G[V5], ∅), V5 = {SD1, ST1, . . . , SDh, STh}
Distributed Graph Pattern Matching
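[Figure: pattern graph Q with nodes PM, SA, BA, UD, SD, ST, and data graph G with nodes PM1, SA1, BA1, SD1, ST1, PM2, BA2, UD1, SA2, ..., SDn, STn, split into fragments F1 to F5.]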
Partial match. A binary relation R ⊆ Vq × Vi is said to be a partial match if
(1) for each (u, v) ∈ R, u and v have the same label;
(2) for each edge (u, u') in Eq,
◦ (a) there exists a node v' ∈ Bv in Bi having the same label as u' if v is a boundary node
◦ (b) there exists an edge (v, v') in G[Vi] such that (u', v') ∈ R
Consider pattern graph Q1 and data graph G1, and the partial match results. Pair (SA, SA1) is in the maximum partial match PM1 in fragment F1 for Q. However, it does not belong to the maximum match M in G for Q.
(1) For node SA1, its only child SD1 is located in fragment F2. The partial match SD1 is empty. Hence, a false match decision is sent back to machine S1, and this further helps determine that (SA,SA1) is a false match.
(2) For node SA2, its only child SDn is located in fragment F5. The subgraph F5 contains no boundary nodes, and SDn belongs to F5. Hence, a true match decision is sent back to machine S4, and this further helps determine that (SA,SA2) is a true match. After these are done, fragment F3 is the only part of G that needs to be further evaluated. To check the matches in F3, we simply ship fragment F4 to machine S3.
Distributed Graph Pattern Matching
For each vertex with a matched label, create the ball with d = 4 (L = 2).
[Figure: pattern graph Q with nodes 40, 50, 60 (labels D, E, F), data graph G with nodes 1 to 12, and the balls built around the vertices of G labeled D, E, F.]
Introducing Akka
• Correct highly scalable systems.
• Fault-tolerant systems that self-heal.
• Truly scalable systems.
………. using state-of-the-art tools.
The Problem
…. Simpler
• Concurrency
• Scalability
• Fault tolerance
With a single unified
• Programming model
• Runtime
• Service
Vision
Scale up & out
Finance
• Stock trend analysis and simulation.
• Event-driven messaging systems.
Betting and Gaming
• Massive multiplayer online gaming.
• High-throughput and transactional betting.
Telecom
• Streaming media network gateways.
Simulation
• 3D simulation engine.
Ecommerce
• Social media community sites.
Where is Akka used?
In computer science, the Actor model is a mathematical model of concurrent computation that treats "actors" as the universal primitives of concurrent digital computation: in response to a message that it receives, an actor can make local decisions, create more actors, send more messages, and determine how to respond to the next message received.
What is “Actor Model”
Akka is a toolkit and runtime for building highly concurrent, distributed, and fault-tolerant event-driven applications on the JVM.
[Figure: parallelism, concurrency, event-driven; an Actor has behavior and state.]
Life cycle of an Actor
case object Tick

class Counter extends Actor {
  var counter = 0
  def receive = {
    case Tick =>
      counter += 1
      println(counter)
  }
}
Actors
val counter = actorOf[Counter]
counter is an ActorRef
Create Actors
counter ! Tick
Send !
val future = actor !!! Message
future.await
val result = future.result
Send !!!
class SomeActor extends Actor {
  def receive = {
    case User(name) => self.reply("Hi " + name)
  }
}
Reply
self become {
  case NewMessage => …….
}
Hot Swap
There are four different types of message dispatchers:
1. Thread-based
2. Event-based
3. Priority event-based
4. Work-stealing
Message Dispatcher
The ‘ThreadBasedDispatcher’ binds a dedicated OS thread to each specific Actor. The messages are posted to a ‘LinkedBlockingQueue’ which feeds the messages to the dispatcher one by one. A ‘ThreadBasedDispatcher’ cannot be shared between actors. This dispatcher has worse performance and scalability than the event-based dispatcher but works well for creating “daemon” Actors that consume messages at a low frequency and are allowed to go off and do their own thing for a longer period of time. Another advantage of this dispatcher is that Actors do not block threads for each other.
ThreadBasedDispatcher
The ‘ExecutorBasedEventDrivenDispatcher’ binds a set of Actors to a thread pool backed up by a ‘BlockingQueue’. This dispatcher is highly configurable and supports a fluent configuration API to configure the ‘BlockingQueue’ (type of queue, max items etc.) as well as the thread pool.
The event-driven dispatchers must be shared between multiple Actors. One best practice is to let each top-level Actor, e.g. the Actors you define in the declarative supervisor config, to get their own dispatcher but reuse the dispatcher for each new Actor that the top-level Actor creates. But you can also share dispatcher between multiple top-level Actors.
ExecutorBasedEventDrivenDispatcher
import akka.actor.Actor
import akka.dispatch.Dispatchers

class MyActor extends Actor {
  self.dispatcher = Dispatchers.newExecutorBasedEventDrivenDispatcher(name)
    .withNewThreadPoolWithLinkedBlockingQueueWithCapacity(100)
    .setCorePoolSize(16)
    .setMaxPoolSize(128)
    .setKeepAliveTimeInMillis(60000)
    .build
  ...
}
It’s useful to be able to specify priority order of messages, that is done by using PriorityExecutorBasedEventDrivenDispatcher.
PriorityExecutorBasedEventDrivenDispatcher
import akka.dispatch._
import akka.actor._
val gen = PriorityGenerator { // Create a new PriorityGenerator, lower prio means more important
case 'highpriority => 0 // 'highpriority messages should be treated first if possible
case 'lowpriority => 100 // 'lowpriority messages should be treated last if possible
case otherwise => 50 // We default to 50
}
val a = Actor.actorOf( // We create a new Actor that just prints out what it processes
new Actor {
def receive = {
case x => println(x)
}
})
// We create a new Priority dispatcher and seed it with the priority generator
a.dispatcher = new PriorityExecutorBasedEventDrivenDispatcher("foo", gen)
a.start // Start the Actor
The ‘ExecutorBasedEventDrivenWorkStealingDispatcher’ is a variation of the ‘ExecutorBasedEventDrivenDispatcher’ in which Actors of the same type can be set up to share this dispatcher, and during execution the different actors will steal messages from other actors if they have fewer messages to process. This can be a great way to improve throughput at the cost of slightly higher latency.
ExecutorBasedEventDrivenWorkStealingDispatcher
Scratch Data
Static Data
• Supplied at boot time.
• Supplied by other components.
Dynamic Data
• Data possible to recompute.
• Data from other sources.
Classification of State
Fault Tolerant (Onion Layered)
Error Kernel Pattern
Akka is an implementation of the Actor model for both Java and Scala.
Actor encapsulates mutable state with the guarantee of one message at a time.
Akka in Summary
Assign each child the label matched ball
Union All Matches
The Actor model adopts the philosophy that everything is an actor. This is similar to the everything is an object philosophy used by some object-oriented programming languages, but differs in that object-oriented software is typically executed sequentially, while the Actor model is inherently concurrent.
With the Actor model, instead of manually creating threads or event loops, you create an object that has state and associated logic. This logic is only ever called by one thread at a time, and the object communicates with the outside world through messages.
An actor is a computational entity that, in response to a message it receives, can concurrently:
• send a finite number of messages to other actors;
• create a finite number of new actors;
• designate the behavior to be used for the next message it receives.
Recipients of messages are identified by address, sometimes called "mailing address". Thus an actor can only communicate with actors whose addresses it has. It can obtain those from a message it receives, or if the address is for an actor it has itself created.
Actor model
Create
case object Tick
class Counter extends Actor {
  var counter = 0
  def receive = {
    case Tick =>
      counter += 1
      println(counter)
  }
}
Create an instance of the Counter actor in the system and get its reference handle back:
val counter = system.actorOf(Props[Counter], name = "count")
or, if we are inside a parent actor and want to create a child:
val counter = context.actorOf(….)
To stop
system.stop(counter)
Actors in AKKA
Props: defines how to create an actor
name: the name of the actor in the hierarchy
Send Message
counter ! Tick
(sends the Tick message to counter, i.e. puts it in the actor's mailbox)
Reply
class SomeActor extends Actor {
  def receive = {
    case User(name) => sender ! ("Hi " + name)
  }
}
To change behaviour
context.become {
  case NewMessage => …..
}
Actors in AKKA
Failure Strategy
class MySupervisor extends Actor {
  def supervisorStrategy = OneForOneStrategy({
    case _: ActorKilledException => Stop
    case _: ArithmeticException => Resume
    case _: Exception => Restart
  }, maxNrOfRetries = None, withinTimeRange = None)
  def receive = {
    case NewUser(name) =>
      ….. = context.actorOf[User](name)
  }
}
Actors in AKKA
Remoting
remote actors have a different kind of ActorRef
akka{
actor{
provider = akka.remote.RemoteActorRefProvider
deployment{
counter{
remote = akka://mysystem@hostname:255
}
}
}
}
Actors in AKKA