A Quick Safari Through Network Information Processing
Presented at UofR, Canada (Network Information Processing Course).
Reza Rahimi,Software Engineering Systems,
University of Regina, Canada.
Problem Formulation In Networking
Main Problem: In a given network we want to transfer data from a group of sources to a group of destinations.
Constraints:
Network topology and architecture (at the abstract level: directed graph, undirected graph, special families of graphs such as trees, meshes, layered graphs, random graphs, geometric graphs, ...).
Physical constraints (link capacity, power constraints, noise, ...).
Optimization Metrics:
Maximum amount of information into terminals (Internet).
Energy (wireless sensor networks).
Delay (Internet telephony).
Load balancing (important in almost every network).
Fault tolerance (especially in wireless networks).
For simplicity, the problem can also be divided into three main subproblems:
Unicast: transfer of data from one source to one destination.
Multicast: transfer of data from one source to a group of destinations, but not to all of the nodes.
Broadcast: transfer of data from one source to all of the nodes in the network.
The main problem can be formulated in general using optimization methods, and can sometimes be solved in a centralized or distributed manner, at least in theory.
In many cases the optimization approach leads to an integer optimization problem, which is NP-hard in general. We should use relaxation to make it tractable.
Using other methods will usually give us much better insight into the problem and better algorithms for it.
In this note I consider the problem with respect to one metric: the maximum amount of information that can be transferred.
We will investigate the theoretical bounds for the problem and consider different techniques for achieving them.
Network Information Processing
Assumptions:
We almost always consider directed acyclic graphs (DAGs).
We assume that each edge has unit capacity (any graph with integer edge weights can easily be converted to such a normalized graph).
[Figure: an integer-weighted graph (edge weights 1 and 2) and its equivalent normalized unit-capacity graph.]
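The normalization mentioned in the assumptions can be sketched in a few lines. This is only an illustration: the function name `to_unit_capacity` and the `(u, v, capacity)` edge-list representation are assumptions made here, not part of the original slides.

```python
def to_unit_capacity(edges):
    """Replace each edge (u, v, c) of integer capacity c by c parallel unit edges.

    `edges` is a list of (tail, head, capacity) triples (assumed representation).
    """
    unit_edges = []
    for u, v, capacity in edges:
        if capacity < 0 or capacity != int(capacity):
            raise ValueError("capacities must be non-negative integers")
        # c parallel copies of the edge, each carrying one unit
        unit_edges.extend((u, v, 1) for _ in range(int(capacity)))
    return unit_edges


# An edge of capacity 2 becomes two parallel unit edges.
print(to_unit_capacity([("s", "a", 2), ("a", "t", 1)]))
```

The result is a multigraph, so any algorithm applied afterwards has to tolerate parallel edges.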
Question 1: What is the maximum amount of information that can be transferred in the unicast scenario?
Question 2: Is this maximum amount tractable to achieve with deterministic, randomized, or distributed algorithms?
Maximum-Flow Min-Cut Theorem
Flow Network: A flow network is a finite directed graph (not necessarily acyclic) G=(V,E) with the following features:
Each edge e has a positive capacity $c_e > 0$.
There is a single source s.
There is a single terminal (destination) t.
[Figure: example flow network with source s, intermediate nodes 2, 3, 4, 5, terminal t, and edge capacities between 2 and 10.]
Flow Function: An s-t flow is a function $f: E \to \mathbb{R}$ with the following properties:
It must be non-negative and must not exceed the capacity of any edge: $0 \le f(e) \le c_e$.
For each node except s and t, the sum of the incoming flow must equal the sum of the outgoing flow (a physical law, e.g. conservation of information).
Flow Value: the amount of information that enters the destination node.
[Figure: the example network with a feasible flow; each edge is labeled with its capacity and its flow. Flow value = 12.]
Flow conservation:
\[ \sum_{e \text{ into } v} f(e) = \sum_{e \text{ out of } v} f(e) \quad \text{for all } v \in V - \{s, t\} \]
Question 1: What is the maximum amount of information flow achievable in this network?
First Attempt: Use linear programming (LP) to compute the maximum flow in polynomial time. (General integer programming is NP-hard, but for max flow with integer capacities the LP already has an integral optimal solution.)
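For reference, the LP behind the first attempt can be written out as below. This is the standard max-flow formulation, stated with the symbols used in the definitions ($f(e)$ for the flow on edge e, $c_e$ for its capacity) and with the flow value taken as the flow entering t; it is a sketch of the usual textbook form, not a formula taken from the slides.

```latex
\begin{aligned}
\text{maximize}   \quad & \sum_{e \text{ into } t} f(e) \\
\text{subject to} \quad & 0 \le f(e) \le c_e
                          && \text{for all } e \in E, \\
                        & \sum_{e \text{ into } v} f(e) = \sum_{e \text{ out of } v} f(e)
                          && \text{for all } v \in V - \{s, t\}.
\end{aligned}
```

There is one variable per edge and one conservation constraint per internal node, so the LP has size linear in the graph.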
Second Attempt: Heuristic Method
Algorithm(G,s,t)
Initialize the flow to zero.
For every simple path from s to t in graph G:
(Greedily) push as much positive flow along the path as the constraints allow, and update the flow.
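[Figure: on a five-edge network from S to D, the greedy method can get stuck at flow value = 20 (edge labels flow/capacity: 20/20, 20/30, 0/10, 0/10, 20/20), while the optimal flow has flow value = 30 = max flow (20/20, 10/30, 10/10, 10/10, 20/20).]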
But how can we correct the previous algorithm?
Suppose we push flow forward along one path, but the choice turns out to be unsuitable. We keep it in mind by recording a reverse edge along that path, so that the flow can later be undone.
Collecting this information gives a second graph, which is called the Residual Graph.
[Figure: the graph G with flow/capacity labels (20/20, 20/30, 0/10, 0/10, 20/20) and its residual graph Gf with residual capacities 20, 10, 10, 20, 20, 10.]
So we can revise the previous algorithm as follows:
Ford-Fulkerson Method (G,s,t):
Start with zero flow.
While there is a simple path from source to destination in the residual graph Gf:
Push flow along it and update the flow function.
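The method above can be sketched in Python as follows. The path choice here is breadth-first search, which is the Edmonds-Karp variant of Ford-Fulkerson rather than anything prescribed by the slides. The function name `max_flow` and both edge lists are assumptions for illustration: the five-edge network is modeled on the S-D figure, and the nine-edge network is an assumed layout that uses the capacities of the running example.

```python
from collections import deque


def max_flow(edges, source, sink):
    """Ford-Fulkerson with BFS path selection (Edmonds-Karp variant)."""
    # residual[u][v] = remaining capacity from u to v in the residual graph Gf
    residual = {}
    for u, v, capacity in edges:
        residual.setdefault(u, {}).setdefault(v, 0)
        residual.setdefault(v, {}).setdefault(u, 0)  # reverse edge, initially 0
        residual[u][v] += capacity

    flow_value = 0
    while True:
        # BFS for a simple source-sink path with positive residual capacity
        parent = {source: None}
        queue = deque([source])
        while queue and sink not in parent:
            u = queue.popleft()
            for v, cap in residual[u].items():
                if cap > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if sink not in parent:
            return flow_value  # no augmenting path left

        # Bottleneck capacity along the path found
        bottleneck = float("inf")
        v = sink
        while parent[v] is not None:
            bottleneck = min(bottleneck, residual[parent[v]][v])
            v = parent[v]

        # Push flow: reduce forward capacity, increase reverse capacity
        v = sink
        while parent[v] is not None:
            u = parent[v]
            residual[u][v] -= bottleneck
            residual[v][u] += bottleneck
            v = u
        flow_value += bottleneck


# Assumed edge layout for the five-edge S-D network.
small = [("S", "A", 20), ("S", "B", 10), ("A", "B", 30), ("A", "D", 10), ("B", "D", 20)]
# Assumed edge layout for the running example (nodes s, 2, 3, 4, 5, t).
example = [("s", 2, 10), ("s", 3, 10), (2, 3, 2), (2, 4, 4), (2, 5, 8),
           (3, 5, 9), (5, 4, 6), (4, "t", 10), (5, "t", 10)]
print(max_flow(small, "S", "D"), max_flow(example, "s", "t"))
```

The reverse entries in `residual` are what let a later path cancel an earlier, unsuitable push.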
Let's consider one example graphically:
[Figure sequence: Ford-Fulkerson on the example network. Each step shows G with capacity and flow labels and the residual graph Gf with residual capacities; the flow value grows through 0, 8, 10, 16, 18, and 19.]
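Cut: An s-t cut is a partition of the vertex set V into sets A and B such that:
\[ A \cup B = V, \quad A \cap B = \emptyset, \quad s \in A, \quad t \in B \]
Cut Capacity: The capacity of an s-t cut (A, B) is denoted by:
\[ c(A, B) = \sum_{e \text{ out of } A} c_e \]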
And finally we have the famous Max-Flow Min-Cut Theorem:
In every flow network the Ford-Fulkerson method reaches the maximum flow of the graph, and it is equal to the minimum cut capacity.
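To see the min-cut side of the theorem on a small instance, one can simply enumerate every s-t cut. The sketch below does that by brute force; it is exponential in the number of nodes and meant only for tiny graphs. The function name `min_cut_brute_force` and the edge list (an assumed layout using the capacities of the running example) are illustrative.

```python
from itertools import combinations


def min_cut_brute_force(edges, source, sink):
    """Return (capacity, A) for a minimum s-t cut, trying every partition."""
    nodes = {u for u, _, _ in edges} | {v for _, v, _ in edges}
    others = sorted(nodes - {source, sink}, key=str)
    best = None
    for size in range(len(others) + 1):
        for subset in combinations(others, size):
            side_a = {source, *subset}  # s is in A, t is in B = V - A
            # c(A, B): total capacity of the edges leaving A
            capacity = sum(c for u, v, c in edges
                           if u in side_a and v not in side_a)
            if best is None or capacity < best[0]:
                best = (capacity, side_a)
    return best


# Assumed edge layout for the running example (nodes s, 2, 3, 4, 5, t).
example = [("s", 2, 10), ("s", 3, 10), (2, 3, 2), (2, 4, 4), (2, 5, 8),
           (3, 5, 9), (5, 4, 6), (4, "t", 10), (5, "t", 10)]
capacity, side_a = min_cut_brute_force(example, "s", "t")
print(capacity, side_a)
```

Under the assumed layout the smallest cut capacity matches the final flow value of 19 reported for the example.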
There are several Polynomial Time Algorithms suggested for this problem. The following table shows some of the famous ones.
So we can reach the maximum information transfer with routing (forwarding only) in polynomial time in the unicast scenario.
Maximum Information Transferring in Multicasting Scenario
What is the maximum amount of information that could be transferred in the multicasting scenario?
The following graph shows the basic idea for multicasting.
[Figure: the terminals are connected by edges of infinite capacity (∞) to an added super terminal.]
Insert a super node and use the Max-Flow Min-Cut Theorem.
So we cannot exceed this bound. Now another question arises:
How can we get a greater diversity of independent packets at each destination?
[Figure: simple routing with forwarding.]
[Figure: packet duplication, i.e. routing with duplicate and forward.]
Lesson we have learned:
By applying some function at each routing node, we can get much more diversity of information at each terminal node.
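[Figure: routing with addition and subtraction. Intermediate nodes duplicate packets and one node forwards the sum R+B; one terminal receives B and R+B, the other receives R and R+B.]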
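The addition-and-subtraction idea can be tried out directly on bytes, where addition and subtraction over GF(2) are both bitwise XOR. The sketch below assumes the usual butterfly layout (each terminal hears one packet directly plus the combined packet R+B); the function name `butterfly` and the packet contents are hypothetical.

```python
def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Add two equal-length packets over GF(2), i.e. bitwise XOR."""
    return bytes(x ^ y for x, y in zip(a, b))


def butterfly(red: bytes, blue: bytes):
    """Return what each of the two terminals decodes as (R, B)."""
    if len(red) != len(blue):
        raise ValueError("packets must have equal length")
    coded = xor_bytes(red, blue)  # R+B, sent once over the shared link

    # Terminal 1 receives R directly and R+B: B = (R+B) - R
    terminal_1 = (red, xor_bytes(coded, red))
    # Terminal 2 receives B directly and R+B: R = (R+B) - B
    terminal_2 = (xor_bytes(coded, blue), blue)
    return terminal_1, terminal_2


t1, t2 = butterfly(b"RED!", b"BLUE")
print(t1, t2)
```

Both terminals end up with both packets even though the shared link carried only one packet.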
In general we can model this technique as below (Linear Operation):
[Figure: a node with inputs x, y, z and outputs a, b.]
\[ \begin{pmatrix} a \\ b \end{pmatrix} = \begin{pmatrix} \alpha_{11} & \alpha_{12} & \alpha_{13} \\ \alpha_{21} & \alpha_{22} & \alpha_{23} \end{pmatrix} \begin{pmatrix} x \\ y \\ z \end{pmatrix} \]
Note that one cannot achieve more than the max flow for each terminal (upper bound).
The previous technique is divided into two categories:
Duplicate and Forward (Routing).
Network Coding.
The first strategy is what current networking technology uses.
The second one may be used in the near future.
Duplicate and Forward Strategy
It is obvious that if we allow duplication, the path of a packet is a tree in the DAG.
So we can formulate the problem as follows:
Packing trees to get the maximum throughput at each terminal node.
There are some points about this formulation:
Generally the number of trees is exponential in the size of the input graph.
If we consider only integer values, it becomes an integer linear program.
So where exactly does the tree packing problem sit in the complexity hierarchy?
It can be proved that this problem is NP-hard.
So it seems that in general the problem is hard.
Let's simplify the problem a bit to see whether it becomes tractable.
Let's assume that we want to pack trees in such a way that all of the terminals get the same number of colors.
It is obvious that the number of colors cannot exceed min max-flow (s,Ti), the minimum taken over the terminals Ti.
Unfortunately this version is again not tractable.
It is equivalent to Packing Steiner Trees, which is NP-hard.
Generally there is no polynomial time algorithm that optimally transfers packets with only the duplicate and forward strategy in multicasting (assuming P≠NP).
Now what happens if we empower each node with full linear operations (switching to network coding)?
Linear Network Coding In Multicasting
We work in the field $GF(2^q)$ and assume each packet is an element of this field.
All arithmetic operations are valid, just as in the field of real numbers.
Just as in the previous section, we assume that the graph is a DAG.
There is no delay at a node for combining inputs into outputs.
For inputs we use the variable X, for intermediate nodes Y, and for the output signals to be recovered, Z.
Types of nodes and their input-output relations:
Source node, with inputs $x_1, \dots, x_n$ and outgoing edges $e_1, \dots, e_m$:
\[ y(e_j) = \sum_{i=1}^{n} \alpha_{i,e_j}\, x_i, \quad j = 1, \dots, m \]
Intermediate node, with incoming edges $e^*_1, \dots, e^*_n$ and outgoing edges $e_1, \dots, e_m$:
\[ y(e_j) = \sum_{i=1}^{n} \beta_{e^*_i,e_j}\, y(e^*_i), \quad j = 1, \dots, m \]
Terminal node, with incoming edges $e_1, \dots, e_m$ and outputs $z_1, \dots, z_n$:
\[ z_j = \sum_{i=1}^{m} \varepsilon_{e_i,j}\, y(e_i), \quad j = 1, \dots, n \]
Conversion
[Figure: the original graph with nodes v1 to v4, edges e1 to e7, inputs x1, x2, x3 and outputs z1, z2, z3, next to the converted graph whose nodes are the edges e1 to e7.]
It seems that the edges play a much more important role than the nodes, so we convert the original graph into a new graph in which each node stands for an edge of the original one.
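[Figure: the converted graph with its coding coefficients: $\alpha_{i,e_j}$ from input $x_i$ to edge $e_j$ (i, j = 1, 2, 3); $\beta_{e_1,e_4}$, $\beta_{e_1,e_5}$, $\beta_{e_2,e_4}$, $\beta_{e_2,e_5}$, $\beta_{e_3,e_6}$, $\beta_{e_3,e_7}$, $\beta_{e_4,e_6}$, $\beta_{e_4,e_7}$ between consecutive edges; and $\varepsilon_{e_i,j}$ from edge $e_i$ (i = 5, 6, 7) to output $z_j$ (j = 1, 2, 3).]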
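Internal Matrix:
\[ F = \begin{pmatrix}
0 & 0 & 0 & \beta_{e_1,e_4} & \beta_{e_1,e_5} & 0 & 0 \\
0 & 0 & 0 & \beta_{e_2,e_4} & \beta_{e_2,e_5} & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & \beta_{e_3,e_6} & \beta_{e_3,e_7} \\
0 & 0 & 0 & 0 & 0 & \beta_{e_4,e_6} & \beta_{e_4,e_7} \\
0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0
\end{pmatrix} \]
Output Matrix:
\[ B = \begin{pmatrix}
0 & 0 & 0 & 0 & \varepsilon_{e_5,1} & \varepsilon_{e_6,1} & \varepsilon_{e_7,1} \\
0 & 0 & 0 & 0 & \varepsilon_{e_5,2} & \varepsilon_{e_6,2} & \varepsilon_{e_7,2} \\
0 & 0 & 0 & 0 & \varepsilon_{e_5,3} & \varepsilon_{e_6,3} & \varepsilon_{e_7,3}
\end{pmatrix} \]
Input Matrix:
\[ A = \begin{pmatrix}
\alpha_{1,e_1} & \alpha_{1,e_2} & \alpha_{1,e_3} & 0 & 0 & 0 & 0 \\
\alpha_{2,e_1} & \alpha_{2,e_2} & \alpha_{2,e_3} & 0 & 0 & 0 & 0 \\
\alpha_{3,e_1} & \alpha_{3,e_2} & \alpha_{3,e_3} & 0 & 0 & 0 & 0
\end{pmatrix} \]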
Question: How can we relate inputs and outputs using these matrices?
It is clear that A describes how the inputs are injected into the network and, likewise, B describes how network information is delivered to the outputs.
How can we capture the propagation inside the network?
We must find all walks between source edges and output edges.
It can be proved easily, using some algebraic graph theory, that:
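\[ z = x A \left( \sum_{i=0}^{\infty} F^i \right) B^T \]
We can simplify the previous equation with the following assumption.
If we label the edges in topological order, F is strictly upper triangular and therefore nilpotent, so the sum is finite and we get the simpler equation:
\[ z = x A (I - F)^{-1} B^T \]
And finally, after some more work, we have the famous network coding theorem: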
Consider a DAG G with unit capacities that has a single source node s (with h sources) and a set of terminal nodes T. The multicast property with rate h is said to be satisfied if max-Flow (s,Ti) ≥ h for all Ti. If G satisfies the multicast property, a network code that supports the multicast rate h is guaranteed to exist as long as the field size is larger than |T|.
So if the field size is large enough, there always exists a network coding scheme that reaches the limit.
There are some polynomial time algorithms for constructing network codes, for example LIF (Linear Information Flow) and randomized network coding algorithms.
For some special graphs, with network coding we can reach the maximum flow for each node.
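The algebraic relation $z = x A (I - F)^{-1} B^T$ can be checked numerically on a small case. The sketch below uses the classic butterfly network over GF(2) with every local coding coefficient set to 1; this network, its edge numbering (e1 to e9), and all helper names are assumptions chosen for illustration, not the seven-edge example from the slides.

```python
def mat_mul(X, Y):
    """Matrix product over GF(2)."""
    return [[sum(X[i][k] & Y[k][j] for k in range(len(Y))) % 2
             for j in range(len(Y[0]))] for i in range(len(X))]


def mat_add(X, Y):
    """Matrix sum over GF(2) (entrywise XOR)."""
    return [[a ^ b for a, b in zip(rx, ry)] for rx, ry in zip(X, Y)]


def identity(n):
    return [[int(i == j) for j in range(n)] for i in range(n)]


def transpose(X):
    return [list(col) for col in zip(*X)]


def inverse_of_i_minus_f(F):
    """(I - F)^{-1} = I + F + F^2 + ..., a finite sum since F is nilpotent."""
    n = len(F)
    total, power = identity(n), identity(n)
    for _ in range(n):
        power = mat_mul(power, F)
        total = mat_add(total, power)
    return total


def selector(rows, cols, ones):
    """0/1 matrix with ones at the given 1-indexed (row, col) positions."""
    M = [[0] * cols for _ in range(rows)]
    for r, c in ones:
        M[r - 1][c - 1] = 1
    return M


# Assumed butterfly edges in topological order: e1: s->a, e2: s->b, e3: a->t1,
# e4: a->c, e5: b->c, e6: b->t2, e7: c->d, e8: d->t1, e9: d->t2.
F = selector(9, 9, [(1, 3), (1, 4), (2, 5), (2, 6), (4, 7), (5, 7), (7, 8), (7, 9)])
A = selector(2, 9, [(1, 1), (2, 2)])   # x1 enters on e1, x2 on e2
B1 = selector(2, 9, [(1, 3), (2, 8)])  # terminal t1 reads e3 and e8
B2 = selector(2, 9, [(1, 6), (2, 9)])  # terminal t2 reads e6 and e9

G = inverse_of_i_minus_f(F)
M1 = mat_mul(mat_mul(A, G), transpose(B1))  # transfer matrix to t1
M2 = mat_mul(mat_mul(A, G), transpose(B2))  # transfer matrix to t2
print(M1, M2)
```

Each terminal can decode both inputs exactly when its 2 x 2 transfer matrix is invertible over the field, which is the condition a network code for multicast has to meet at every terminal at once.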
So in summary we have:
With network coding we can reach the maximum throughput in polynomial time.
Comparison between two methods in Multicasting Scenario
What is the theoretical gap between network coding and routing?
It can be proved that if the graph is directed the gap can be very large ($\Omega(\log n)$, where n is the number of terminals).
But if the graph is undirected the gap is bounded by a constant.
Network Coding Example
Consider the following directed graph: $G^{h}_{a,b}$
Lemma:
Under routing, the capacity of $G^{h}_{2h,\binom{2h}{h}}$ is less than 2.
With network coding, the capacity of the network can be h.
With suitable error control codes we can reach the maximum capacity with network coding.
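Example: Reed-Solomon codes, RS(n, k):
\[ M = [M_0, M_1, \dots, M_{k-1}] \]
\[ \alpha_1, \alpha_2, \dots, \alpha_n \in GF(q), \quad \alpha_i \neq 0 \]
\[ C = [C_0, C_1, \dots, C_{n-1}] = M \begin{pmatrix}
1 & 1 & \cdots & 1 \\
\alpha_1 & \alpha_2 & \cdots & \alpha_n \\
\vdots & \vdots & & \vdots \\
\alpha_1^{k-1} & \alpha_2^{k-1} & \cdots & \alpha_n^{k-1}
\end{pmatrix} \]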
The above matrix has a Vandermonde structure, and from any subset of h symbols of the codeword we can recover the original message.
So at the source node we can use RS(a,h), and at the terminals the original signal can be reconstructed.
This concept is sometimes categorized as source coding.
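The "any h symbols suffice" property can be illustrated with a toy Reed-Solomon code. To keep the arithmetic short, the sketch works over the prime field GF(257) instead of $GF(2^q)$; that substitution, the evaluation points, and the names `rs_encode` and `rs_decode` are all assumptions for illustration.

```python
from itertools import combinations

P = 257  # prime modulus; GF(257) is assumed here in place of GF(2^q)


def rs_encode(message, points):
    """Codeword C = M * V with Vandermonde entries V[i][j] = points[j]**i."""
    return [sum(m * pow(a, i, P) for i, m in enumerate(message)) % P
            for a in points]


def rs_decode(received, k):
    """Recover k message symbols from any k (point, symbol) pairs.

    Uses Lagrange interpolation of the polynomial of degree < k whose
    coefficients are the message symbols. Points must be distinct mod P.
    """
    pairs = received[:k]
    coeffs = [0] * k
    for j, (xj, yj) in enumerate(pairs):
        basis, denom = [1], 1  # basis[d] is the coefficient of x^d
        for m, (xm, _) in enumerate(pairs):
            if m == j:
                continue
            # Multiply the basis polynomial by (x - xm)
            basis = [((basis[d - 1] if d > 0 else 0)
                      - xm * (basis[d] if d < len(basis) else 0)) % P
                     for d in range(len(basis) + 1)]
            denom = denom * (xj - xm) % P
        scale = yj * pow(denom, -1, P) % P  # modular inverse (Python 3.8+)
        for d in range(k):
            coeffs[d] = (coeffs[d] + scale * basis[d]) % P
    return coeffs


message = [5, 17, 42]          # k = 3 message symbols
points = [1, 2, 3, 4, 5, 6]    # n = 6 distinct nonzero evaluation points
codeword = rs_encode(message, points)
# Every choice of 3 of the 6 codeword symbols recovers the message.
ok = all(rs_decode([(points[i], codeword[i]) for i in idx], 3) == message
         for idx in combinations(range(6), 3))
print(ok)
```

Decoding works for every subset because any k columns of the Vandermonde matrix with distinct points form an invertible k x k matrix.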
Maximum Information Transferring in Broadcasting Scenario
In broadcasting, according to Edmonds' paper, we can always pack k edge-disjoint spanning trees, where k = min max-flow (s,Ti).
So in this scenario, routing with duplication has the same power as network coding in the general case.
Conclusion
The basics of routing and its theoretical bounds are reviewed.
The basics of network coding and its theoretical bounds are reviewed.
It seems that in general network coding gives us much more throughput, but has higher computational complexity than general routing.
Network Coding vs. Routing:
Unicast: The same as each other.
Multicast: The performance of network coding is much better, and using routing we face an NP-hard problem.
Broadcast: The same as each other.