A P2P-based Storage Platform for Storing Session Data in Internet Access Networks
T. Bahls, D. Duchow
Nokia Siemens Networks
Broadband Access Division
Greifswald, Germany
World Telecommunication Congress 2010
Network & Service Management Reliability
September 13-14
Peter Danielis, M. Gotzmann, D. Timmermann
University of Rostock, Germany
Institute of Applied Microelectronics
and Computer Engineering
Outline
Introduction & Motivation
Utilization of P2P Technology
Erasure Resilient Codes for High Data Availability
Realization of the P2P-based Storage Platform
Summary
Introduction & Motivation
Internet Service Providers (ISPs) provide Internet access
Access nodes (ANs) = essential network elements
E.g., DSLAMs (Digital Subscriber Line Access Multiplexers)
[Figure: customers reach the Internet through access nodes AN 1 to AN 4]
ANs have to be powerful but well-priced, so ANs ≠ servers!
Budget with the available resources!
ANs need resets (or may fail), but data must not be lost!
AN configuration data needs to be saved persistently!
But there's more…
Data, called session data, …
… comprises MAC/IP addresses and IP lease times of customers
… is required for data forwarding/traffic filtering
MAC address: 00-50-04-E1-15-A0
IP address: 139.30.201.254
Lease Time: 60 min
Active: No
DHCP Request: I have MAC address 00-50-04-E1-15-A0!
DHCP Response: Your IP address is 139.30.201.254 for 60 min!
Session data also …
… has to be always available, so persistent storage is needed
… is highly volatile due to continuous changes
MAC address: 00-50-04-E1-15-A0
IP address: 139.30.201.254
Lease Time: 60 min
Active: Yes
DHCP Request: I have MAC address 00-50-04-E1-15-A0!
DHCP Response: Your IP address is 139.30.201.254 for 60 min!
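The session entry shown above can be pictured as a small record. The following is only a sketch: the class name, field names, and the `on_dhcp_response` helper are illustrative assumptions, not taken from the prototype.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class SessionEntry:
    """One customer's session data, with the fields listed on the slide."""
    mac: str
    ip: str
    lease: timedelta
    granted_at: datetime
    active: bool = False

    def is_valid(self, now: datetime) -> bool:
        # Traffic is only forwarded while the entry is active
        # and the IP lease has not yet expired.
        return self.active and now < self.granted_at + self.lease


def on_dhcp_response(mac: str, ip: str, lease_minutes: int, now: datetime) -> SessionEntry:
    """Hypothetical hook: create an active entry when a DHCP response is seen."""
    return SessionEntry(mac, ip, timedelta(minutes=lease_minutes), now, active=True)
```

Every DHCP exchange changes such an entry, which is why the data is both volatile and in need of persistent storage.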
Today: ANs store session data in persistent flash memory
Problem: Flash memory has limited availability/rewritability
ISPs are reluctant to "sacrifice" flash memory for session data
[Figure: Internet Service Provider network with customers area, access area (access node, e.g., DSL Access Multiplexer, with persistent flash memory), and core network area (Broadband Remote Access Server), connected to the Internet]
Solution: Use available volatile RAM resources of ANs!
Average AN, e.g., PowerQUICC III (Freescale Semiconductor)
RAM capacity = 1 Gbyte + unlimited rewritability
[Figure: 40% of the AN's volatile RAM is free]
Computing capacity = 1234 Dhrystone MIPS
[Figure: 40% of the AN's computing capacity is free as well]
Problem: How to efficiently utilize available resources?
What options does P2P offer?
...beyond the controversial applications, of course.
New networking paradigm
No clients and servers anymore
All peers form a self-organizing network
Network = storage resource
Network = computing resource
Scalability and resilience = intrinsic features
Proven concept (BitTorrent, Zattoo, Joost)
Utilization of P2P technology
Networking paradigm: Each AN is part of a logical P2P overlay on its uplink
Network = storage resource: Each AN stores just a piece of session data
Network = computing resource: Each AN implements the P2P protocol
But ANs may become unavailable…
Problem: How to ensure high data availability?
Erasure Resilient Codes (ERCs) for High Data Availability
Objective: High session data availability = 99.999 %
Simple replication wastes memory resources
Reed-Solomon codes
Split session data of each AN into m data chunks
[Figure: session data is split into m session data chunks]
Encoding: Add k interleaved coding chunks, giving n = m + k chunks
[Figure: encoding adds k coding chunks to the m session data chunks]
Decoding: Restore session data from any m of n chunks
[Figure: decoding restores the session data from the n = m + k data/coding chunks, even with erasures]
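The split/encode/decode steps can be illustrated with a toy erasure code. This is a minimal sketch under stated assumptions: it uses polynomial interpolation over the prime field GF(257) rather than the GF(2^8) arithmetic of production Reed-Solomon codes, it does not reproduce the interleaving mentioned on the slide, and the function names `encode` and `decode` are hypothetical. It is not the prototype's implementation.

```python
P = 257  # smallest prime above 255, so every byte value is a field element


def _interpolate(points, x):
    """Evaluate at x the unique polynomial of degree < len(points)
    that passes through the given (xi, yi) points, modulo P."""
    total = 0
    for i, (xi, yi) in enumerate(points):
        num, den = 1, 1
        for j, (xj, _) in enumerate(points):
            if i != j:
                num = num * (x - xj) % P
                den = den * (xi - xj) % P
        total = (total + yi * num * pow(den, -1, P)) % P
    return total


def encode(data: bytes, m: int, k: int):
    """Split data into m data chunks and append k coding chunks.
    Returns n = m + k chunks, each a list of integers in [0, 256]."""
    size = -(-len(data) // m)                 # ceiling division
    padded = data.ljust(size * m, b"\0")      # pad so all chunks are equal in size
    chunks = [list(padded[i * size:(i + 1) * size]) for i in range(m)]
    for x in range(m, m + k):
        # Coding chunk x holds, per symbol position, the value at x of the
        # polynomial defined by the m data symbols at positions 0..m-1.
        chunks.append([
            _interpolate([(i, chunks[i][pos]) for i in range(m)], x)
            for pos in range(size)
        ])
    return chunks


def decode(available: dict, m: int, length: int) -> bytes:
    """Restore the original bytes from any m chunks,
    given as a mapping {chunk_index: chunk}."""
    if len(available) < m:
        raise ValueError("need at least m chunks")
    picked = sorted(available.items())[:m]
    size = len(picked[0][1])
    out = bytearray()
    for i in range(m):                        # rebuild data chunk i
        for pos in range(size):
            out.append(_interpolate([(x, c[pos]) for x, c in picked], i))
    return bytes(out[:length])
```

With m = 4 and k = 2, for example, any two of the six chunks may be lost and `decode` still returns the original session data.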
Kad-based Realization
Connection of access nodes (ANs) with P2P-based overlay
P2P protocol: Kad-based Distributed Hash Table (DHT) ring
[Figure: logical P2P network on top of the real topology: AN 1 to AN 4 form a Kad-based DHT ring]
[Figure: the session data chunks of AN 1 to AN 4 are distributed across the ANs on the DHT ring]
Structured chunk storage via DHT ring
Assignment of hash values to ANs and session data chunks
ANs save session data chunks with similar hash values
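A minimal sketch of this hash-based assignment follows. The assumptions are loud ones: truncated SHA-256 stands in for whatever 128 bit hash the prototype uses, "similar" is read as "smallest XOR distance" in line with Kad, and the helper names and the chunk key format are hypothetical.

```python
import hashlib


def kad_id(name: str) -> int:
    """Map a name to a 128 bit identifier (truncated SHA-256 as a stand-in hash)."""
    return int.from_bytes(hashlib.sha256(name.encode()).digest()[:16], "big")


def responsible_nodes(chunk_key: str, nodes: list, count: int = 1) -> list:
    """Return the `count` nodes whose identifiers are closest to the
    chunk's identifier under the XOR metric."""
    target = kad_id(chunk_key)
    return sorted(nodes, key=lambda node: kad_id(node) ^ target)[:count]
```

For instance, `responsible_nodes("AN 2/chunk 0", ["AN 1", "AN 2", "AN 3", "AN 4"], 2)` picks the two ANs that would store that chunk in this sketch.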
Block Diagram
The main components are…
[Figure: block diagram with the blocks controlling functionality (operation, external control), Kad functionality, ERC functionality, routing table, own session data, session data chunks (of other nodes), and DHCP server; labeled links: Kad packet/data transfer, result from Kad lookup, get/save data]
(1) module with controlling functionality
[Figure callouts: "Save Session Data!", "Time to Save Session Data!"]
(2) memory with own session data
(3) Kad block with ERC functionality
(4) routing table
(5) memory with session data chunks of other nodes
Summary
Successful development of P2P-based storage platform
Utilization of free RAM instead of scarce flash memory
Connection of access nodes by P2P overlay
High scalability and resilience towards network errors
Efficient sharing of RAM and computing resources
ERCs for high data availability & low redundancy
Completion of fully functional prototype
Backup: Related Work
J. Kubiatowicz et al., "OceanStore: An architecture for global-scale persistent storage", 2000
Schwarz, Xin, Miller, "Availability in Global Peer-To-Peer Storage Systems", 2004
Sattler, Hauswirth, Schmidt, "UniStore: Querying a DHT-based Universal Storage", 2007
Morariu, "DIPStorage: Distributed Storage of IP Flow Records", 2008
Backup: Kad-based DHT
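Address space as ring (overlay)
Address space (hash values): 0 to 2^m - 1
Peers on the ring: 611, 1008, 1622, 2011, 2207, X, Y
H(Peer X) = 2906
H(Peer Y) = 3485
Data "D": H("D") = 3107
Ranges on the ring: 611 - 1007, 1008 - 1621, 1622 - 2010, 2011 - 2206, 2207 - 2905, 2906 - 3484, 3485 - 610 (wrapping around)
D (3107) falls into the range 2906 - 3484, which starts at H(Peer X)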
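The ring figure can be read as a range lookup: each peer covers the hash values from its own hash up to just below the next peer's hash, and the last range wraps around. The sketch below encodes only this reading of the figure; the function name is hypothetical, and Kad itself decides closeness by XOR distance rather than by ranges.

```python
from bisect import bisect_right


def responsible_peer(peer_hashes, key_hash):
    """Return the hash of the peer whose range contains key_hash.
    Keys below the smallest peer hash wrap around to the largest peer."""
    ring = sorted(peer_hashes)
    # Index of the last peer hash <= key_hash; -1 wraps to the last peer.
    return ring[bisect_right(ring, key_hash) - 1]


ring = [611, 1008, 1622, 2011, 2207, 2906, 3485]
owner_of_d = responsible_peer(ring, 3107)  # 2906, i.e. H(Peer X)
```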
Backup: Kad Routing Table
Kad (eMule): 128 bit address space
Distances between hash values are calculated by the XOR metric
Binary tree with XOR distances of other peers to itself
Organized into k-buckets
Each peer knows many close peers
Each peer knows only few distant peers
Each peer has a life time
[Figure: binary tree over a 4 bit address space from 1111 to 0000, branching on 1/0 at each level]
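Why a peer knows relatively many close peers but only few distant ones can be sketched with the 4 bit address space from the figure. The class and parameter names are illustrative assumptions, and the bucket capacity of 2 is chosen only to keep the example small.

```python
from collections import defaultdict


def bucket_index(own_id: int, other_id: int) -> int:
    """Bucket number = position of the highest bit in which the two IDs differ.
    Bucket i therefore covers 2**i possible IDs."""
    distance = own_id ^ other_id          # XOR metric
    if distance == 0:
        raise ValueError("a peer does not store itself")
    return distance.bit_length() - 1


class RoutingTable:
    def __init__(self, own_id: int, bucket_size: int = 2):
        self.own_id = own_id
        self.bucket_size = bucket_size
        self.buckets = defaultdict(list)

    def add(self, peer_id: int) -> bool:
        """Store the contact unless its bucket is already full."""
        bucket = self.buckets[bucket_index(self.own_id, peer_id)]
        if peer_id in bucket or len(bucket) >= self.bucket_size:
            return False
        bucket.append(peer_id)
        return True
```

For a peer with ID 1111, the most distant bucket covers the eight IDs 0000 to 0111 but keeps at most two of them, while the closest bucket covers the single ID 1110 and can keep it.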
Backup: Kad Bootstrapping & Maintenance
Bootstrapping
New peer contacts a known peer and inserts itself on the ring
Maintenance
Contact peers from routing table with expired life time
Contact other peers periodically to learn new contacts
Backup: Kad Lookup Process
Searching peer selects peers close to target
These peers are contacted via a request
Some respond with new peers
[Figure: the searching peer sends requests (REQ) to possible contacts from its routing table in the 128 bit address space (00..00 to 11..11) and receives responses (RES); a searching tolerance surrounds the target]
Some of the new peers are contacted
Some of them respond
Responding peers within a defined search tolerance are sent an action request: Execute the action!
If they send an action response, a counter is increased
If the counter reaches a defined value, the lookup terminates
Otherwise, it is terminated via a timeout
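The lookup steps from the last three slides can be sketched as a small simulation. Everything here is an assumption made for illustration: the `network` dictionary stands in for REQ/RES messages, `max_rounds` stands in for the timeout, three peers are contacted per round, and every contacted peer within the tolerance is taken to answer the action request.

```python
def lookup(start_contacts, target, network, tolerance, wanted_answers, max_rounds=10):
    """Iteratively approach `target`.
    network: {peer_id: [peer ids that this peer returns in its response]}.
    Returns (number of action responses, whether the lookup completed)."""
    known = set(start_contacts)
    asked = set()
    answers = 0
    for _ in range(max_rounds):                    # stands in for the timeout
        # Select the not yet contacted peers closest to the target (XOR metric).
        candidates = sorted(known - asked, key=lambda p: p ^ target)[:3]
        if not candidates:
            break
        for peer in candidates:
            asked.add(peer)                        # REQ
            known.update(network.get(peer, []))    # RES with new peers
            if peer ^ target <= tolerance:         # within the search tolerance
                answers += 1                       # ACTION REQ / ACTION RES
                if answers == wanted_answers:
                    return answers, True           # counter reached: terminate
    return answers, False
```

In a 4 bit toy network, a search for target 15 starting from peer 1 would hop via peers 8 and 12 before reaching peers close enough to answer.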
Backup: Prototype
[Figure: prototype with customers, access node, P2P functionality, DHCP server, session filter, control module, and administrator (ISP); labeled interactions: administer, configure, start/stop, external control, get/save data, indicate changes, Kad traffic/data transfer, no DHCP traffic, no Kad traffic]
Backup: Related Issues
Benefit from using ERCs instead of data replication
Moderate quantitative memory savings
But significantly higher data availability
Kad network: the open-source implementation is of high quality!
Minimal traffic overhead introduced by Kad maintenance
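The availability benefit of ERCs over replication can be made concrete with a short calculation. The numbers are hypothetical: each AN is assumed to be available independently with probability p = 0.9, and both schemes are given the same twofold storage overhead (2 full copies versus m = 8 data chunks plus k = 8 coding chunks). None of these values come from the presentation.

```python
from math import comb


def erc_availability(m: int, k: int, p: float) -> float:
    """Probability that at least m of the n = m + k chunks are reachable,
    with each chunk on a different node that is up with probability p."""
    n = m + k
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(m, n + 1))


def replication_availability(copies: int, p: float) -> float:
    """Probability that at least one of the full copies is reachable."""
    return 1 - (1 - p)**copies


replicated = replication_availability(2, 0.9)   # 0.99
erasure_coded = erc_availability(8, 8, 0.9)     # above 0.99999
```

Under these assumptions, two replicas reach 99 %, while the erasure-coded variant with the same memory use exceeds the 99.999 % objective.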
Backup: Memory requirements & performance
Currently, the prototype is being ported to a Xilinx FPGA board
Long-term test/simulation of the prototype at our institute is planned
Functional verification
Determination of performance
Determination of memory requirements
Determination of CPU utilization