A P2P-based Storage Platform for Storing Session Data in Internet Access Networks

T. Bahls, D. Duchow

Nokia Siemens Networks

Broadband Access Division

Greifswald, Germany

World Telecommunication Congress 2010

Network & Service Management Reliability

September 13-14

Peter Danielis, M. Gotzmann, D. Timmermann

University of Rostock, Germany

Institute of Applied Microelectronics

and Computer Engineering

Outline

Introduction & Motivation

Utilization of P2P Technology

Erasure Resilient Codes for High Data Availability

Realization of the P2P-based Storage Platform

Summary

Introduction & Motivation

Internet Service Providers (ISPs) provide Internet access

Access nodes (ANs) = essential network elements

E.g., DSLAMs (Digital Subscriber Line Access Multiplexers)

ANs have to be powerful but well-priced: ANs ≠ servers!

Budget with available resources!

ANs need resets (or may fail), yet data must not be lost!

AN configuration data needs to be saved persistently!

But there's more…

[Figure: customers connect via AN 1 to AN 4 to the Internet]

Data, called session data, …

… comprises MAC/IP addresses and IP lease times of customers

… is required for data forwarding/traffic filtering

… has to be always available, so persistent storage is needed

… is highly volatile due to continuous changes

DHCP Request: I have MAC address 00-50-04-E1-15-A0!

DHCP Response: Your IP address is 139.30.201.254 for 60 min!

Session data entry (Active changes from No to Yes):

MAC address: 00-50-04-E1-15-A0

IP address: 139.30.201.254

Lease Time: 60 min

Active: Yes
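The session data entry on these slides can be pictured as a small record that the AN consults when filtering traffic. The following is a minimal sketch only; the names `SessionEntry`, `on_dhcp_request`, `on_dhcp_response`, and `may_forward` are hypothetical and do not come from the presentation.

```python
from dataclasses import dataclass


@dataclass
class SessionEntry:
    """One customer session as listed on the slide (field names are assumptions)."""
    mac: str
    ip: str = ""
    lease_min: int = 0
    active: bool = False


def on_dhcp_request(table, mac):
    """Record the requesting MAC address; the session is not active yet."""
    return table.setdefault(mac, SessionEntry(mac))


def on_dhcp_response(table, mac, ip, lease_min):
    """Store the leased IP address and lease time and mark the session active."""
    entry = table.setdefault(mac, SessionEntry(mac))
    entry.ip, entry.lease_min, entry.active = ip, lease_min, True
    return entry


def may_forward(table, mac, ip):
    """Traffic filter: forward only if MAC and IP match an active session."""
    entry = table.get(mac)
    return entry is not None and entry.active and entry.ip == ip


table = {}
on_dhcp_request(table, "00-50-04-E1-15-A0")
on_dhcp_response(table, "00-50-04-E1-15-A0", "139.30.201.254", 60)
```

Because every DHCP exchange rewrites such entries, the table changes continuously, which is why the slides call session data highly volatile.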

Today: ANs store session data in persistent flash memory

Problem: Flash memory has limited availability/rewritability

ISPs "sacrifice" flash memory for session data reluctantly

Solution: Use available volatile RAM resources of ANs!

[Figure: customers area (customers), access area (access node, e.g., DSL Access Multiplexer, with persistent flash memory and volatile RAM), core network area (Broadband Remote Access Server, Internet Service Provider, Internet)]

Average AN, e.g., PowerQUICC III (Freescale Semiconductor)

RAM capacity = 1 Gbyte + unlimited rewritability (40% free)

Calculating capacity = 1234 Dhrystone MIPS (40% free)

Problem: How to efficiently utilize available resources?

What options does P2P offer?

...beyond the incriminated applications, of course.

New networking paradigm

No clients and servers anymore

All peers form a self-organizing network

Network = storage resource

Network = computing resource

Scalability and resilience = intrinsic features

Proven concept (BitTorrent, Zattoo, Joost)

Utilization of P2P technology

Networking paradigm: Each AN is part of a logical P2P overlay on its uplink

Network = storage resource: Each AN stores just a piece of session data (storage capacity of ANs)

Network = computing resource: Each AN implements the P2P protocol

But ANs may become unavailable…

Problem: How to ensure high data availability?

Erasure Resilient Codes (ERCs) for High Data Availability

Objective: High session data availability = 99.999 %

Simple replication wastes memory resources

Reed-Solomon codes

Split session data of each AN into m data chunks

Encoding: Add k interleaved coding chunks, giving n = m + k chunks

Decoding: Restore session data from any m of n chunks (up to k erasures are tolerated)

[Figure: session data is split into m session data chunks, encoding adds k coding chunks, decoding restores the session data from the n = m + k data/coding chunks despite erasures]
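The "any m of n" property can be illustrated with a toy Reed-Solomon-style erasure code based on polynomial interpolation. This is a sketch under assumptions, not the prototype's coder: it works over the prime field GF(257) for readability (production Reed-Solomon codes usually use GF(2^8)), it treats each chunk as a single symbol, and the function names are hypothetical.

```python
P = 257  # prime modulus; an assumption of this sketch, chosen so that byte values 0..255 fit


def _interpolate_at(points, x):
    """Evaluate at x the unique polynomial of degree < len(points) through `points`, mod P."""
    total = 0
    for i, (xi, yi) in enumerate(points):
        num, den = 1, 1
        for j, (xj, _) in enumerate(points):
            if i != j:
                num = num * (x - xj) % P
                den = den * (xi - xj) % P
        # pow(den, -1, P) is the modular inverse (Python 3.8+)
        total = (total + yi * num * pow(den, -1, P)) % P
    return total


def encode(data, k):
    """Return n = m + k symbols: the m data symbols followed by k coding symbols."""
    m = len(data)
    points = list(enumerate(data))  # data symbol i is the polynomial value at x = i
    coding = [_interpolate_at(points, x) for x in range(m, m + k)]
    return list(data) + coding


def decode(surviving, m):
    """Restore the m data symbols from any m surviving chunks given as {index: symbol}."""
    if len(surviving) < m:
        raise ValueError("need at least m chunks")
    points = sorted(surviving.items())[:m]
    return [_interpolate_at(points, x) for x in range(m)]


data = [10, 200, 33, 47]                      # m = 4 data symbols
chunks = encode(data, k=2)                    # n = 6 chunks
surviving = {i: c for i, c in enumerate(chunks) if i not in (1, 4)}  # two erasures
restored = decode(surviving, m=4)
```

For real session data the same steps would run once per byte position of the chunks, and a coding symbol may take the value 256, which is one reason practical codes prefer GF(2^8).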

Kad-based Realization

Connection of access nodes (ANs) with P2P-based overlay

P2P protocol: Kad-based Distributed Hash Table (DHT) ring

Structured chunk storage via DHT ring

Assignment of hash values to ANs and session data chunks

ANs save session data chunks with similar hash values

[Figure: logical P2P network on top of real topology: Kad-based DHT ring with AN 1 to AN 4, their customers, and an admin; the session data chunks held by the ANs are labelled Chunk of AN 1 to Chunk of AN 4]
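The rule "ANs save session data chunks with similar hash values" can be sketched as follows. The sketch assumes MD5 as a stand-in 128-bit hash, hypothetical chunk labels of the form `AN 1/chunk-0`, and XOR distance as the measure of similarity; none of these details are confirmed by the slides.

```python
import hashlib


def kad_id(label: str) -> int:
    """128-bit identifier derived with MD5 (the hash function is an assumption)."""
    return int.from_bytes(hashlib.md5(label.encode("utf-8")).digest(), "big")


def closest_nodes(key: int, node_ids, count: int = 1):
    """Return the `count` node IDs with the smallest XOR distance to `key`."""
    return sorted(node_ids, key=lambda node_id: node_id ^ key)[:count]


def place_chunks(owner: str, n_chunks: int, node_names):
    """Map each chunk label of `owner` to the name of the node that would store it."""
    ids = {kad_id(name): name for name in node_names}
    placement = {}
    for i in range(n_chunks):
        label = f"{owner}/chunk-{i}"  # hypothetical naming scheme
        placement[label] = ids[closest_nodes(kad_id(label), ids)[0]]
    return placement


nodes = ["AN 1", "AN 2", "AN 3", "AN 4"]
placement = place_chunks("AN 1", 6, nodes)
```

This sketch does not exclude the owner from storing its own chunks; whether the platform does so is not stated on the slides.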

Block Diagram

The main components are…

(1) module with controlling functionality

(2) memory with own session data

(3) Kad block with ERC functionality

(4) routing table

(5) memory with session data chunks of other nodes

[Figure: block diagram with the blocks Controlling Functionality, Own Session Data, Kad Functionality, ERC Functionality, Routing Table, Session Data Chunks (of other nodes), and DHCP Server; labelled signals: External control, Operation, Get/Save Data, Kad Packet/Data Transfer, Result from Kad lookup, "Time to Save Session Data!", "Save Session Data!"]

Summary

Successful development of P2P-based storage platform

Utilization of free RAM instead of scarce flash memory

Connection of access nodes by P2P overlay

High scalability and resilience towards network errors

Efficient sharing of RAM and computing resources

ERCs for high data availability & low redundancy

Completion of fully functional prototype

Thank you! Any questions?

peter.danielis@uni-rostock.de

http://www.imd.uni-rostock.de/networking

Backup: Related Work

J. Kubiatowicz et al., "OceanStore: An architecture for global-scale persistent storage", 2000

Schwarz, Xin, Miller, "Availability in Global Peer-To-Peer Storage Systems", 2004

Sattler, Hauswirth, Schmidt, "UniStore: Querying a DHT-based Universal Storage", 2007

Morariu, "DIPStorage: Distributed Storage of IP Flow Records", 2008

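Backup: Kad-based DHT

Address space as ring (overlay)

Address space (hash values): 0 to 2^m - 1

Peer hash values on the ring: 611, 1008, 1622, 2011, 2207, 2906 (Peer X), 3485 (Peer Y)

H(Peer X) = 2906

H(Peer Y) = 3485

Ring segments: 611 - 1007, 1008 - 1621, 1622 - 2010, 2011 - 2206, 2207 - 2905, 2906 - 3484, 3485 - 610 (wrapping around past 2^m - 1 to 0)

Data "D": H("D") = 3107, which lies in the segment 2906 - 3484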
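The ring figure can be read as a lookup from a hash value to the segment that contains it. The sketch below assumes that each segment is served by the peer whose hash value starts it, which is an interpretation of the figure rather than a statement from the slides; `segment_start` is a hypothetical helper.

```python
import bisect

# Peer hash values taken from the ring figure, in ascending order
PEERS = [611, 1008, 1622, 2011, 2207, 2906, 3485]


def segment_start(key, peers=PEERS):
    """Return the peer hash value that starts the ring segment containing `key`."""
    i = bisect.bisect_right(peers, key) - 1
    # i == -1 means key < 611: Python's negative index then selects the last
    # peer (3485), which models the wrap-around segment 3485 - 610.
    return peers[i]


holder_of_d = segment_start(3107)  # H("D") = 3107 lies in 2906 - 3484 (Peer X)
```

Kad itself measures closeness with the XOR metric rather than by numeric ranges, so this ring lookup only mirrors the simplified picture on the slide.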

Kad (eMule): 128 bit address space

Distances between hash values are calculated by the XOR metric

Binary tree with XOR distances of other peers to itself

Organized into k-buckets

Each peer knows many close peers

Each peer knows only few distant peers

Each peer has a life time

Backup: Kad Routing Table

[Figure: binary tree over a 4 bit address space from 1111 to 0000, branching into 1 and 0 at each of the four levels]
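How XOR distances sort peers into k-buckets can be sketched for the 4 bit address space of the figure. The bucket capacity `k = 2` and the helper names are assumptions made for illustration.

```python
def bucket_index(own_id: int, other_id: int) -> int:
    """Index of the highest bit in which the two IDs differ (0 = closest bucket)."""
    distance = own_id ^ other_id
    if distance == 0:
        raise ValueError("a peer does not store itself")
    return distance.bit_length() - 1


def build_table(own_id: int, peers, k: int = 2, bits: int = 4):
    """Fill one bucket per bit position with at most k peers each."""
    buckets = [[] for _ in range(bits)]
    for peer in peers:
        if peer == own_id:
            continue
        bucket = buckets[bucket_index(own_id, peer)]
        if len(bucket) < k:  # further peers for a full bucket are ignored here
            bucket.append(peer)
    return buckets


table = build_table(0b1111, range(16))
```

Bucket 3 covers the eight most distant IDs (0000 to 0111) but keeps only k of them, while bucket 0 covers a single ID, so a peer knows a large share of its close neighbours and only a few distant peers.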

Backup: Kad Bootstrapping & Maintenance

Bootstrapping

New peer contacts a known peer and inserts itself on ring

Maintenance

Contact peers from routing table with expired life time

Contact other peers periodically to learn new contacts

Backup: Kad Lookup Process

Searching peer selects peers close to target (possible contacts from routing table)

These peers are contacted via a request (REQ)

Some respond with new peers (RES)

Some of the new peers are contacted

Some of them respond

Responding peers within a defined search tolerance receive an action request: Execute the action!

If they send an action response, a counter is increased

If counter == defined value, the lookup terminates

Otherwise, it is terminated via a timeout

[Figure: 128 bit address space from 00..00 to 11..11 showing the searching peer, the target, the searching tolerance around the target, and the REQ/RES and ACTION REQ/ACTION RES messages]
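The lookup steps can be traced with a small simulation. It is a simplified sketch: the network is a plain dictionary from peer ID to known contacts, every contacted peer responds, a round limit stands in for the timeout, and the names `kad_lookup`, `alpha`, and `max_rounds` are hypothetical.

```python
def kad_lookup(start_contacts, target, network, tolerance, needed_answers,
               alpha=3, max_rounds=10):
    """Approach `target` iteratively; return the peers that answered the action request."""
    known = set(start_contacts)   # possible contacts from the routing table
    queried = set()
    answered = []                 # its length is the counter from the slide
    for _ in range(max_rounds):   # the round limit replaces the timeout
        candidates = sorted(known - queried, key=lambda p: p ^ target)[:alpha]
        if not candidates:
            break
        for peer in candidates:
            queried.add(peer)
            known.update(network.get(peer, []))  # REQ / RES: learn new peers
            if peer ^ target <= tolerance:       # peer lies within the search tolerance
                answered.append(peer)            # ACTION REQ / ACTION RES
                if len(answered) == needed_answers:
                    return answered
    return answered


# Toy overlay in a 4 bit address space (contacts chosen arbitrarily)
network = {
    0b0001: [0b0100, 0b1000],
    0b1000: [0b1100, 0b1010],
    0b1100: [0b1110, 0b1111],
    0b1010: [0b1011],
    0b1110: [0b1111],
    0b1111: [0b1110],
}
result = kad_lookup([0b0001], target=0b1111, network=network,
                    tolerance=1, needed_answers=2)
```

In this toy overlay the search starts at peer 0001, moves through 1000 and 1100, and ends once the two peers within tolerance 1 of the target have answered.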

Backup: Prototype

[Figure: prototype setup with customers, an administrator (ISP), and an access node containing P2P functionality, a control module, external control, a session filter, and a DHCP server; labelled interactions: Administer, Start/Stop, Configure, Get/Save Data, Indicate Changes, Kad Traffic/Data Transfer, No DHCP Traffic, No Kad Traffic]

Backup: Related Issues

Benefit from using ERCs instead of data replication

Moderate quantitative memory savings

But significantly higher data availability

Kad network: open source is high quality!

Minimal traffic overhead introduced by Kad maintenance
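The availability benefit of ERCs over replication can be estimated with a short calculation. The sketch assumes that nodes fail independently and that each is available with the same probability p; the example parameters are illustrative and are not figures from the presentation.

```python
from math import comb


def replication_availability(p: float, replicas: int) -> float:
    """Data is available if at least one of the replicas is reachable."""
    return 1 - (1 - p) ** replicas


def erc_availability(p: float, m: int, k: int) -> float:
    """Data is available if at least m of the n = m + k chunks are reachable."""
    n = m + k
    return sum(comb(n, i) * p ** i * (1 - p) ** (n - i) for i in range(m, n + 1))


p = 0.9                                   # assumed availability of a single AN
rep = replication_availability(p, 3)      # 3 full copies: 3x storage
erc = erc_availability(p, m=4, k=8)       # 12 chunks of 1/4 size each: also 3x storage
```

With these assumed numbers both schemes store three times the original data, yet the erasure-coded variant tolerates eight unavailable nodes where replication tolerates two, which is the effect the slide summarizes as higher data availability.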

Backup: Memory requirements & performance

Currently, the prototype is being ported to a Xilinx FPGA board

Long-term test/simulation of the prototype at our institute is intended

Functional verification

Determination of performance

Determination of memory requirements

Determination of CPU utilization
