A Software Design and Algorithms for Multicore Capture in Data Center Forensics


Upload: marat-zhanikeev

Post on 16-May-2015


DESCRIPTION

With the rapid dissemination of cloud computing, data centers are quickly turning into platforms that host highly heterogeneous collections of services. Traditional approaches to security and performance management find it difficult to cope in such environments. Specifically, it is becoming increasingly difficult to capture and process all the necessary information at data centers in real time; packet capture at data center gateways is a practical example. This paper proposes a generic design for capturing and processing information on multicore architectures. The two main parts of the proposal are (1) an optimization formulation for distributing tasks across cores and (2) the practical design and implementation of a shared memory that processes can use to communicate in a non-traditional way, without memory locking or message passing.

TRANSCRIPT

Page 1: A Software Design and Algorithms for Multicore Capture in Data Center Forensics
Page 2: A Software Design and Algorithms for Multicore Capture in Data Center Forensics


On the Way IN: DC Forensics

M.Zhanikeev -- [email protected] -- Design and Algorithms for Multicore Capture in Data Center Forensics-- http://bit.do/marat140603 -- 2/28...

Page 3: A Software Design and Algorithms for Multicore Capture in Data Center Forensics

Forensics Basics

(Traditional) forensics stages are collection, examination, analysis, and reporting.

• many challenges in data centers

• collection: capturing in real time is very difficult

• examination: you can't examine what you can't collect; flexibility is also important

• analysis: a deeper form of examination, with the same problems

• reporting: this part is actually easy, but DCs have no standards

◦ one standard is offered later in this presentation

Page 4: A Software Design and Algorithms for Multicore Capture in Data Center Forensics

Forensics : All is Traffic

Statement: all information in data centers can be reduced to traffic form.

• logs are information carried in packets

• logging, storage, etc. are distributed, so they have to communicate using traffic

• a corollary: if something is not traffic, it might be useful to convert it into traffic

Page 5: A Software Design and Algorithms for Multicore Capture in Data Center Forensics

Practical DC Forensics

• we want Deep Packet Inspection (DPI) back on the table

• we want to capture everything rather than sample

• we want to differentiate the attention spent on different classes of traffic

◦ called context-based sampling

◦ probability of capture/inspection depends on the current context

• note: all of these have gradually been removed from practice as infeasible
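Context-based sampling can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation behind the slides: the traffic classes, the context names, the probability table, and the `should_capture` helper are all hypothetical.

```python
import random

# Hypothetical capture probabilities per (context, traffic class).
# In the "normal" context most bulk traffic is skipped; in "alert" everything is kept.
CAPTURE_PROBABILITY = {
    "normal": {"control": 1.0, "web": 0.2, "bulk": 0.05},
    "alert": {"control": 1.0, "web": 1.0, "bulk": 1.0},
}

def should_capture(traffic_class, context, rng=random):
    """Decide whether to capture one packet, given the current context."""
    # Unknown classes default to probability 0.0 (an assumption of this sketch).
    p = CAPTURE_PROBABILITY[context].get(traffic_class, 0.0)
    return rng.random() < p

if __name__ == "__main__":
    rng = random.Random(1)
    kept = sum(should_capture("web", "normal", rng) for _ in range(10_000))
    print(f"captured {kept} of 10000 web packets in the normal context")
```

Switching the context from "normal" to "alert" changes the capture probability for every class at once, which is the sense in which attention depends on context.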

Page 6: A Software Design and Algorithms for Multicore Capture in Data Center Forensics

Conventional Multicore

Page 7: A Software Design and Algorithms for Multicore Capture in Data Center Forensics

Generic Multicore Design

Page 8: A Software Design and Algorithms for Multicore Capture in Data Center Forensics

Generic Multicore Capture

• 2 roles: manager and core

• traditional parallel processing: message passing or shared memory 05 06

05 M.Aldinucci+2 "FastFlow: Efficient Parallel Streaming Applications on Multi-core" U.Pisa Techreport (2009)

06 R.Brightwell "Workshop on Managed Many-Core Systems" 1st Managed Many-Core Systems (2008)

Page 9: A Software Design and Algorithms for Multicore Capture in Data Center Forensics

Conventional Shortcomings

Reality is that traditional parallel processing designs are extremely inefficient on multicore.

• overhead from parallelization is too high

• unit of processing is too small

• streamline designs are rare but have recently been discussed in BigData 08

The solution is to use a lockfree (message-less) parallelization design.

08 R.Chen+2 "Tiled-MapReduce: Optimizing Resource Usages ... on Multicore with Tiling" 19th PACT (2010)

Page 10: A Software Design and Algorithms for Multicore Capture in Data Center Forensics

Conventional → Proposed

• spawn, but don't wait to merge

• collect results from cores continuously to avoid lumps

• get used to not being able to communicate with cores (no messages)

◦ relatively short tasks diminish this effect 02

02 myself+0 "Experiments with Practical On-Demand Multi-Core Packet Capture" APNOMS (2013)
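The "spawn, but don't wait to merge" idea can be illustrated with a small sketch. It is only an analogy: it uses Python threads and futures for brevity, whereas the slides describe processes on cores communicating over shared memory. The `process_batch` and `collect_continuously` names are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed

def process_batch(batch):
    """Stand-in for the per-core work: here it just counts the items."""
    return len(batch)

def collect_continuously(batches, workers=4):
    """Spawn all tasks, then consume each result as soon as it is ready."""
    results = []
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [pool.submit(process_batch, b) for b in batches]
        # as_completed yields futures in completion order, so no result
        # waits behind a slower task submitted earlier.
        for future in as_completed(futures):
            results.append(future.result())
    return results

if __name__ == "__main__":
    print(collect_continuously([[1, 2], [3], [4, 5, 6]]))
```

Results arrive in completion order rather than submission order, so the consumer has to tolerate reordering. That is the price of avoiding a merge barrier.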

Page 11: A Software Design and Algorithms for Multicore Capture in Data Center Forensics

Proposal : the New Multicore

Page 12: A Software Design and Algorithms for Multicore Capture in Data Center Forensics

Proposal : Mission Statement

Proposal components:

• lockfree design

• tasks-into-cores packing problem and optimization

• implementation that supports lockfree design

• remember: the easiest way to aggregate traffic is to use IP address prefixes

• again, generic, so we do not care about the contents
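Aggregation by IP address prefix can be sketched with Python's standard `ipaddress` module. The `aggregate_by_prefix` helper and the default prefix length of 16 are assumptions made for illustration only.

```python
import ipaddress
from collections import Counter

def aggregate_by_prefix(addresses, prefix_len=16):
    """Count packets per IPv4 prefix of the given length."""
    counts = Counter()
    for addr in addresses:
        # strict=False masks the host bits instead of raising an error.
        net = ipaddress.ip_network(f"{addr}/{prefix_len}", strict=False)
        counts[str(net)] += 1
    return counts

if __name__ == "__main__":
    sample = ["10.1.2.3", "10.1.200.9", "192.168.0.1"]
    print(aggregate_by_prefix(sample))
```

Only the address is inspected, never the payload, which matches the "generic" point in the list above.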

Page 13: A Software Design and Algorithms for Multicore Capture in Data Center Forensics

Proposal : Shared Memory

• communication happens over shared memory 04

• C/C++ implementation is common, but it will work in other languages as well

• shared memory is persistent, but cores come and go

04 K.Michael "The Linux Programming Interface" No Starch Press (2010)
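One way to communicate over shared memory without locks or messages is to give every core its own slot that only it writes, while the manager only reads. The sketch below assumes that layout. It is not taken from the slides, and the slot format and the `create_region`, `core_write`, and `manager_read` names are hypothetical. It runs in a single process for brevity; a real core process would attach to the region by name.

```python
import struct
from multiprocessing import shared_memory

SLOT = struct.Struct("<QQ")  # assumed slot layout: (sequence number, value)

def create_region(cores):
    """Manager side: allocate one zero-initialized slot per core."""
    size = SLOT.size * cores
    shm = shared_memory.SharedMemory(create=True, size=size)
    shm.buf[:size] = bytes(size)
    return shm

def core_write(shm, core_id, value):
    """Core side: a core writes only its own slot, so no lock is taken."""
    offset = core_id * SLOT.size
    seq, _ = SLOT.unpack_from(shm.buf, offset)
    SLOT.pack_into(shm.buf, offset, seq + 1, value)

def manager_read(shm, cores):
    """Manager side: read every slot without writing to any of them."""
    return [SLOT.unpack_from(shm.buf, i * SLOT.size) for i in range(cores)]

if __name__ == "__main__":
    shm = create_region(cores=2)
    try:
        core_write(shm, 0, 42)
        print(manager_read(shm, 2))
    finally:
        shm.close()
        shm.unlink()  # the region outlives its writers until it is unlinked
```

A 16-byte slot update is not atomic, so a production design would need extra care against torn reads; this sketch ignores that.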

Page 14: A Software Design and Algorithms for Multicore Capture in Data Center Forensics

Proposal : DLL is Key

DLL stands for Doubly Linked List.

• common in C/C++ designs

• extremely flexible -- you can swap elements by reassigning pointers

• sideways DLL is a method to avoid collisions in hashing
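The two DLL properties above can be sketched together: colliding keys hang "sideways" off a hash bucket as a doubly linked chain, and removal is done purely by reassigning pointers. This is one reading of the slide, written in Python rather than C/C++; the `Node` and `ChainedTable` names are hypothetical.

```python
class Node:
    def __init__(self, key, value):
        self.key, self.value = key, value
        self.prev = self.next = None

class ChainedTable:
    """Hash table whose buckets are doubly linked chains of nodes."""

    def __init__(self, buckets=8):
        self.heads = [None] * buckets

    def _index(self, key):
        return hash(key) % len(self.heads)

    def find(self, key):
        node = self.heads[self._index(key)]
        while node and node.key != key:
            node = node.next
        return node

    def put(self, key, value):
        node = self.find(key)
        if node:
            node.value = value
            return node
        node = Node(key, value)
        i = self._index(key)
        node.next = self.heads[i]  # link in at the front of the chain
        if node.next:
            node.next.prev = node
        self.heads[i] = node
        return node

    def remove(self, node):
        """Unlink a node in O(1) by reassigning its neighbours' pointers."""
        if node.prev:
            node.prev.next = node.next
        else:
            self.heads[self._index(node.key)] = node.next
        if node.next:
            node.next.prev = node.prev
        node.prev = node.next = None
```

With four buckets, the integer keys 1, 5, and 9 all land in the same bucket and are kept apart by the chain rather than overwriting each other.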

Page 15: A Software Design and Algorithms for Multicore Capture in Data Center Forensics

Optimization

Page 16: A Software Design and Algorithms for Multicore Capture in Data Center Forensics

Optimization Targets

• few cores, many data units

• need to pack the latter into the former

• moreover: a scheduling problem, which is packing but along the timeline

• moreover (2): when packing, do you randomize input or not -- hashing

Page 17: A Software Design and Algorithms for Multicore Capture in Data Center Forensics

Prefix Packing Problem

minimize $w_1\,\mathrm{count}(P) + w_2\,\max(M) + w_3\,\mathrm{var}(C)$

subject to $k_1 < p_i < k_2 \quad \forall\, p_i \in P$.

[Figure: a 32-bit hashkey with the effective prefix range between k1 (shortest) and k2 (longest); prefixes p1 to p8 packed into Core0, Core1, Core2, ...; m (max) and n are marked.]

• prefix length is between k1 and k2

◦ hashkey or raw

◦ fixed in each run in this paper

• pi is a pack (group) of items

• n total items, mapped to a set M of prefixes in each of m cores

• C is a set of item counts c across prefixes
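The objective can be evaluated for a candidate packing as in the sketch below. The slide does not spell out every term, so the following are assumptions: count(P) is the number of prefixes in use, max(M) is the largest per-core item load, and var(C) is the population variance of per-prefix item counts. The `packing_cost` name and the default weights are hypothetical, and the constraint on prefix length is not checked here.

```python
from statistics import pvariance

def packing_cost(packing, counts, w1=1.0, w2=1.0, w3=1.0):
    """Evaluate w1*count(P) + w2*max(M) + w3*var(C) for one packing.

    packing: list of per-core lists of prefixes
    counts:  dict mapping prefix -> number of items under that prefix
    """
    prefixes = [p for core in packing for p in core]
    core_loads = [sum(counts[p] for p in core) for core in packing]
    return (w1 * len(prefixes)                 # count(P)
            + w2 * max(core_loads)             # assumed meaning of max(M)
            + w3 * pvariance([counts[p] for p in prefixes]))  # var(C)

if __name__ == "__main__":
    counts = {"a": 4, "b": 2, "c": 6}
    print(packing_cost([["a", "b"], ["c"]], counts))
```

For the example in the `__main__` block, the three terms are 3 prefixes, a maximum core load of 6, and a variance of 8/3.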

Page 18: A Software Design and Algorithms for Multicore Capture in Data Center Forensics

Prefix Packing GA Heuristic

• Genetic Algorithm (GA) 12

• chromosome is a tuple of prefixes packed into one core

$g_i = \langle p_{i,1}, p_{i,2}, \ldots, p_{i,m} \rangle$. (1)

• one gene (whole solution) is a tuple containing all chromosomes

$G_j = \langle g_1, g_2, \ldots, g_n \rangle$. (2)

12 D.Knysh+1 "Parallel Genetic Algorithms: a Survey and Problem State of the Art" IJCSS (2010)
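A GA-style search over packings can be sketched as below. It is a simplified illustration, not the heuristic evaluated in the slides: it uses mutation and truncation selection only (no crossover), it minimizes just the maximum per-core load, and it stores a solution as a flat prefix-to-core map instead of the tuples in (1) and (2). All function names and parameters are hypothetical.

```python
import random

def max_load(assignment, counts, cores):
    """Largest total item count placed on any single core."""
    loads = [0] * cores
    for prefix, core in assignment.items():
        loads[core] += counts[prefix]
    return max(loads)

def mutate(assignment, cores, rng):
    """Copy a solution and move one randomly chosen prefix to a random core."""
    child = dict(assignment)
    prefix = rng.choice(sorted(child))
    child[prefix] = rng.randrange(cores)
    return child

def pack(counts, cores, generations=200, population=20, seed=0):
    """Evolve prefix-to-core assignments; population is assumed to be even."""
    rng = random.Random(seed)
    prefixes = sorted(counts)
    pop = [{p: rng.randrange(cores) for p in prefixes}
           for _ in range(population)]
    for _ in range(generations):
        pop.sort(key=lambda a: max_load(a, counts, cores))
        survivors = pop[: population // 2]        # keep the better half
        children = [mutate(rng.choice(survivors), cores, rng)
                    for _ in survivors]
        pop = survivors + children
    return min(pop, key=lambda a: max_load(a, counts, cores))

if __name__ == "__main__":
    counts = {"p1": 5, "p2": 3, "p3": 8, "p4": 1, "p5": 7, "p6": 2}
    best = pack(counts, cores=3)
    print(best, max_load(best, counts, 3))
```

Because the better half of each generation always survives, the best solution never gets worse from one generation to the next.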

Page 19: A Software Design and Algorithms for Multicore Capture in Data Center Forensics

Analysis

Page 20: A Software Design and Algorithms for Multicore Capture in Data Center Forensics

Analysis Setup

• actual packet traces -- trace-based simulation 16

• input: 2 cases -- hashing versus raw

• items are individual packets

◦ packets are packed into prefixes

◦ prefixes are packed into cores

• the above GA optimization heuristic

16 myself "MAWI Working Group Traffic Archive" http://mawi.wide.ad.jp/mawi (2014)
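The two input cases, hashed versus raw, can be sketched as two ways of deriving a prefix from a 32-bit key. The use of CRC32 as the hash and the `prefix_key` helper are assumptions for illustration; the slides do not name the hash function.

```python
import ipaddress
import zlib

def prefix_key(address, prefix_len, hashed=False):
    """Return the top prefix_len bits of a 32-bit key for one IPv4 address."""
    raw = int(ipaddress.IPv4Address(address))
    # Hashed case: randomize the key first (CRC32 is an assumed stand-in).
    key = zlib.crc32(raw.to_bytes(4, "big")) if hashed else raw
    return key >> (32 - prefix_len)

if __name__ == "__main__":
    for addr in ("192.168.1.7", "192.168.200.1"):
        print(addr, prefix_key(addr, 16), prefix_key(addr, 16, hashed=True))
```

In the raw case, neighbouring addresses share a prefix and so stay in the same pack; hashing generally scatters them across prefixes.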

Page 21: A Software Design and Algorithms for Multicore Capture in Data Center Forensics

Analysis (1) Cores

[Figure: log(max item count / core), ranging from about 4.6 to 5.5, versus time sequence (0 to 9), with one curve each for 1 to 7 cores.]

Page 22: A Software Design and Algorithms for Multicore Capture in Data Center Forensics

Analysis (2) Hashing

[Figure: number of unique prefixes (0 to 240) versus increasing cutoff parameter (0 to 1), for hashed and raw input.]

Page 23: A Software Design and Algorithms for Multicore Capture in Data Center Forensics

Forensics 2.0

Page 24: A Software Design and Algorithms for Multicore Capture in Data Center Forensics

Forensics 2.0

• reporting part: let's use sketches from data streaming 11

[Figure: a manager and cores (Core 1 ... Core X) read and prepare data through shared memory; a cursor marks "now" (replay) on a big-data timeline along the time direction, with one sketch per start/end interval. Other labels: TABID.]
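A sketch from data streaming summarizes a stream in fixed memory. The slides do not say which sketch is meant, so the example below uses a count-min sketch purely as an assumed stand-in; the class name, width, depth, and choice of BLAKE2 hashing are all illustrative.

```python
import hashlib

class CountMinSketch:
    """Fixed-size frequency summary; estimates never undercount."""

    def __init__(self, width=256, depth=4):
        self.width, self.depth = width, depth
        self.rows = [[0] * width for _ in range(depth)]

    def _columns(self, item):
        # One independent-looking hash per row, derived via a per-row salt.
        for row in range(self.depth):
            digest = hashlib.blake2b(item.encode(), digest_size=8,
                                     salt=row.to_bytes(2, "big")).digest()
            yield row, int.from_bytes(digest, "big") % self.width

    def add(self, item, count=1):
        for row, col in self._columns(item):
            self.rows[row][col] += count

    def estimate(self, item):
        return min(self.rows[row][col] for row, col in self._columns(item))

if __name__ == "__main__":
    sketch = CountMinSketch()
    for prefix in ["10.1.0.0/16"] * 5 + ["10.2.0.0/16"] * 2:
        sketch.add(prefix)
    print(sketch.estimate("10.1.0.0/16"))
```

One such structure per time interval would give a fixed-size report per interval, in the spirit of the "one sketch" labels in the figure.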

11 M.Sung+3 "Scalable and Efficient Data Streaming Algorithms for Detecting Common Content..." ICDE (2006)

Page 25: A Software Design and Algorithms for Multicore Capture in Data Center Forensics

Wrapup

• a natively multicore technology is proposed

• performance is optimized using a packing heuristic

• raw input is found to be preferable to randomization

• future topics:

1. variable-length prefixes

2. optimization along the timeline

3. jitter minimization (fewer reassignments)

4. further lookup optimization -- fast hashing

Page 26: A Software Design and Algorithms for Multicore Capture in Data Center Forensics

That’s all, thank you ...

Page 27: A Software Design and Algorithms for Multicore Capture in Data Center Forensics

[01] myself+0 (2013) "...community-based architecture for measuring E2E QoS at DCs" IJCSE

[02] myself+0 (2013) "Experiments with Practical On-Demand Multi-Core Packet Capture" APNOMS

[03] myself+1 (2013) "A Graphical Method for Detection of Flash Crowds in Traffic" Telecom. Systems (TM)

[04] K.Michael (2010) "The Linux Programming Interface" No Starch Press

[05] M.Aldinucci+2 (2009) "FastFlow: Efficient Parallel Streaming Applications on Multi-core" U.Pisa Techreport

Page 28: A Software Design and Algorithms for Multicore Capture in Data Center Forensics

[06] R.Brightwell (2008) "Workshop on Managed Many-Core Systems" 1st Managed Many-Core Systems

[07] X.Sui+3 (2010) "Parallel Graph Partitioning on Multicore Architectures" 23rd LCPC

[08] R.Chen+2 (2010) "Tiled-MapReduce: Optimizing Resource Usages ... on Multicore with Tiling" 19th PACT

[09] I.Machdi+2 (2009) "Executing parallel TwigStack algorithm on a multi-core system" 11th IIWAS

[10] S.Stoichev+1 (2009) "Parallel Algorithm for Integer Sorting with Multi-Core Processors" IT and Control

[11] M.Sung+3 (2006) "Scalable and Efficient Data Streaming Algorithms for Detecting Common Content..." ICDE

Page 29: A Software Design and Algorithms for Multicore Capture in Data Center Forensics

[12] D.Knysh+1 (2010) "Parallel Genetic Algorithms: a Survey and Problem State of the Art" IJCSS

[13] Luca Deri (2009) "Modern Packet Capture and Analysis: Multi-Core, Multi-Gigabit, and Beyond" IM

[14] myself (2014) "MCoreMemory project page" https://github.com/maratishe/mcorememory

[15] myself (2013) "Rings-on-Cores project page" https://github.com/maratishe/ringsNcores

[16] myself (2014) "MAWI Working Group Traffic Archive" http://mawi.wide.ad.jp/mawi

Page 31: A Software Design and Algorithms for Multicore Capture in Data Center Forensics

Extras (1) Per-Unit Cost

[Figure: three stages (Stage 1, Stage 2, Stage 3) with increasing per-unit cost. Labels: Hashing, Manager, Prefix Matching, Cores that do not match, Process.]

Page 32: A Software Design and Algorithms for Multicore Capture in Data Center Forensics

Extras (2) Shared Memory Trick
