Transcript
Page 1: A Software Design and Algorithms for Multicore Capture in Data Center Forensics

On the Way IN: DC Forensics

M.Zhanikeev -- [email protected] -- Design and Algorithms for Multicore Capture in Data Center Forensics-- http://bit.do/marat140603 -- 2/28...


Forensics Basics

(Traditional) forensics stages are collection, examination, analysis, and reporting

• many challenges in data centers

• collection: doing it in realtime is extremely difficult

• examination: you can't examine what you can't collect; flexibility is also important

• analysis: a deeper form of examination, with the same problems

• reporting: that part is actually easy, but DCs have no standards
  ◦ one standard is offered later in this presentation


Forensics : All is Traffic

Statement: all information in data centers can be reduced to traffic form

• logs are information carried in packets

• logging, storage, etc. are distributed, so they have to communicate using traffic

• a corollary: if something is not traffic, it might be useful to convert it into traffic


Practical DC Forensics

• we want Deep Packet Inspection (DPI) back on the table

• we want to capture everything rather than sample

• we want to give different classes of traffic different amounts of attention
  ◦ called context-based sampling
  ◦ probability of capture/inspection depends on the current context

• note: all of these have gradually been dropped from practice as infeasible
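A minimal sketch of context-based sampling, assuming a per-class capture probability table. The class names, the probabilities and the `under_investigation` flag are hypothetical and not taken from the slides; they only illustrate how the capture decision can depend on the current context.

```python
import random

# Hypothetical traffic classes and capture probabilities (assumptions).
CAPTURE_PROBABILITY = {"suspicious": 1.0, "normal": 0.1, "bulk": 0.01}

def should_capture(traffic_class, under_investigation, rng):
    """Decide whether to capture one packet given its current context.

    While an investigation is active, everything is captured; otherwise
    the probability comes from the per-class table (unknown classes: 0).
    """
    if under_investigation:
        probability = 1.0
    else:
        probability = CAPTURE_PROBABILITY.get(traffic_class, 0.0)
    # rng.random() is in [0, 1), so 1.0 always captures and 0.0 never does.
    return rng.random() < probability
```

The same packet class can therefore be captured in full during an incident and only sparsely at other times, without changing the capture code itself.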


Conventional Multicore


Generic Multicore Design


Generic Multicore Capture

• 2 roles: manager and core

• traditional parallel processing: message passing or shared memory 05 06

05 M.Aldinucci+2 "FastFlow: Efficient Parallel Streaming Applications on Multi-core" U.Pisa Techreport (2009)

06 R.Brightwell "Workshop on Managed Many-Core Systems" 1st Managed Many-Core Systems (2008)


Conventional Shortcomings

Reality is that traditional parallel processing designs are extremely inefficient on multicore

• overhead from parallelization is too high

• unit of processing is too small

• streamline designs are rare, but have recently been discussed in BigData 08

The solution is to use a lockfree (message-less) parallelization design

08 R.Chen+2 "Tiled-MapReduce: Optimizing Resource Usages ... on Multicore with Tiling" 19th PACT (2010)


Conventional → Proposed

• spawn, but don't wait to merge

• collect results from cores continuously to avoid lumps

• get used to not being able to communicate with cores (no messages)
  ◦ relatively short tasks diminish this effect 02

02 myself+0 "Experiments with Practical On-Demand Multi-Core Packet Capture" APNOMS (2013)


Proposal : the New Multicore


Proposal : Mission Statement

Proposal Components

• lockfree design

• tasks-into-cores packing problem and optimization

• implementation that supports lockfree design

• remember: the easiest way to aggregate traffic is to use IP address prefixes

• again, generic, so we do not care about the contents


Proposal : Shared Memory

• communication happens over shared memory 04

• C/C++ implementation is common, but the design will work in other languages as well

• shared memory is persistent, but cores come and go

04 M.Kerrisk "The Linux Programming Interface" No Starch Press (2010)
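A minimal sketch of the "persistent memory, transient cores" idea, written in Python rather than C/C++ for brevity. It assumes a hypothetical layout with one fixed-size slot per core, so each core writes only to its own slot and the manager only reads; the slot format and function names are illustrative, not the author's implementation. For simplicity the "cores" here are plain function calls that attach by name, which is what a separate process would also do.

```python
import struct
from multiprocessing import shared_memory

# Hypothetical slot layout: (packets_seen, bytes_seen) as two unsigned 64-bit ints.
SLOT = struct.Struct("<QQ")

def create_region(n_cores):
    """Manager side: allocate one slot per core and zero the region."""
    region = shared_memory.SharedMemory(create=True, size=SLOT.size * n_cores)
    region.buf[: SLOT.size * n_cores] = bytes(SLOT.size * n_cores)
    return region

def core_report(region_name, core_id, packets, nbytes):
    """Core side: attach by name, overwrite only this core's slot, detach."""
    shm = shared_memory.SharedMemory(name=region_name)
    try:
        SLOT.pack_into(shm.buf, core_id * SLOT.size, packets, nbytes)
    finally:
        shm.close()  # the core goes away; the region itself stays

def manager_collect(region, n_cores):
    """Manager side: read every slot; no message is sent to any core."""
    return [SLOT.unpack_from(region.buf, i * SLOT.size) for i in range(n_cores)]
```

Because every slot has a single writer, no lock is taken in this sketch. Whether a multi-field write is observed atomically by the reader is platform dependent and is not addressed here.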


Proposal : DLL is Key

DLL stands for Doubly Linked List

• common in C/C++ designs

• extremely flexible -- you can swap elements by reassigning pointers

• sideways DLL is a method to handle collisions in hashing
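A minimal sketch of a hash table whose buckets are doubly linked chains, which is one reading of the "sideways DLL" on this slide. The class and method names are hypothetical. Removing an entry only reassigns the neighbours' pointers, which is the flexibility the slide refers to.

```python
class Node:
    """One entry in a sideways chain."""
    __slots__ = ("key", "value", "prev", "next")

    def __init__(self, key, value):
        self.key, self.value = key, value
        self.prev = self.next = None

class ChainedTable:
    def __init__(self, n_buckets=8):
        self.buckets = [None] * n_buckets  # head node of each chain

    def _index(self, key):
        return hash(key) % len(self.buckets)

    def _find(self, key):
        node = self.buckets[self._index(key)]
        while node is not None and node.key != key:
            node = node.next
        return node

    def put(self, key, value):
        node = self._find(key)
        if node is not None:
            node.value = value  # key already present: update in place
            return
        i = self._index(key)
        new = Node(key, value)
        new.next = self.buckets[i]  # link the new node in at the head
        if new.next is not None:
            new.next.prev = new
        self.buckets[i] = new

    def get(self, key):
        node = self._find(key)
        return None if node is None else node.value

    def remove(self, key):
        node = self._find(key)
        if node is None:
            return False
        if node.prev is not None:
            node.prev.next = node.next
        else:
            self.buckets[self._index(key)] = node.next  # node was the head
        if node.next is not None:
            node.next.prev = node.prev
        return True
```

With a single bucket every key collides, yet lookups and removals still work because colliding entries simply sit side by side in the chain.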


Optimization


Optimization Targets

• few cores, many data units

• need to pack the latter into the former

• moreover: a scheduling problem, which is packing but along the timeline

• moreover (2): when packing, do you randomize the input (hashing) or not


Prefix Packing Problem

$$\text{minimize } w_1\,\mathrm{count}(P) + w_2 \max(M) + w_3\,\mathrm{var}(C) \quad \text{subject to } k_1 < p_i < k_2 \;\; \forall\, p_i \in P$$

[Figure: a 32-bit hashkey with the effective prefix range between k1 (shortest) and k2 (longest); prefixes p1 to p8 packed into Core0, Core1, Core2, ...; m (max) and n are marked]

• prefix length is between k1 and k2
  ◦ hashkey or raw
  ◦ fixed in each run in this paper

• pi is a pack (group) of items

• n total items, mapped to set M of prefixes in each of m cores

• C is a set of item counts c across prefixes
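A minimal sketch of how the weighted objective could be evaluated for one candidate packing. The slide leaves M and C loosely defined, so this sketch assumes that max(M) is the largest number of prefixes on any core and that var(C) is the population variance of item counts across prefixes; both readings, and the function names, are assumptions.

```python
import ipaddress
from statistics import pvariance

def within_range(prefix, k1, k2):
    """True if the prefix length satisfies k1 < length < k2."""
    return k1 < ipaddress.ip_network(prefix).prefixlen < k2

def packing_cost(cores, item_counts, w1=1.0, w2=1.0, w3=1.0):
    """Weighted cost of one packing.

    cores: one list of prefixes per core.
    item_counts: mapping from prefix to the number of items in it.
    """
    prefixes = [p for core in cores for p in core]
    count_p = len(set(prefixes))                            # count(P)
    max_m = max(len(core) for core in cores)                # max(M), assumed reading
    var_c = pvariance([item_counts[p] for p in prefixes])   # var(C), assumed reading
    return w1 * count_p + w2 * max_m + w3 * var_c
```

For example, three prefixes holding 1, 3 and 5 items, packed two-and-one onto two cores with unit weights, cost 3 + 2 + 8/3.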


Prefix Packing GA Heuristic

• Genetic Algorithm (GA) 12

• chromosome is a tuple of prefixes packed into one core

$$g_i = \langle p_{i,1}, p_{i,2}, \ldots, p_{i,m} \rangle \quad (1)$$

• one gene (whole solution) is a tuple containing all chromosomes

$$G_j = \langle g_1, g_2, \ldots, g_n \rangle \quad (2)$$

12 D.Knysh+1 "Parallel Genetic Algorithms: a Survey and Problem State of the Art" IJCSS (2010)
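A minimal sketch of a GA loop for packing prefixes into cores, under simplifying assumptions. Internally a solution is stored as a flat list (prefix index to core id) so that one-point crossover is trivial, and `to_chromosomes` converts it into the per-core tuples described on this slide. The fitness here is only the largest per-core item load, a stand-in for the full weighted objective; the population size, generation count and operators are hypothetical choices, not the author's settings.

```python
import random

def to_chromosomes(assign, n_cores, prefixes):
    """Group prefixes by core: one tuple of prefixes per core."""
    return tuple(
        tuple(p for p, c in zip(prefixes, assign) if c == core)
        for core in range(n_cores)
    )

def max_load(assign, loads, n_cores):
    """Fitness (lower is better): the largest total item count on any core."""
    per_core = [0] * n_cores
    for load, core in zip(loads, assign):
        per_core[core] += load
    return max(per_core)

def evolve(loads, n_cores, pop_size=30, generations=200, seed=0):
    """Evolve an assignment of len(loads) >= 2 prefixes to n_cores cores."""
    rng = random.Random(seed)
    n = len(loads)
    pop = [[rng.randrange(n_cores) for _ in range(n)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=lambda a: max_load(a, loads, n_cores))
        elite = pop[: pop_size // 2]           # elitism: the best half survives
        children = []
        while len(elite) + len(children) < pop_size:
            a, b = rng.sample(elite, 2)
            cut = rng.randrange(1, n)          # one-point crossover
            child = a[:cut] + b[cut:]
            child[rng.randrange(n)] = rng.randrange(n_cores)  # mutation
            children.append(child)
        pop = elite + children
    return min(pop, key=lambda a: max_load(a, loads, n_cores))
```

Because the best half is always kept, the best fitness never gets worse from one generation to the next, although the loop gives no guarantee of reaching the optimum.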


Analysis


Analysis Setup

• actual packet traces -- trace-based simulation 16

• input: 2 cases -- hashing versus raw

• items are individual packets
  ◦ packets are packed into prefixes
  ◦ prefixes are packed into cores

• optimization uses the above GA heuristic

16 myself "MAWI Working Group Traffic Archive" http://mawi.wide.ad.jp/mawi (2014)
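A minimal sketch of the first packing step, packets into prefixes, for the two input cases. The raw case uses the top k bits of the IPv4 address; the hashed case uses the top k bits of a 32-bit hash of the address. CRC32 is used here only as a convenient 32-bit hash and is an assumption, since the slides do not name the hash function.

```python
import ipaddress
import zlib
from collections import Counter

def raw_key(addr):
    """32-bit key taken directly from the IPv4 address."""
    return int(ipaddress.IPv4Address(addr))

def hashed_key(addr):
    """32-bit key from a hash of the address (CRC32 is an assumed choice)."""
    return zlib.crc32(ipaddress.IPv4Address(addr).packed)

def prefix_of(key, k):
    """Top k bits of a 32-bit key, with 1 <= k <= 32."""
    return key >> (32 - k)

def pack_into_prefixes(addrs, k, key_fn):
    """Count items (packets) per prefix of length k."""
    return Counter(prefix_of(key_fn(a), k) for a in addrs)
```

With raw keys, neighbouring addresses land in the same prefix; with hashed keys they are scattered, which changes how many unique prefixes have to be packed into cores.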


Analysis (1) Cores

[Figure: log(max item count / core), ranging from 4.6 to 5.5, versus time sequence (0 to 9), with one curve for each configuration from 1 core to 7 cores]


Analysis (2) Hashing

[Figure: number of unique prefixes (0 to 240) versus increasing cutoff parameter (0 to 1), for hashed and raw input]


Forensics 2.0


Forensics 2.0

• reporting part: let's use sketches from data streaming 11

[Figure: Core 1 ... Core X and a TABID Manager over Shared Memory (read/prepare); a Big Data Timeline with a Cursor at Now (replay) and a Time Direction; Start/End marks delimit intervals, each labeled One Sketch]

11 M.Sung+3 "Scalable and Efficient Data Streaming Algorithms for Detecting Common Content..." ICDE (2006)
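A minimal sketch of keeping one sketch per time interval, using a Count-Min sketch as a stand-in. The slides do not say which sketch is used, so Count-Min, the table dimensions and the hashing scheme are all assumptions chosen for brevity. Sketches of equal dimensions can be merged by addition, which lets adjacent intervals on the timeline be combined.

```python
import hashlib

class CountMinSketch:
    """Approximate per-key counters in fixed memory; never underestimates."""

    def __init__(self, width=256, depth=4):
        self.width, self.depth = width, depth
        self.rows = [[0] * width for _ in range(depth)]

    def _cells(self, item):
        # One salted hash per row selects one column in that row.
        for row in range(self.depth):
            digest = hashlib.blake2b(
                item.encode(), digest_size=8, salt=bytes([row])
            ).digest()
            yield row, int.from_bytes(digest, "big") % self.width

    def add(self, item, count=1):
        for row, col in self._cells(item):
            self.rows[row][col] += count

    def estimate(self, item):
        return min(self.rows[row][col] for row, col in self._cells(item))

    def merge(self, other):
        """Add another sketch of the same dimensions into this one."""
        if (self.width, self.depth) != (other.width, other.depth):
            raise ValueError("sketch dimensions differ")
        for mine, theirs in zip(self.rows, other.rows):
            for col, value in enumerate(theirs):
                mine[col] += value
```

A report for a longer period can then be produced by merging the per-interval sketches instead of replaying the raw traffic.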


Wrapup

• a natively multicore technology is proposed

• performance is optimized using a packing heuristic

• raw input is found to be preferable to randomization

• future topics:
  1. variable-length prefixes
  2. optimization along the timeline
  3. jitter minimization (fewer reassignments)
  4. further lookup optimization -- fast hashing


That’s all, thank you ...


[01] myself+0 "...community-based architecture for measuring E2E QoS at DCc" IJCSE (2013)

[02] myself+0 "Experiments with Practical On-Demand Multi-Core Packet Capture" APNOMS (2013)

[03] myself+1 "A Graphical Method for Detection of Flash Crowds in Traffic" Telecom. Systems (TM) (2013)

[04] M.Kerrisk "The Linux Programming Interface" No Starch Press (2010)

[05] M.Aldinucci+2 "FastFlow: Efficient Parallel Streaming Applications on Multi-core" U.Pisa Techreport (2009)

[06] R.Brightwell "Workshop on Managed Many-Core Systems" 1st Managed Many-Core Systems (2008)

[07] X.Sui+3 "Parallel Graph Partitioning on Multicore Architectures" 23rd LCPC (2010)

[08] R.Chen+2 "Tiled-MapReduce: Optimizing Resource Usages ... on Multicore with Tiling" 19th PACT (2010)

[09] I.Machdi+2 "Executing parallel TwigStack algorithm on a multi-core system" 11th IIWAS (2009)

[10] S.Stoichev+1 "Parallel Algorithm for Integer Sorting with Multi-Core Processors" IT and Control (2009)

[11] M.Sung+3 "Scalable and Efficient Data Streaming Algorithms for Detecting Common Content..." ICDE (2006)

[12] D.Knysh+1 "Parallel Genetic Algorithms: a Survey and Problem State of the Art" IJCSS (2010)

[13] Luca Deri "Modern Packet Capture and Analysis: Multi-Core, Multi-Gigabit, and Beyond" IM (2009)

[14] myself "MCoreMemory project page" https://github.com/maratishe/mcorememory (2014)

[15] myself "Rings-on-Cores project page" https://github.com/maratishe/ringsNcores (2013)

[16] myself "MAWI Working Group Traffic Archive" http://mawi.wide.ad.jp/mawi (2014)


Extras (1) Per-Unit Cost

[Figure: increasing per-unit cost across Stage 1, Stage 2 and Stage 3; labels: Hashing, Manager, Prefix Matching, Cores that do not match, Process]


Extras (2) Shared Memory Trick
