Network Measurement and Monitoring - Assignment 1, Group 3, "Classification"

Upload: valentin-thirion

Post on 29-Jun-2015

DESCRIPTION

Created by Patrick Herbeuval and Valentin Thirion

TRANSCRIPT

Classification

Patrick Herbeuval University of Liège

1st Master in Computer Science

Valentin Thirion University of Liège

1st Master in Computer Science

Networking measurements and monitoring

1st assignment: Oral Presentation

Teacher: B. DONNET

Plan

I. Introduction

Four papers:

II. Early Application Identification

III. Multilevel classifier: BLINC

IV. Statistical: The ADSL Case

V. Application specific: Skype

VI. Comparative

VII. Conclusion

I - Introduction

The Internet is used more and more today

We want the network to remain comfortable to use

The quality of service demanded by consumers increases as fast as applications consume more bandwidth

ISPs, companies and universities want to ban P2P

Port-based classifiers worked well years ago but are quite ineffective now

Why classify?

Classification is a key issue for today’s network administrators and companies, for the following reasons:

• Improve the network infrastructure

• Ban undesired traffic

• Protect the network against potential attacks

• Global knowledge of trends

How to classify?

Deep Packet Inspection (DPI): a very precise technique, but with many drawbacks:

Huge computation power needed

Ineffective if packets are encrypted

Continuous need for database updates

Statistical analysis

Social

II - Early Application Identification

Goal: identify the application from the first few packets

Advantage: knowing the kind of traffic from the start makes it possible to block or redirect it

DPI consumes too many resources, and flows must have ended before they can be analysed

Statistical methods use mean packet sizes, durations, …; these values are not available after only the first few packets

Clustering the flows

Techniques used: K-Means, Gaussian Mixture Model, spectral clustering

Values used:

Size of the first few packets

Duration of the first few packets (negotiation phase)

Data set

4 packet traces:

3 from a university network

1 from an enterprise network

Keep only TCP packets, discarding those that belong to flows which began before the trace capture started

Features analysed: need for an efficient metric

Size and direction of the first 4 packets

The range of these values is very similar across traces (see the graph on the next slide)

Size & Direction

Classification, 2 phases

Training phase: offline at management sites

Apply clustering techniques to samples of TCP connections for all target applications

Creation of a spatial representation based on the sizes of the first P packets (vector of P dimensions or HMM)

Then group applications that show the same behaviour

Best results: 40 clusters and the first 4 packets

Creation of two sets:

One with the description of each cluster

One with the applications present in each cluster

Classification, 2 phases

Classification phase: online at management hosts

Extract the 5-tuple and analyse the packet sizes in both directions

With these sizes, the assignment module associates the connection with a cluster

Given the cluster, the labelling module selects the application associated with the connection
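
The two phases can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes plain K-Means in Euclidean space, encodes direction as the sign of the packet size, and labels a cluster with its most frequent application. All function names and the toy flows are hypothetical.

```python
from collections import Counter

P = 4  # number of leading packets used as features (the slides report 4 as best)

def features(packets):
    """Signed sizes of the first P packets: + client->server, - server->client."""
    return [size if direction == "c2s" else -size for size, direction in packets[:P]]

def dist2(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nearest(centroids, vec):
    """Index of the closest centroid (plays the role of the assignment module)."""
    return min(range(len(centroids)), key=lambda i: dist2(centroids[i], vec))

def kmeans(vectors, k, iterations=20):
    """Plain K-Means, seeded with the first k vectors to stay deterministic."""
    centroids = [list(v) for v in vectors[:k]]
    for _ in range(iterations):
        groups = [[] for _ in range(k)]
        for v in vectors:
            groups[nearest(centroids, v)].append(v)
        for i, group in enumerate(groups):
            if group:  # keep the old centroid if a cluster went empty
                centroids[i] = [sum(col) / len(group) for col in zip(*group)]
    return centroids

def train(labelled_flows, k):
    """Offline phase: cluster descriptions plus the applications seen in each cluster."""
    vectors = [features(pkts) for pkts, _ in labelled_flows]
    centroids = kmeans(vectors, k)
    apps = [Counter() for _ in range(k)]
    for (_, app), vec in zip(labelled_flows, vectors):
        apps[nearest(centroids, vec)][app] += 1
    return centroids, apps

def classify(packets, centroids, apps):
    """Online phase: assign the flow to a cluster, then label it with the dominant app."""
    cluster = nearest(centroids, features(packets))
    return apps[cluster].most_common(1)[0][0] if apps[cluster] else "unknown"

# Toy training set (made-up sizes, for illustration only)
training = [
    ([(100, "c2s"), (1400, "s2c"), (1400, "s2c"), (60, "c2s")], "web"),
    ([(40, "c2s"), (90, "s2c"), (30, "c2s"), (50, "s2c")], "mail"),
    ([(110, "c2s"), (1380, "s2c"), (1420, "s2c"), (70, "c2s")], "web"),
    ([(45, "c2s"), (85, "s2c"), (35, "c2s"), (55, "s2c")], "mail"),
]
centroids, apps = train(training, k=2)
print(classify([(105, "c2s"), (1390, "s2c"), (1410, "s2c"), (65, "c2s")], centroids, apps))
```

The sketch only looks at the first P packets of a flow, so the label is available as soon as those packets have been seen.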

Evaluation & Conclusion

Evaluation:

Assignment accuracy: above 95% for all heuristics

Labelling accuracy: between 85% and 98%

The size of the first few packets is a good metric

The HMM representation is richer, but its clustering quality is comparable with the Euclidean one

GMM clustering combined with TCP ports classifies over 98% of known applications

Limitation: needs the first 4 packets in the correct order

Heuristic: (Wikipedia) Where the exhaustive search is impractical (NP-complete for instance), heuristic methods are used to speed up the process of finding a satisfactory solution.

III – The BLINC Classifier

Stands for BLINd Classification

Avoids reading the whole content of the packet:

Privacy, performance, encrypted packets

3 levels of classification:

Social level

Functional level

Application level

The Social level

Finding host communities:

Client-server, P2P, …

Analyse these communities:

Perfect match: likely malicious

Partial overlap: P2P sources, websites, gaming, …

Partial overlap within the same subnet: server farms
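
As a rough illustration of the community analysis, the sketch below compares the destination sets of two source hosts. It is a simplification under stated assumptions: BLINC works on larger host communities, and the /24 subnet prefix, the function names and the return labels are all hypothetical.

```python
import ipaddress

def same_subnet(addresses, prefix=24):
    """True if all addresses fall in one subnet (prefix length is an assumption)."""
    nets = {ipaddress.ip_network(f"{a}/{prefix}", strict=False) for a in addresses}
    return len(nets) == 1

def social_hint(dests_a, dests_b):
    """Compare the destinations contacted by two sources."""
    a, b = set(dests_a), set(dests_b)
    if not a & b:
        return "none"      # no shared destinations
    if a == b:
        return "perfect"   # perfect match: likely malicious
    if same_subnet(a | b):
        return "farm"      # partial overlap within one subnet: server farm
    return "partial"       # partial overlap: P2P sources, websites, gaming, ...

print(social_hint(["10.0.0.1", "10.0.0.2"], ["10.0.0.2", "10.0.0.3"]))
```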

The Social level (2)

The functional level

Determine whether a host offers a service, uses it, or both

Depends mostly on the range of ports used by the host

Works better when a host is connected to many servers

Typical schemes:

HTTP server: 1-2 ports

P2P: many ports (up to 1 per host)

Mail server: depends on the services available
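
The port-range idea can be sketched as follows. This is a hedged illustration, not BLINC's actual rule set: the flow tuple layout, the `max_server_ports` threshold and the role labels are assumptions made for the example.

```python
from collections import defaultdict

def port_profile(flows):
    """Per source host: (number of distinct source ports, number of distinct peers)."""
    ports, peers = defaultdict(set), defaultdict(set)
    for src_ip, src_port, dst_ip, _dst_port in flows:
        ports[src_ip].add(src_port)
        peers[src_ip].add(dst_ip)
    return {host: (len(ports[host]), len(peers[host])) for host in ports}

def functional_role(n_ports, n_peers, max_server_ports=2):
    """Very rough role guess; the threshold is an illustrative assumption."""
    if n_ports <= max_server_ports and n_peers > n_ports:
        return "server-like"
    return "client-like or P2P"

# Toy flows: (src_ip, src_port, dst_ip, dst_port)
flows = [
    ("10.0.0.1", 80, "10.0.1.1", 40001),
    ("10.0.0.1", 80, "10.0.1.2", 40002),
    ("10.0.0.1", 80, "10.0.1.3", 40003),
    ("10.0.0.9", 5001, "10.0.2.1", 6881),
    ("10.0.0.9", 5002, "10.0.2.2", 6881),
    ("10.0.0.9", 5003, "10.0.2.3", 6881),
]
for host, (n_ports, n_peers) in sorted(port_profile(flows).items()):
    print(host, n_ports, n_peers, functional_role(n_ports, n_peers))
```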

The application level

Uses the connection’s 4-tuple (plus possibly other characteristics)

Creates a model for every application type

Models are represented by small graphs called "graphlets"

BLINC: Results

Uses 2 metrics to evaluate the classifier:

Completeness (% of traffic classified)

Accuracy (% of traffic correctly classified)

Some parameters can be used to tune the classifier:

Changing a threshold can improve the results for one of the metrics but significantly degrade the other
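
A minimal sketch of the two metrics, counted per flow. One assumption is made loudly here: accuracy is computed over the classified flows only, which is one common reading of "% correctly classified"; the slides do not spell out the denominator. The function name and toy data are hypothetical.

```python
def completeness_and_accuracy(predictions, truth):
    """predictions: flow id -> label, or None when the classifier gave no answer.
    truth: flow id -> reference label."""
    classified = {f: lab for f, lab in predictions.items() if lab is not None}
    completeness = len(classified) / len(predictions)
    correct = sum(1 for f, lab in classified.items() if truth[f] == lab)
    # Assumption: accuracy is measured over classified flows only
    accuracy = correct / len(classified) if classified else 0.0
    return completeness, accuracy

predictions = {1: "web", 2: "p2p", 3: None, 4: "mail"}
truth = {1: "web", 2: "web", 3: "p2p", 4: "mail"}
print(completeness_and_accuracy(predictions, truth))
```

A classifier that only answers when it is very sure scores high on accuracy and low on completeness, which is the trade-off the tuning thresholds control.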

Global results

GN: Genome campus (~1,000 users), UN: university network (~20,000 users)

Tuning

Td: minimum number of destination IPs needed to classify the flow as P2P
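
The effect of the threshold can be illustrated with a simplified rule that flags a host as a P2P candidate once it has contacted at least Td distinct destination IPs. This is only a sketch of the thresholding idea; the real BLINC heuristic combines it with other tests, and `p2p_candidates` is a hypothetical helper.

```python
from collections import defaultdict

def p2p_candidates(flows, td):
    """flows: (src_ip, dst_ip) pairs. Returns hosts with at least td distinct destinations."""
    dests = defaultdict(set)
    for src_ip, dst_ip in flows:
        dests[src_ip].add(dst_ip)
    return {host for host, seen in dests.items() if len(seen) >= td}

flows = [("a", "x"), ("a", "y"), ("a", "z"), ("b", "x"), ("b", "y"), ("c", "x")]
for td in (2, 3):
    print(td, sorted(p2p_candidates(flows, td)))
```

Lowering Td flags more hosts, so more traffic gets classified, at the risk of labelling ordinary clients as P2P.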

Results (2)

Good detection rate without reading a single byte of the payload:

Flows without payload are classified as well

Encryption of the payload is not a problem

Low resource consumption

Good detection of unknown flows

Difficult to distinguish applications of the same type (e.g. all VoIP protocols are grouped together)

Does not work if the headers are encrypted

Hard to identify multiple sources behind NATs

Results were obtained at the edge of the network; the classifier may behave differently in the backbone

BLINC: Conclusion

BLINC has a good detection rate at a low processing cost and without being intrusive

It can detect attacks and unknown protocols

It can be improved in some situations

IV – The ADSL Case

Test a statistical classifier on sites other than the ones it was trained on

Dataset:

4 packet traces collected at 3 different ADSL POPs from Orange

2 traces captured at the same time at different locations

2 traces captured at the same location, 17 days apart

Reference used: ODT (Orange DPI Tool, provided by Orange)

Classification methodology

3 algorithms used to classify the traces:

Naïve Bayes Kernel Estimation

Bayesian Network

C4.5 Decision Tree

Traces analysed with two feature sets:

SET_A: Packet Level Information

SET_B: Flow Level Statistics

3 filters:

S/S: flows with a 3-way handshake

S/S+4D: same as S/S + at least 4 data packets

S/S+F/R: same as S/S + FIN or RST flag at the end
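
The three filters can be written as predicates over a flow summary. The sketch assumes a hypothetical flow record (a dict with boolean flag fields and a data packet count); the field names are not taken from the paper.

```python
def passes_filter(flow, name):
    """Return True if the flow summary satisfies the named filter."""
    handshake = flow["syn"] and flow["syn_ack"] and flow["ack"]
    if name == "S/S":
        return handshake
    if name == "S/S+4D":
        return handshake and flow["data_packets"] >= 4
    if name == "S/S+F/R":
        return handshake and (flow["fin"] or flow["rst"])
    raise ValueError(f"unknown filter: {name}")

# Hypothetical flow: complete handshake, 3 data packets, closed with FIN
flow = {"syn": True, "syn_ack": True, "ack": True,
        "data_packets": 3, "fin": True, "rst": False}
for name in ("S/S", "S/S+4D", "S/S+F/R"):
    print(name, passes_filter(flow, name))
```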

Classification, 2 cases

Static case: classification on each site independently

Ideal number of packets: 4

Accuracy: about 90%

Very good classification of WEB and EDONKEY flows

Cross-site case:

SET_A: EDONKEY results are unaffected; spatial similarity seems to matter more than temporal similarity

Classifier very sensitive to the context in which it is trained

MAIL is often mistaken for FTP because of similar packet sizes

Using the port number improves the quality of the results

Classification, 2 cases (continued)

SET_B: some degradation

Focus on a single feature: port number

Results are the opposite of the static case

Predicting traffic that uses non-legacy ports is not effective

Due to the heavy hitters (typically P2P)

Global results: the C4.5 algorithm is the best in terms of overall accuracy in almost all cases (static + cross-site)

Degradation: C4.5 is comparable with the other algorithms (≤17%)

Data overfitting problem

Unknown class + Conclusion

Looking at the flows marked as unknown:

3-way handshake

Apply the classifiers and obtain a confidence level; this value is then compared with the one returned by C4.5

Useful to detect malicious traffic and P2P

Should be integrated into an existing DPI tool

Conclusion:

Statistical tools are very useful to identify unknown traffic

Good performance when used on the same site as the training

Can detect applications among protocols

Really suffers from data overfitting (same behaviour from different apps)

Strength of this analysis: it uses commercial traffic, which is very diverse

V – Skype case

We want to detect Skype traffic

It’s already possible to detect VoIP traffic with other classifiers, but how can Skype be distinguished from other VoIP traffic?

Skype is a closed, encrypted protocol that has to be analysed before classification can start

Skype model

Skype traffic characteristics were identified in a controlled environment

2 kinds of connections: E2E and E2O

E2E: End 2 End, Skype to Skype

E2O: End 2 Out, Skype to telephone network

Skype works on TCP and UDP

Skype can carry text, voice, video and files:

Everything is multiplexed in 1 packet

Here, only voice traffic is considered

Skype SoM

TCP packets are entirely encrypted and cannot be analysed

UDP packets carry a small unencrypted header, called Start of Message (SoM)

E2E: id and message type (signaling or data)

E2O: unique connection identifier

Skype also always uses the same UDP port number (12340)

Classifiers

Chi-Square Classifier (CSC):

Based on the randomness of bits in packets

Does not work on TCP, since encrypted packets appear completely random

Naive Bayes Classifier (NBC):

Real-time voice protocol classifier

Based on message size (which depends on the audio codec) and on average inter-packet gap

Used on a short window of samples to cope with variability in packet size

Payload-based classifier:

Used in the controlled environment to check whether CSC and NBC work well
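
The randomness test behind the CSC can be sketched as a chi-square statistic computed per group of bits across the packets of one flow. Values near zero suggest a uniformly random (encrypted) group, while very large values suggest a fixed field such as the SoM. The 4-bit group size, the `n_bytes` parameter and the function name are assumptions for illustration, and every payload is assumed to hold at least `n_bytes` bytes.

```python
from collections import Counter

def chi_square_per_group(payloads, n_bytes=4):
    """Chi-square statistic against a uniform distribution, one value per 4-bit group."""
    n = len(payloads)
    expected = n / 16  # 16 possible values per 4-bit group
    stats = []
    for byte_index in range(n_bytes):
        for shift in (4, 0):  # high nibble first, then low nibble
            counts = Counter((p[byte_index] >> shift) & 0x0F for p in payloads)
            stats.append(sum((counts.get(v, 0) - expected) ** 2 / expected
                             for v in range(16)))
    return stats

# Toy flow: first byte constant (like a fixed header field), second byte evenly spread
payloads = [bytes([0x0D, i]) for i in range(256)]
print(chi_square_per_group(payloads, n_bytes=2))
```

In this toy flow the two groups of the constant byte score 3840 each and the two groups of the evenly spread byte score 0, which is the contrast the classifier relies on.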

Experiments

NBC detects all kinds of VoIP traffic

CSC detects all kinds of Skype traffic

Using both of them should detect Skype voice traffic

Results

Very low false positive rate

Higher false negative rate

Skype: Conclusion

Skype is hard to classify because its encryption makes the protocol hard to analyse

But with this classifier, we have good results on UDP

The false positive rate is almost zero, which is good if the ISP wants to prioritise Skype traffic

The false negative rate is higher, but this is not really a problem as long as the ISP does not want to block Skype

VI - Comparative

All these classifiers have good results, but each of them has its strengths and weaknesses

ADSL needs site-specific training, but has the best detection rate

BLINC and Early are less precise but more flexible:

They are also faster and good at detecting attacks

BLINC detects unknown protocols but cannot tell applications of the same type apart

Early needs the first 4 packets in order, ADSL the 3-way handshake

Skype is more specific and cannot be compared directly:

Good false positive rate but higher false negative rate

VII – Conclusion

We now have solutions that can replace DPI

Each classifier is good in its domain:

Important network: early app detection (detect attacks soon)

ADSL and commercial: statistical (user trends, adapt infrastructure)

University or academy: BLINC (statistics, trends)

Wherever Skype traffic needs to be prioritised: Skype classifier

Remarks:

Traces and classifiers are quite old (4 to 6 years)

What about mobile usage? Multimedia over 3G/4G networks?

Thanks for your attention

Any questions?

References:

T. Karagiannis, K. Papagiannaki, M. Faloutsos. BLINC: Multilevel Traffic Classification in the Dark. In Proc. ACM SIGCOMM. August 2005.

L. Bernaille, R. Teixeira, K. Salamatian. Early Application Identification. In Proc. ACM CoNEXT. December 2006.

M. Pietrzyk, J.-L. Costeux, G. Urvoy-Keller, T. En-Najjary. Challenging Statistical Classification for Operational Usage: the ADSL Case. In Proc. ACM/USENIX Internet Measurement Conference (IMC). November 2009.

D. Bonfiglio, M. Mellia, M. Meo, D. Rossi, P. Tofanelli. Revealing Skype Traffic: When Randomness Plays with You. In Proc. ACM SIGCOMM. August 2007.