A FRAMEWORK FOR DETECTING MALFORMED SMS ATTACK

M Zubair Rafique, Muhammad Khurram Khan, Khaled Alghathbar, Muddassar Farooq

The 8th FTRA International Conference on Secure and Trust Computing, data management, and Applications (STA 2011)

A.C. Chen 2012/07/23 @ ADL


Outline

Introduction
Malformed message detection framework
Evaluation and experimental results
Conclusion


SMS Delivery Process

[Figure: SMS delivery path through the network, with the message labeled SMS_SUBMIT on the sending side and SMS_DELIVER on the receiving side]

BSC: Base Station Controller
MSC: Mobile Switching Center
GMSC: Gateway MSC
IWMSC: Interworking MSC

Short Message Service ( SMS )

Messages sent to or from a mobile phone are first routed to an intermediate component called the Short Message Service Center (SMSC)

The SMS message exists in 2 formats
• SMS_SUBMIT: mobile phone to SMSC
• SMS_DELIVER: SMSC to mobile phone

GSM Modem

An SMS received on a mobile phone is handled through the GSM modem
The modem provides an interface between the GSM network and the application processor of a smartphone
The modem is controlled through standardized AT commands

[Figure: Apps, Telephony Stack and Modem. The telephony stack sends AT commands to the modem and receives AT result codes from it.]
• Modem: responsible for cellular communications
• Telephony stack: responsible for the communication between the application processor and the modem

Example: SMS_DELIVER

[Figure: an SMS_DELIVER as reported by the modem. First line: AT result code plus the length of the SMS. Second line: the complete SMS string in hex.]

Malformed SMS attack

A malformed SMS can cause the application processor to reach an undefined state
• Significant processing delays
• Unauthorized access
• Denying legitimate users access
• …

[Figure: Apps, Telephony Stack, Modem]

However, malformed message detection in mobile phones has received little attention

In this Paper…

A malformed message detection framework is proposed
• It automatically extracts novel syntactical features to detect a malformed SMS at the access layer of mobile phones


Common Idea

Anomalies are deviations from a learnt normal model [Patrick Düssel, et al.]
Learning → Normal model → Anomaly detection
Supported by our pilot studies
• The distance values of malformed messages are normally greater than those of benign messages

SMS Detection Framework

Message Analyzer → Feature Extraction → Feature Selection → Classification

Message Analyzer

Message dissection
• Transforms incoming SMS messages into a format from which we can extract intelligent features
• Extracts the complete SMS message string, i.e. the second line of the AT result code
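The dissection step can be sketched as follows. This is only an illustration under stated assumptions: the modem is assumed to report an incoming message in PDU mode as a two-line `+CMT` unsolicited result code, and the function name `dissect` and the sample hex string are hypothetical. The slides themselves only say that the SMS string is the second line of the AT result code.

```python
import re

# Assumed header shape: "+CMT: [<alpha>],<length>" followed by the hex PDU line.
_HEADER = re.compile(r"^\+CMT:\s*[^,]*,\s*(\d+)\s*$")

def dissect(notification: str) -> tuple[int, str]:
    """Return (declared length, SMS hex string) from a two-line result code."""
    lines = [ln.strip() for ln in notification.splitlines() if ln.strip()]
    if len(lines) != 2:
        raise ValueError("expected one header line and one SMS string line")
    match = _HEADER.match(lines[0])
    if match is None:
        raise ValueError("unrecognized result code header")
    # The second line is handed on untouched; judging it is the detector's job.
    return int(match.group(1)), lines[1]

# Hypothetical notification; the hex string is a placeholder, not a real SMS.
length, sms_string = dissect("+CMT: ,24\r\n0791448720003023\r\n")
```

Only `sms_string` would be passed on to feature extraction in this sketch; the declared length is returned in case a later stage wants to compare it with the actual string.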


Extraction of String Features

Mine features from an incoming SMS message
• Exploit the properties of a suffix tree
• Use a set of attribute strings to model the content of the incoming message
Entrenching function: extracts the (attribute, value) pairs from the suffix tree
• attribute: a feature string a
• value: the frequency of a, taken from the nodes of the suffix tree
Example: see "Example of a Suffix Tree" and "Example of Entrenching Function"

Raw Model Vectors

For the purpose of training, we prepared a training data set
• 𝛫: set of messages used for training, 𝛫 = {m1, …, mk}
After each mi passes through the entrenching function, we have our raw model
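One way to picture the raw model is as a matrix with one row per training message and one column per attribute string. The sketch below assumes each message has already been turned into a dictionary of (attribute, value) pairs; the name `build_raw_model` and the dense layout are illustrative choices, since the slides do not say how the raw model is stored.

```python
def build_raw_model(pair_vectors: list[dict[str, int]]) -> tuple[list[str], list[list[int]]]:
    """Align per-message (attribute, value) pairs on one shared attribute list."""
    # The column set is the union of every attribute seen in the training set.
    attributes = sorted(set().union(*pair_vectors))
    # An attribute that a message does not contain gets frequency 0.
    rows = [[pairs.get(a, 0) for a in attributes] for pairs in pair_vectors]
    return attributes, rows

# Two hypothetical messages after the entrenching function.
attributes, rows = build_raw_model([{"0": 2, "01": 1}, {"0": 1, "1": 1}])
```

The number of columns grows with every new attribute string, which is the high dimensionality that the feature selection stage is meant to reduce.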


Feature Selection

The high dimensionality of the raw model will result in large processing overheads
Remove redundant features having low classification potential
• Not at the cost of a high false alarm rate

Selection Techniques

Use 3 selection mechanisms to obtain 3 distinct model sets of attributes
• Information Gain (IG)
• Gain Ratio (GR)
• Chi Squared (CH)
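As an illustration of how two of these scores rank an attribute, the sketch below computes the textbook information gain and gain ratio of a discrete feature against class labels. It assumes labeled training messages and a feature reduced to presence (1) or absence (0); the label names and sample vectors are hypothetical, and Chi Squared is left out for brevity.

```python
import math
from collections import Counter, defaultdict

def entropy(values):
    """Shannon entropy, in bits, of a sequence of discrete values."""
    total = len(values)
    return -sum((n / total) * math.log2(n / total) for n in Counter(values).values())

def information_gain(feature, labels):
    """Reduction in label entropy after splitting on the feature's values."""
    groups = defaultdict(list)
    for value, label in zip(feature, labels):
        groups[value].append(label)
    remainder = sum(len(g) / len(labels) * entropy(g) for g in groups.values())
    return entropy(labels) - remainder

def gain_ratio(feature, labels):
    """Information gain normalized by the entropy of the feature itself."""
    split_info = entropy(feature)
    return information_gain(feature, labels) / split_info if split_info > 0 else 0.0

labels = ["benign", "benign", "malformed", "malformed"]
separating = [0, 0, 1, 1]  # hypothetical attribute seen only in malformed messages
constant = [1, 1, 1, 1]    # hypothetical attribute seen in every message
```

An attribute like `separating` scores highest under both measures, while `constant` scores zero and would be a candidate for removal.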


Distance/Divergence

For a given vector of pairs, compute the deviation (message score, distance) of the vector
Use 2 well-known distance measures to obtain the score
• Manhattan distance (md)
• Itakura-Saito Divergence (isd)
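The two measures can be sketched with their standard definitions. Two points here are assumptions rather than details from the slides: the trained model is represented as a single reference vector of attribute frequencies, and a small `eps` is added before the Itakura-Saito computation because that divergence is undefined for zero entries.

```python
import math

def manhattan(message, model):
    """Sum of absolute frequency differences over all attributes."""
    keys = message.keys() | model.keys()
    return sum(abs(message.get(k, 0) - model.get(k, 0)) for k in keys)

def itakura_saito(message, model, eps=1e-6):
    """Sum of p/q - ln(p/q) - 1 over all attributes, with eps smoothing."""
    keys = message.keys() | model.keys()
    total = 0.0
    for k in keys:
        p = message.get(k, 0) + eps
        q = model.get(k, 0) + eps
        total += p / q - math.log(p / q) - 1.0
    return total

# Hypothetical frequency vectors keyed by attribute string.
model = {"0": 1, "1": 2, "2": 1}
message = {"0": 2, "1": 2, "3": 1}
md_score = manhattan(message, model)
isd_score = itakura_saito(message, model)
```

Both functions return 0 for a message identical to the model and grow as the message's attribute frequencies drift away from it.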


Classification

Threshold value
• The largest distance score of a message in the training model
Raise an alarm
• If the distance score of an incoming SMS is greater than the threshold value
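The rule above is small enough to write out directly. The function names and the sample scores below are hypothetical; the scores stand in for whatever distance measure was used during training.

```python
def fit_threshold(training_scores):
    """Threshold is the largest distance score seen among training messages."""
    return max(training_scores)

def raises_alarm(score, threshold):
    """Flag an incoming SMS only when its score exceeds the threshold."""
    return score > threshold

threshold = fit_threshold([0.4, 1.2, 0.9])  # hypothetical training scores
```

Because the comparison is strict, an incoming message that scores exactly the threshold is not flagged in this sketch.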


Review

Training is only required in the beginning

[Figure: framework overview, with the message score compared against the threshold]


Evaluation

Collect a real-world dataset of SMS messages
≥ 5000 benign messages
• Developed a modem terminal interface to collect more than 5000 real-world benign SMS messages
≥ 5000 malformed messages
• SMS injection framework (Mulliner, C., et al., 2009)

Experimental Goal

To select the best feature selection technique and distance measure
3 feature selection modules
• Information Gain (IG)
• Gain Ratio (GR)
• Chi-squared (CH)
2 distance measures
• Manhattan distance (md)
• Itakura-Saito Divergence (isd)

Parameters and Definitions

Used 4 parameters to define the detection accuracy and the false alarm rate
• True Positive (TP), False Positive (FP), False Negative (FN), True Negative (TN)
• Detection Rate
• False Alarm Rate
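The formulas for the two rates did not survive in this transcript. The sketch below uses the standard definitions, DR = TP / (TP + FN) and FAR = FP / (FP + TN), on the assumption that the slide follows them. The counts are made up for illustration and are not results from the paper.

```python
def detection_rate(tp, fn):
    """Share of malformed messages that were flagged (standard definition)."""
    return tp / (tp + fn)

def false_alarm_rate(fp, tn):
    """Share of benign messages that were wrongly flagged (standard definition)."""
    return fp / (fp + tn)

# Hypothetical confusion counts, chosen only to show the arithmetic.
dr = detection_rate(tp=45, fn=5)
far = false_alarm_rate(fp=2, tn=48)
```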


Results: Receiver Operating Characteristic Curves

[Figures: ROC using Manhattan Distance; ROC using Itakura-Saito Divergence]

Results: Overheads

[Table: training and threshold calculation overheads (ms/100 SMS) and testing overheads (ms/1 SMS) using Information Gain, Gain Ratio and Chi-squared for Manhattan distance and Itakura-Saito Divergence. Callout on the table: "Provides the best performance"]

Average training time = 3.5 s/100 SMS
Average detection time of a malformed message = 10 ms


Conclusion

A real-time malformed message detection framework
• Tested on real datasets of SMS messages
• Successfully detects malformed messages with a detection accuracy of more than 98%
Future research will focus on further optimizing the framework and deploying it on real-world mobile devices and smartphones

Q & A


Example of a Suffix Tree

Extract feature strings from an incoming message m=0110223
The set of attribute strings is thus generated

[Figure: suffix tree of m]

Example of Entrenching Function

Message m=0110223

Set of attributes:

{3, 0, 1, 2, 23, 223, 110223, 10223, 0223, 0110223}

Vector of pairs

= (3, 1), (0, 2), (1, 2), (2, 2), (23, 1), (223, 1)…
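The example can be reproduced with a short sketch. It does not build a suffix tree: it enumerates substrings directly (quadratic in message length) and keeps those that would label a suffix tree node, namely every suffix plus every substring followed by at least two different symbols. That reading of "attribute" is an assumption inferred from the set listed here, and the name `entrench` is illustrative.

```python
from collections import defaultdict

def entrench(message: str) -> dict[str, int]:
    """Map each attribute string to its number of occurrences in the message."""
    counts = defaultdict(int)      # substring -> occurrences (overlaps included)
    followers = defaultdict(set)   # substring -> symbols seen right after it
    n = len(message)
    for start in range(n):
        for end in range(start + 1, n + 1):
            sub = message[start:end]
            counts[sub] += 1
            # None marks "end of message", so suffixes are easy to recognize.
            followers[sub].add(message[end] if end < n else None)
    return {
        sub: counts[sub]
        for sub in counts
        if None in followers[sub] or len(followers[sub]) >= 2
    }

pairs = entrench("0110223")
```

For m=0110223 this yields the ten attributes listed in the example, with 0, 1 and 2 each occurring twice and every other attribute once.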


The RIL in the context of Android's Telephony system architecture [ref]


Modules that implement telephony functionality