Detection, Designing, and Propagation Modeling of Advanced Internet Worms
Ph.D. research overview by: Parbati Kumar Manna
Co-advised by: Dr. Sanjay Ranka and Dr. Shigang Chen
2
Overview
• Research opportunities in the area of Internet worms
• Contributions towards my dissertation:
  – Detection of text worms
  – Finding the optimal scanning strategy
  – Propagation modeling for Permutation-Scanning worms
3
Introduction
• Computer Security vs. Network Security
• Malware
  – Computer Viruses
  – Internet Worms
  – Trojans
  – Rootkits
4
Internet Worm
• Huge damage potential
  – Infects hundreds of thousands of computers
  – Costs millions of dollars in damage
  – Melissa, ILOVEYOU, Code Red, Nimda, Slammer, SoBig, MyDoom
• Mostly uses Buffer Overflow
• Propagation is automatic
• Characterized by its host-level and network-level behavior
5
Recent Trends
• Worms becoming increasingly evasive and obfuscative
• Arrival of Script Kiddies
• Emergence of Zero-day worms
• Shift in hacker’s mindset
6
Problem I
Detection of ASCII Worm
7
Motivation
• Presumption of text being benign
• Prevalence of servers expecting text-only input
• Deployment of ASCII filter for bypassing text
• Exponential disassembly cost
• High processing overhead for IDS
8
Buffer Overflow
Overflowing a buffer using an ASCII string:
9
Creation of ASCII Worm
10
Proposed Solution
Malicious:
• Lack of opcodes
• No negative displacement
• Long decrypter
• Long sequence of valid instructions
• No error during execution
Benign:
• Contains characters that correspond to invalid instructions
• Long sequence of contiguous valid instructions unlikely
11
Proposed Solution
• Find out the maximum length of valid instruction sequence
• If it is long enough, the stream contains a worm
Questions:
• How long is “long”?
• What is the probability of false positive for that threshold?
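The detection rule on this slide reduces to a longest-run computation. A minimal sketch, assuming a disassembler has already labeled each decoded instruction as valid or invalid; the function names and the boolean-flag input format are illustrative and are not taken from the slides:

```python
def max_valid_run(valid_flags):
    """Return the length of the longest run of consecutive valid instructions."""
    longest = current = 0
    for is_valid in valid_flags:
        # A valid instruction extends the current run; an invalid one resets it.
        current = current + 1 if is_valid else 0
        longest = max(longest, current)
    return longest


def contains_worm(valid_flags, tau):
    """Flag the stream when its longest valid run exceeds the threshold tau."""
    return max_valid_run(valid_flags) > tau


# Hypothetical labeling of a short stream: V = valid, I = invalid.
flags = [c == "V" for c in "VIVVIVVVVVIVVV"]
print(max_valid_run(flags))
```

The whole stream is handled in a single pass, so the check itself adds little on top of disassembly; choosing `tau` is the open question the slide raises.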
12
Probabilistic Analysis
• Toss a coin n times
• What is the probability that the max inter-head distance is τ?
Head = Invalid Instruction
Tail = Valid Instruction
T H T T H T T T T T H T T T
V I V V I V V V V V I V V V
τ = max inter-head distance
13
Probabilistic Analysis
n = number of coin tosses
p = probability of a head
$X_i$ = R.V.s for inter-head distances
$X_{\max}$ = max inter-head distance
C.D.F. of $X_{\max}$: $\Pr[X_{\max} \le x] = \left[1 - p(1-p)^{x}\right]^{n}$
F.P. rate: $\alpha = 1 - \Pr[X_{\max} \le \tau] = 1 - \left[1 - p(1-p)^{\tau}\right]^{n}$
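The false positive formula can be sanity-checked against simulated coin tosses. The sketch below is illustrative only: it assumes the inter-head distance is counted in tosses (so k tails between two heads give a distance of k + 1), and the parameter values are arbitrary rather than taken from the slides.

```python
import random


def fp_rate_formula(n, p, tau):
    """alpha = 1 - [1 - p(1-p)^tau]^n, as stated on the slide."""
    return 1.0 - (1.0 - p * (1.0 - p) ** tau) ** n


def max_inter_head_distance(n, p, rng):
    """Toss a coin n times; return the longest tail run plus one.

    Assumption: a run of k tails corresponds to an inter-head distance of k + 1.
    """
    longest = current = 0
    for _ in range(n):
        if rng.random() < p:      # head (invalid instruction)
            current = 0
        else:                     # tail (valid instruction)
            current += 1
            longest = max(longest, current)
    return longest + 1


def fp_rate_simulated(n, p, tau, trials=4000, seed=1):
    """Fraction of trials whose max inter-head distance exceeds tau."""
    rng = random.Random(seed)
    hits = sum(max_inter_head_distance(n, p, rng) > tau for _ in range(trials))
    return hits / trials


# Arbitrary illustrative parameters.
print(fp_rate_formula(200, 0.3, 15), fp_rate_simulated(200, 0.3, 15))
```

The two numbers should be close but not identical: the formula treats the n positions as independent, which is an approximation.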
14
Threshold Calculation
Known: n, p, α (false positive rate)
Unknown: τ (max inter-head distance)
Threshold: $\tau = \dfrac{\log\left(1 - (1-\alpha)^{1/n}\right) - \log p}{\log(1-p)}$
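The threshold follows from solving $\alpha = 1 - [1 - p(1-p)^{\tau}]^{n}$ for τ. A small sketch of that calculation, with a round-trip check; the input values are hypothetical, and the use of `expm1`/`log1p` is a numerical-stability choice made here, not something the slides prescribe:

```python
import math


def threshold(n, p, alpha):
    """tau = [log(1 - (1-alpha)^(1/n)) - log p] / log(1 - p)."""
    # 1 - (1-alpha)^(1/n), computed via expm1/log1p to limit rounding error for large n.
    tail = -math.expm1(math.log1p(-alpha) / n)
    return (math.log(tail) - math.log(p)) / math.log(1.0 - p)


def fp_rate(n, p, tau):
    """Inverse relation: alpha = 1 - [1 - p(1-p)^tau]^n."""
    return 1.0 - (1.0 - p * (1.0 - p) ** tau) ** n


# Hypothetical inputs, chosen only for illustration.
for n in (1_000, 10_000, 100_000):
    tau = threshold(n, p=0.2, alpha=0.01)
    print(n, round(tau, 2), fp_rate(n, 0.2, tau))
```

Feeding the computed τ back into `fp_rate` should reproduce α, and τ grows as n grows.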
15
Threshold Calculation
With increasing n, we must choose a larger τ to keep the same false positive rate α
16
Determine n
$n = \dfrac{C}{I} = \dfrac{\text{number of input characters}}{\text{average instruction size}}$
E[I] = E[prefix chain length] + E[core instruction length]
Obtained from character frequency of input data
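One possible way to turn character frequencies into an estimate of n is sketched below. Several parts are assumptions made for illustration and do not come from the slides: characters are treated as independent draws, the prefix set is limited to the x86 prefix bytes that fall in printable ASCII, and the expected core instruction length is simply passed in as `core_len`.

```python
# Printable ASCII characters that decode as x86 instruction prefixes:
# segment overrides ES (0x26), CS (0x2E), SS (0x36), DS (0x3E), FS (0x64), GS (0x65),
# plus operand-size (0x66) and address-size (0x67) overrides.
PREFIX_BYTES = frozenset({0x26, 0x2E, 0x36, 0x3E, 0x64, 0x65, 0x66, 0x67})


def expected_prefix_chain_length(data):
    """E[prefix chain length], assuming characters are independent draws.

    If a fraction q of the characters are prefixes, a chain of consecutive
    prefixes has expected length q / (1 - q) under that assumption.
    """
    q = sum(b in PREFIX_BYTES for b in data) / len(data)
    if q >= 1.0:
        raise ValueError("input consists only of prefix characters")
    return q / (1.0 - q)


def estimate_n(data, core_len):
    """n = C / E[I], with E[I] = E[prefix chain length] + E[core instruction length].

    core_len is a caller-supplied estimate of E[core instruction length];
    the slides do not say how it is derived.
    """
    expected_instruction_size = expected_prefix_chain_length(data) + core_len
    return len(data) / expected_instruction_size


sample = b"GET /index.html HTTP/1.1"     # hypothetical benign input
print(estimate_n(sample, core_len=2.0))  # core_len=2.0 is an arbitrary placeholder
```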
17
Determine p
Invalid Instructions
1. Privileged instructions
2. Wrong Segment Prefix Selector
3. Un-initialized memory access
Only 1. and 2. can be determined on a standalone basis
18
Implementation
[Figure: detection pipeline. Components: Internet, Traffic Data, ASCII Filter (outputs labeled ASCII and binary), Instruction Disassembler, Instruction Sequence Analyzer, ASCII Worm Detector, Binary Worm Detector, Server]
19
Experimental Setup
• Benign data setup
  – ASCII stream captured from live CISE network using Ethereal
• Malicious data setup
  – Existing framework used to generate ASCII worms by converting binary worms
• Promising experimental results for max valid instruction length
  – Benign: all max values below threshold τ
  – Malicious: values significantly higher than τ
20
Experimental Results
21
Contributions
• Analyzed the behavior characteristics & constraints of ASCII worms and devised a detection method
• Derived mathematical foundation for generic detection method used in other worm detection strategies
• Deterministic - no “parameter tuning”
22
Problem II
Random or Pseudo-Random? Finding the Optimal Scanning Strategy
23
Motivation
• Achieve desirable goals of scanning
  – Infection speed
  – Stealth
  – Fault tolerance
[Figure: scale from Bad to Good]
24
Random Scanning Worm
• Uses a Pseudo-Random Number Generator (PRNG)
• Seeded differently for each host
• No idea about when to stop
• Wastes scanning power
• Easy to detect
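One way to make "wastes scanning power" concrete is to count how many uniformly random scans land on an address that was already scanned. The sketch below uses a toy address space and a single shared scan budget; both are assumptions made for illustration, not figures from the slides.

```python
import random


def expected_distinct(space_size, scans):
    """Expected number of distinct addresses hit by uniform random scans."""
    return space_size * (1.0 - (1.0 - 1.0 / space_size) ** scans)


def simulated_distinct(space_size, scans, seed=0):
    """Count distinct addresses hit by one simulated run of random scanning."""
    rng = random.Random(seed)
    return len({rng.randrange(space_size) for _ in range(scans)})


space_size = scans = 1 << 16   # toy address space, not the real IPv4 space
wasted = 1.0 - expected_distinct(space_size, scans) / scans
print(f"expected wasted scans: {wasted:.1%}")
print("one simulated run:", simulated_distinct(space_size, scans))
```

When the number of scans equals the size of the space, roughly a fraction 1/e of the scans are repeats, which is the overhead a scheme that never revisits an address would avoid.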
25
Current Propagation Model
• Current propagation model for RCS worm uses the logistic equation
• For each scan message, the probability of finding an uninfected vulnerable host is assumed to be constant
$\dfrac{d}{dt}[i(t)] = r \times i(t) \times [1 - i(t)] \times \dfrac{V}{N}$
where r = scanning rate, i(t) = infected %, 1 − i(t) = uninfected %, V = vulnerable host population, N = IP space.
The slide annotates the right-hand side with "# scan messages" and "Probability of success".
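The logistic model $di/dt = r \, i(t) \, [1 - i(t)] \, V/N$ can be explored numerically. The sketch below compares forward-Euler integration with the standard closed-form logistic solution; the scanning rate `r` and the initial infected fraction are hypothetical values chosen for illustration.

```python
import math


def logistic_closed_form(t, i0, r, V, N):
    """Closed-form solution of di/dt = r * i * (1 - i) * V / N."""
    k = r * V / N
    growth = i0 * math.exp(k * t)
    return growth / (1.0 - i0 + growth)


def logistic_euler(t_end, i0, r, V, N, dt=0.01):
    """Forward-Euler integration of the same equation."""
    i = i0
    for _ in range(round(t_end / dt)):
        i += r * i * (1.0 - i) * (V / N) * dt
    return i


# Hypothetical parameters: r (scans per second) is chosen arbitrarily.
N, V, r = 2 ** 23, 2 ** 13, 100.0
i0 = 100 / V   # 100 initially infected hosts, as a fraction of V
for t in (0, 30, 60, 90, 120):
    print(t, round(logistic_closed_form(t, i0, r, V, N), 4),
          round(logistic_euler(t, i0, r, V, N), 4))
```

Both columns trace the familiar S-shaped curve: slow start, rapid growth, then saturation as the uninfected fraction shrinks.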
26
Current Propagation Model
• Pseudo-random is not Random
• Constant probability assumption is wrong
[Figure: numbered scan sequences of two infected hosts, A and B]
A now has zero probability of hitting an uninfected host
27
Full-Cycle Worm
• Uses a Linear Congruential Generator with a full period
• Uses Permutation Scanning
• Retires after hitting first already infected host
• Infection rate same as normal random-scanning worm
• Network footprint much lower
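A toy sketch of the two ingredients named on this slide: a linear congruential generator with a full period, and a scanner that retires at the first already infected address. The constants and the 16-bit address space are assumptions chosen for illustration; the slides do not specify the generator parameters.

```python
def lcg_next(x, a, c, m):
    """One step of the linear congruential generator x -> (a*x + c) mod m."""
    return (a * x + c) % m


def scan_until_retire(start, infected, a, c, m):
    """Walk the LCG sequence from `start`; stop at the first infected address.

    Returns the list of addresses scanned before retiring.
    """
    scanned = []
    x = lcg_next(start, a, c, m)
    while x not in infected and len(scanned) < m:
        scanned.append(x)
        x = lcg_next(x, a, c, m)
    return scanned


# Toy parameters (assumptions): m is a power of two, c is odd, a % 4 == 1.
# By the Hull-Dobell theorem this gives a full period of m.
M, A, C = 1 << 16, 1664525, 1013904223

# Full period: starting anywhere, the sequence visits every address exactly once.
seen, x = set(), 0
for _ in range(M):
    x = lcg_next(x, A, C, M)
    seen.add(x)
print(len(seen) == M)

# Two infected hosts share the sequence; host 0 retires when it reaches host 40000.
print(len(scan_until_retire(0, infected={0, 40000}, a=A, c=C, m=M)))
```

Because every host walks the same cycle, a scanner that reaches an infected address knows the addresses beyond it are already being covered by that host, which is why retiring there loses nothing.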
28
Permutation-Scanning
• Randomizes the real address space into a Permutation Ring
• Each freshly infected host starts scanning from a random location
• Retires upon hitting an already infected host
[Figure: real address space mapped to a permutation ring. Labels: active, retired, about to infect, new host jumps, gets infected, jumps]
29
Solution Outline
• Find # (active hosts) scanning
  – effectively (X)
  – ineffectively (Y)
• Among the scans from the effective hosts (X), calculate how many are hitting uninfected hosts.
• Find how many X and Y hosts hit a pre-infected host (and retire).
[Figure: hosts X1, X2, and Y, with the covered area]
30
Vulnerable Host Classification
31
Interaction among Infected Hosts while scanning
32
Final Propagation Model for Full-Cycle Worm
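$f_{eff}(t) = \dfrac{V - i(t) + x(t) - \alpha(t)}{V}$
$f_{ineff}(t) = \dfrac{i(t) - (x(t) - \alpha(t))}{V}$
$f_{new}(t) = \dfrac{V - i(t)}{V - i(t) + x(t) - \alpha(t)}$
$f_{old}(t) = \dfrac{x(t) - \alpha(t)}{V - i(t) + x(t) - \alpha(t)}$
$f_{hit} = r \times dt \times \dfrac{V}{N}$
[Figure labels: X (effective), Y (ineffective), α, Fraction (covered area)]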
33
Final Propagation Model for Full-Cycle Worm
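Infected: $di(t) = x(t)\, f_{hit}\, f_{new}(t)$
$dx(t) = x(t)\, f_{hit}\, f_{new}(t)\, f_{eff}(t) - x(t)\, f_{hit}\, f_{old}(t)$
$d\alpha(t) = x(t)\, f_{hit}\, f_{new}(t)\, f_{eff}(t) - \alpha(t)\, f_{hit}$
$dy(t) = x(t)\, f_{hit}\, f_{new}(t)\, f_{ineff}(t) - y(t)\, f_{hit}$
Retired: $ds(t) = x(t)\, f_{hit}\, f_{old}(t) + y(t)\, f_{hit}$
Active: $da(t) = dx(t) + dy(t)$
$i(0) = a(0) = x(0) = \phi, \quad y(0) = s(0) = \alpha(0) = 0$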
34
Closed-Form Solution for Propagation of Full-Cycle Worm
infected
Active
Retired
Same as Random Scanning worm
35
Model Vs. Simulation
$N = 2^{23}$, $V = 2^{13}$, $\Phi$ (hitlist size) = 100
36
Scanning Peak Independent of the Hitlist Size
37
Current Status and Timeline
38
Current Status
• Detecting ASCII Worms
  – Conference paper titled “DAWN: A Novel Strategy for Detecting ASCII Worms in Networks” accepted for IEEE INFOCOM 2008
  – Conference paper titled “Analysis of Maximum Executable Length for Detecting Text-based Malware” accepted for IEEE ICDCS 2008
• Modeling Permutation Scanning
  – Conference paper titled “Exact Modeling of Propagation for Permutation-Scanning Worms” accepted for IEEE INFOCOM 2008
• Finding Optimal Scanning Strategy
  – Conference paper titled “Impact of Pseudo-Randomness on Internet Worm Propagation” under review for ACM SIGCOMM 2008
39
Questions
40
Thank you