introspective networks george varghese university of california, san diego
Post on 18-Dec-2015
Introspective Networks
George Varghese
University of California, San Diego
1. Basic: stateless, transparent.
Tools: protocol design (e.g., soft-state)
2. Active: customizable, re-configurable
Tools: Code Safety (e.g., sandboxing)
3. Cognitive: intelligent, reasoning
Tools: AI (e.g., multi-agent systems)
4. Introspective: pattern detection/response
Tools: Streaming algorithms, statistical inference (e.g. Bloom Filters, sampling)
Network Evolution?
What is Introspection?
Detecting patterns in data traffic, either in real-
time or based on packet logs. Examples:
Measurement Introspection: Identify resource
usage patterns for better resource management
Security Introspection: Identify attack patterns to
mitigate or prevent attacks.
Fault Introspection: Identify fault or anomaly
patterns to allow automated fault repair.
Motivated by market pull and technology push
Market Pull 1: Better ROI for ISPs
• Better ROI: Optimize resources (BGP policy, OSPF weights, light up fibers, add bandwidth) based on resource usage patterns.
• Better Isolation: Better QoS during attacks (200 msec versus 2000 msec delay during Slammer) is a major differentiator.
• Competitive Edge: Just as banks use data mining to better manage loan portfolios, can better manage “bandwidth portfolio”.
Sprint Monitoring Proposal, IETF BOF 2003
[Figure labels: ISP, Customer Site 1, Customer Site 2, Customer Site 3, "reroute or add B/W".]
Market Pull 2: Costs of (In)Security
• Cost: Too many isolated perimeter solutions (firewalls, IDS devices, patches). Total cost of ownership (TCO) very high.
• Delay: When perimeter detects, damage is already done.
• Complexity: End users finding and installing patches; or require router support for traceback which could be used for detection.
Gartner Research: Security solutions deployed within enterprises by 2004 and within ISPs by 2006
[Figure labels: ISP, Attacker, Zombie 1 through Zombie N, Victim, IDS, Firewall (patches), traceback.]
Technology Push: Streaming Algorithms and Hardware Gates
• Algorithms: Recent major thrust in streaming algorithms in database, web analysis, theory, networks
• Hardware: Memory accesses remain expensive (< 100) and SRAM not scaling as fast as number of connections (< 32 Mbits), but gates are plentiful.
• Mapping: Many randomized streaming algorithms (e.g., Bloom Filters, Min-wise hashing) developed to find patterns in disk logs map well to network ASICs.
• Opportunity: Invent or adapt streaming algorithms for networking patterns.
Concerns about Network Introspection
• Speed: Can hardware run fast enough? Recall IP lookups in 1990’s, surprisingly complex things (branch
predictors, TCP Offload) being done routinely today.
Even if not, can use algorithms to mine packet logs offline for insight.
• Inflexible: Hardware not easy to change. Design hardware to identify useful “primitive” patterns that can be
combined.
Network Processors (ISCA 2003) can offer flexibility & speed.
• End-to-end argument: Not simple, stateless core. Not required for correctness of basic forwarding, but only as an
optimization or value-add.
Introspection as Pattern Detection
• Within Packet Patterns: Prefix matches, classification, signature detection (e.g., Code Red Payload)
• Across Packet Patterns: Scheduling, Timing, Heavy-hitters, large flows, partial completion.
[Figure: packet stream S1 S2 S2 S5 S2 S1 arriving at a ROUTER.]
Pattern Detection Algorithm Requirements
• Low memory: On-chip SRAM limited to around 32 Mbits. Not constant but is not scaling with number of concurrent conversations.
• Small processing: For wire-speed at 40 Gbps, using 40 byte packets, have 8 nsec. Using 1 nsec SRAM, 8 memory accesses. Factor of 30 in parallelism buys
240 accesses.
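The processing budget above is simple arithmetic, sketched here in Python. The function names are illustrative and not from the talk; the inputs (40 Gbps, 40-byte packets, 1 nsec SRAM, parallelism of 30) are the slide's own figures.

```python
def packet_time_ns(link_gbps: float, packet_bytes: int) -> float:
    """Time one packet occupies the wire, in nanoseconds.

    bits / (Gbit/s) gives nanoseconds directly.
    """
    return packet_bytes * 8 / link_gbps


def access_budget(link_gbps: float, packet_bytes: int,
                  sram_ns: float, parallelism: int = 1) -> int:
    """Memory accesses available per packet at wire speed."""
    serial = int(packet_time_ns(link_gbps, packet_bytes) / sram_ns)
    return serial * parallelism


print(packet_time_ns(40, 40))          # nanoseconds per 40-byte packet
print(access_budget(40, 40, 1.0))      # serial accesses
print(access_budget(40, 40, 1.0, 30))  # with a factor of 30 in parallelism
```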
Talk Outline
• Part 1: Motivation
• Part 2: Basic Patterns and Algorithms (heavy-hitters, many flows, partial completion)
• Part 3: Combining patterns to solve useful
application problems
• Part 4: Conclusions.
Pattern 1: Heavy-hitters
Heavy-hitters: In a measurement interval (e.g., 10 minutes), detect the flows (e.g., sources) on a link that send more than a threshold (say 1% of the link's traffic).
S1 S6 S2 S5 S2 S2
Source S2 is 30 percent of traffic sequence
Estan, Varghese, ACM TOCS 2003
Heavy-Hitters via Multistage Filters
[Figure: Field Extraction feeds Hash 1, Hash 2, and Hash 3; each hash selects one counter to increment in Stage 1, Stage 2, and Stage 3; a comparator per stage checks its counter. ALERT if all counters are above threshold.]
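A minimal software sketch of the multistage filter in the figure, assuming byte counts as the traffic measure. The class name, the SHA-256-based stage hashes, and the "at or above threshold" comparison are illustrative choices, not details from the talk; a hardware design would use cheap independent hash functions.

```python
import hashlib


class MultistageFilter:
    """Parallel counter arrays; a flow is flagged only if all its counters are hot."""

    def __init__(self, stages: int = 3, buckets: int = 1000, threshold: int = 1000):
        self.stages = stages
        self.buckets = buckets
        self.threshold = threshold
        self.counters = [[0] * buckets for _ in range(stages)]

    def _index(self, stage: int, flow_id: str) -> int:
        # Assumption: one hash per stage, derived by prefixing the stage number.
        digest = hashlib.sha256(f"{stage}:{flow_id}".encode()).digest()
        return int.from_bytes(digest[:8], "big") % self.buckets

    def add(self, flow_id: str, size: int) -> bool:
        """Add `size` bytes for the flow; return True if every stage is at threshold."""
        flagged = True
        for stage in range(self.stages):
            i = self._index(stage, flow_id)
            self.counters[stage][i] += size
            if self.counters[stage][i] < self.threshold:
                flagged = False
        return flagged


f = MultistageFilter(stages=3, buckets=1000, threshold=100)
print(f.add("S1", 10))                       # small flow, first packet
print([f.add("S2", 20) for _ in range(10)])  # large flow crosses the threshold
```

A large flow is never missed, since its own traffic alone pushes every one of its counters over the threshold; small flows are flagged only if they collide with heavy buckets in every stage.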
Multistage filters in Action
[Figure: counters in Stages 1 to 3 compared against a threshold. Grey = other flows, yellow = small flow, green = large flow.]
Multistage Filter Analysis
Assume a 1 percent threshold. Bound the probability that a flow F of 0.1% or less gets through 6 stages of 1000 buckets each.
• Why trouble?: F can fall into a “hot” bucket if and only if the sum of the traffic of all other flows in that bucket is more than 0.9%.
• Single stage probability: At most 100/0.9 = 111 buckets can be over 0.9% before we bring on F. Thus the probability that F falls in a “hot” bucket is less than 111/1000 = 0.111.
• Multistage probability: To be branded, F must be unlucky in all 6 stages, with a probability of no more than 0.111^6, which is very small. Thus at most 1000 false positives with very high probability.
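The bound above can be reproduced numerically. This is a small worked calculation using the slide's parameters (1% threshold, flow of 0.1%, 1000 buckets, 6 stages); the function name is illustrative.

```python
def false_positive_bound(threshold_pct: float, flow_pct: float,
                         buckets: int, stages: int) -> tuple[float, float]:
    """Return (single-stage bound, all-stages bound) for a small flow."""
    # Traffic from other flows needed to make a bucket hot for this flow.
    needed_pct = threshold_pct - flow_pct
    # Total traffic is 100%, so at most this many buckets can hold that much.
    max_hot_buckets = 100.0 / needed_pct
    p_single = max_hot_buckets / buckets
    # Stages use independent hashes, so the per-stage bounds multiply.
    return p_single, p_single ** stages


p_single, p_all = false_positive_bound(1.0, 0.1, buckets=1000, stages=6)
print(f"single stage: {p_single:.3f}")
print(f"all 6 stages: {p_all:.2e}")
```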
Pattern 2: Partial Completion
Partial Completion: In a measurement interval, detect the flows (e.g., destinations) which have several Start Packets (e.g., SYN) without the corresponding End (e.g., FIN).
Destination X has 3 partial completions in sequence
SYNx SYNy SYNz FINy SYNx SYNx FINz
Partial Completion Filters
[Figure: Field Extraction feeds Hash 1, Hash 2, and Hash 3; each hash selects one counter in Stage 1, Stage 2, and Stage 3; a comparator per stage checks its counter. Increment for SYN, decrement for FIN. ALERT if all counters are above threshold.]
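A minimal sketch of a partial completion filter, assuming destinations as the flow key. The class and method names, the SHA-256-based hashes, and the "at or above threshold" test are illustrative assumptions. The example replays the SYN/FIN sequence shown earlier.

```python
import hashlib


class PartialCompletionFilter:
    """Multistage counters: +1 for a start packet (SYN), -1 for an end packet (FIN)."""

    def __init__(self, stages: int = 3, buckets: int = 1000, threshold: int = 3):
        self.stages = stages
        self.buckets = buckets
        self.threshold = threshold
        self.counters = [[0] * buckets for _ in range(stages)]

    def _index(self, stage: int, dest: str) -> int:
        # Assumption: one hash per stage, derived by prefixing the stage number.
        digest = hashlib.sha256(f"{stage}:{dest}".encode()).digest()
        return int.from_bytes(digest[:8], "big") % self.buckets

    def observe(self, dest: str, kind: str) -> None:
        delta = {"SYN": 1, "FIN": -1}[kind]
        for stage in range(self.stages):
            self.counters[stage][self._index(stage, dest)] += delta

    def alert(self, dest: str) -> bool:
        """True if every counter for this destination is at or above threshold."""
        return all(
            self.counters[stage][self._index(stage, dest)] >= self.threshold
            for stage in range(self.stages)
        )


pcf = PartialCompletionFilter(threshold=3)
for kind, dest in [("SYN", "x"), ("SYN", "y"), ("SYN", "z"), ("FIN", "y"),
                   ("SYN", "x"), ("SYN", "x"), ("FIN", "z")]:
    pcf.observe(dest, kind)
print(pcf.alert("x"))  # x has 3 SYNs and no FIN
```

Because completed connections cancel out, y and z contribute nothing to any counter by the end of the sequence, even if they share a bucket with x.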
[Figure: timeline across Intervals 1 to 4 with labels: Long Lived Connection, SYNy Retransmissions, FINz Retransmissions, SYNx FINx.]
Analysis 1: Benign but Malformed Connections
Model benign but malformed connections as adding an extra SYN or FIN to an interval with probability 0.5.
Analysis 2: using Gaussian approximation
[Figure: probability (y-axis, 0 to 0.45) versus counter values (x-axis, -6 to 6), with the region greater than 6 marked.]
Probability of false positives = 0.0013
Probability of false negatives = 0.0013
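The 0.0013 figure matches the one-sided tail of a normal distribution at three standard deviations. The sketch below checks that under an assumed standard deviation of 2, so that the counter value 6 sits at three sigma; that value is inferred from the numbers on the slide, not stated in the talk.

```python
import math


def upper_tail(threshold: float, sigma: float, mean: float = 0.0) -> float:
    """P(X > threshold) for X ~ Normal(mean, sigma^2)."""
    z = (threshold - mean) / sigma
    return 0.5 * math.erfc(z / math.sqrt(2))


# Assumption: sigma = 2 (hypothetical), so a counter value of 6 is 3 sigma out.
print(f"{upper_tail(6, 2):.4f}")
```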
Pattern 3: Many Flows
Many Flows: In a measurement interval, determine whether the number of distinct flows exceeds a threshold.
S1 S6 S2 S5 S2 S2
6 packets but only 4 distinct sources
Simple Bitmap counting
Problem: bitmap takes too much memory to count a large number of flows
[Figure: the flow identifier F is hashed to select one bit in the bitmap; bits already set are shown as 1.]
Estimate: based on the number of bits set
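A minimal sketch of simple bitmap counting. The slide says only that the estimate is based on the number of bits set; the formula used here, b * ln(b / zeros), is the standard linear-counting estimator and is an assumption about what the talk intends. The function name and SHA-256 hash are illustrative.

```python
import hashlib
import math


def bitmap_estimate(flow_ids, bits: int = 1024) -> float:
    """Estimate the number of distinct flows from a hashed bitmap."""
    bitmap = [False] * bits
    for flow_id in flow_ids:
        digest = hashlib.sha256(flow_id.encode()).digest()
        bitmap[int.from_bytes(digest[:8], "big") % bits] = True
    zeros = bitmap.count(False)
    if zeros == 0:
        return float("inf")  # bitmap saturated: too many flows for this size
    # Correct for collisions: expected zeros = bits * exp(-n / bits).
    return bits * math.log(bits / zeros)


# The packet sequence from the slide: 6 packets, 4 distinct sources.
print(bitmap_estimate(["S1", "S6", "S2", "S5", "S2", "S2"]))
```

Repeated packets from one flow set the same bit, so only distinct flows affect the estimate.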
Sampled Bitmap counting
Problem: inaccurate if too few or too many flows
Solution: keep only a sample of the bitmap
1 1
Estimate: scale up sampled count
Multi-resolution Bitmap counting
Solution: multiple bitmaps, each covering a different range
Estimate: use first bitmap that has less than 93.1% of its bits set, count, scale
1-10 flows
10-100
100-1000
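A simplified sketch of the multi-resolution idea as the slide describes it: several bitmaps, each seeing a smaller hashed sample of the flows, with the estimate taken from the first bitmap that is less than 93.1% full and then scaled up. The sampling rates, bitmap size, class name, and linear-counting formula are assumptions for illustration; the published algorithm is more refined than this.

```python
import hashlib
import math

FILL_LIMIT = 0.931  # from the slide: use the first bitmap below this fill level


def _hash64(flow_id: str, salt: str) -> int:
    digest = hashlib.sha256(f"{salt}:{flow_id}".encode()).digest()
    return int.from_bytes(digest[:8], "big")


class MultiResolutionBitmap:
    def __init__(self, bits: int = 256, rates=(1.0, 0.1, 0.01)):
        self.bits = bits
        self.rates = rates  # hypothetical sampling rate for each bitmap
        self.maps = [[False] * bits for _ in rates]

    def add(self, flow_id: str) -> None:
        # A flow is sampled by its hash, so repeats of a flow behave identically.
        u = _hash64(flow_id, "sample") / 2**64
        slot = _hash64(flow_id, "slot") % self.bits
        for bitmap, rate in zip(self.maps, self.rates):
            if u < rate:
                bitmap[slot] = True

    def estimate(self) -> float:
        for bitmap, rate in zip(self.maps, self.rates):
            ones = sum(bitmap)
            if ones / self.bits < FILL_LIMIT:
                zeros = self.bits - ones
                # Count within this bitmap, then scale by its sampling rate.
                return self.bits * math.log(self.bits / zeros) / rate
        return float("inf")  # every bitmap is saturated


mrb = MultiResolutionBitmap()
for i in range(5000):
    mrb.add(f"flow-{i}")
print(round(mrb.estimate()))
```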
Outline of Talk
• Part 1: Motivation
• Part 2: Basic Patterns and Algorithms
• Part 3: Combining base patterns to solve useful application problems (traffic matrix, DoS, worms)
• Part 4: Conclusions.
Application 1: Traffic Matrix
• Each entry router uses a multistage filter on traffic to destination prefixes to isolate subnets to which there is large traffic.
• Aggregating across all entry routers gives the “dominant” part of traffic matrix. ATT reports 80-20 rule for prefixes.
[Figure labels: ISP, Customer Site 1, Customer Site 2, Customer Site 3, "reroute or add B/W".]
Application 2, Process Logs to Find Large Bandwidth Usage Patterns
Multidimensional analysis via our tool
Old methods look at a single dimension at a time
Estan, Savage, Varghese, SIGCOMM 2003
Application 3: DoS Attacks
• Bandwidth attacks: (e.g., Smurf). Pound victim with large traffic of a certain type. Use heavy-hitter pattern relative to traffic type (e.g., ICMP) to find attacked destinations.
• Partial Completion attacks: (e.g., TCP SYN-Flood). May not be unusual bandwidth but characterized by partial connections. Use partial completion pattern?
Syn-Flood Detection Options
[Figure labels: Network Core, Attacker ISP, Attacker 1 through Attacker n, Victim ISP, Victim. Options marked: Back-Scatter Detection, Syn-Kill, Syn-Defender, Multops, Syn-cookie/cache, Syn-Dog, TraceBack.]
OUR SOLUTION: Partial Completion Filters in network
PCF Deployment Options
[Figure labels: Network Core, Attacker ISP, Attacker 1 through Attacker n, Victim ISP, Victim, Back-Scatter Vantage Point.]
Destination based SYN-FIN PCF for detection and defense (can be spoofed)
Source based SYN-ACK/FIN PCF for BackScatter detection (Spoof-Proof)
Application 4: Worm Detection
• Concrete approaches to worm containment: routers block
packets with specific code signature.
• Manual signature extraction: slow and enormous effort for each
new worm.
• Automatic signature extraction of a specific worm by
automatically detecting an abstract worm.
[Figure labels: ISP, Infected 1 through Infected N, New Victim, Inactive Address.]
Abstract Worm Definition
• F1, Content Repetition: Payload of worm is seen frequently at router.
• F2, Increasing Infection Levels: Same content is disbursed to increasing number of distinct source-destination pairs.
• O1, Random Probing: Worm replicates by probing random IP addresses.
• O2, Code fragments: Worm payload contains content that has some resemblance to code.
Abstract Worm Detection
• F1, Content Repetition: Use heavy-hitter pattern with hash H of content as index.
• F2, Increasing Infection Levels: Use many flows pattern with content hash H as index.
• O1, Random Probing: Count destinations that are sent content with hash H in a sample of unused address space (Telescope, Moore et al.)
• O2, Code fragments: Simple offline tests that check, say, for 8086 control transfer op-codes.
First 3 tests need low memory, small processing
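A minimal sketch of how tests F1 and F2 combine on a shared content hash H. For readability it uses an exact dictionary and exact sets where the talk calls for the low-memory heavy-hitter and many-flows patterns; the class name and both thresholds are hypothetical.

```python
import hashlib
from collections import defaultdict


class AbstractWormDetector:
    """Flag content that repeats often (F1) across many source-destination pairs (F2)."""

    def __init__(self, min_repeats: int = 3, min_pairs: int = 3):
        self.min_repeats = min_repeats
        self.min_pairs = min_pairs
        # Stand-in for a multistage filter indexed by content hash.
        self.repeats = defaultdict(int)
        # Stand-in for a per-hash bitmap counting distinct pairs.
        self.pairs = defaultdict(set)

    def observe(self, src: str, dst: str, payload: bytes) -> bool:
        h = hashlib.sha256(payload).hexdigest()
        self.repeats[h] += 1
        self.pairs[h].add((src, dst))
        return (self.repeats[h] >= self.min_repeats
                and len(self.pairs[h]) >= self.min_pairs)


det = AbstractWormDetector()
payload = b"GET /default.ida?..."  # placeholder bytes, not a real signature
for src, dst in [("a", "b"), ("a", "b"), ("a", "b"), ("a", "c"), ("b", "d")]:
    print(src, dst, det.observe(src, dst, payload))
```

Repetition alone does not trigger an alert: the first three packets repeat the content but cover only one pair, so the flag is raised only once the content has spread to three distinct pairs.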
Spectre of Polymorphism
• Syntactic Polymorphism: Fragmentation on links with different MTU sizes, offsets, No-Ops (use Rabin fingerprints at sampled offsets, but this does not help in case of encryption.)
• Semantic Polymorphism: Code rewriting at each new source (hard to detect, but raises bar to include a small compiler with worm payload.)
EarlyBird Experience
• System: Uses 39 byte Rabin fingerprints on tcpdump, looks for content repetition above low threshold, large memory currently.
• Deployment: sniffs on uplink of lab switch. 9 day period between May 2nd and May 10th 2003. 4 million packets
• Latent Worms Found:
-- (742 pairs) TCP/139 NetBios Attack
-- (51 pairs) Code Red TCP/80 GET /default.ida
-- Linux Slapper and 1 Unicode exploit
• False positives: "robots.txt", "SSH-1.99-3.1.1 SSH Secure Shell for Windows", some VNC strings
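A sketch of fingerprinting every 39-byte window of a payload, so that repeated content is found at any offset. This uses a Rabin-Karp style polynomial rolling hash as a stand-in; it is not a true Rabin fingerprint over GF(2), and the base and modulus are arbitrary illustrative constants rather than EarlyBird's parameters.

```python
WINDOW = 39          # window length from the slide
BASE = 257           # assumption: arbitrary base larger than any byte value
MOD = (1 << 61) - 1  # assumption: a large prime modulus


def rolling_fingerprints(payload: bytes, window: int = WINDOW) -> list[int]:
    """Return one fingerprint per window position, computed incrementally."""
    if len(payload) < window:
        return []
    top = pow(BASE, window - 1, MOD)  # weight of the byte leaving the window
    h = 0
    for byte in payload[:window]:
        h = (h * BASE + byte) % MOD
    out = [h]
    for i in range(window, len(payload)):
        # Drop the oldest byte, shift, and append the newest byte.
        h = ((h - payload[i - window] * top) * BASE + payload[i]) % MOD
        out.append(h)
    return out


signature = bytes(range(39))  # placeholder 39-byte content
a = rolling_fingerprints(b"AAAA" + signature)
b = rolling_fingerprints(b"ZZ" + signature + b"Q")
print(a[4] == b[2])  # same content at different offsets gives the same fingerprint
```

Each window costs constant work after the first, which is what makes per-offset fingerprinting plausible at line rate.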
Recent Experience with EarlyBird
• On Aug 11th, Monday afternoon, found 133 repetitions of content for an RPC service. Lab machines stayed up but received many infection attempts
• Major security companies were already on the lookout for this, so MSBlaster was detected quickly.
• On the evening of Monday Aug 11th, my home computer began rebooting every few minutes saying “mumble RPC mumble”
Conclusions
• Measurement introspection can improve ISP ROI and security introspection can reduce TCO.
• Can implement base patterns at high speeds.
• Base patterns can be combined to solve useful application issues (traffic matrix, DoS, worms, etc.)
• Only scratching the surface: fault introspection, etc.
Joint work with collaborators
• Stefan Savage (AutoFocus, EarlyBird)
• Students in Internet Algorithmics Lab: Ramana Kompella, Cristian Estan, Sumeet Singh