internet outbreaks - university of california, san diego · 2007. 3. 13. · content prevalence : w...
TRANSCRIPT
Internet Outbreaks: Epidemiology and Defenses
Geoffrey M. Voelker
Collaborative Center for Internet Epidemiology and Defenses (CCIED)
Computer Science and Engineering
UC San Diego
February 28, 2007
With David Anderson, Jay Chen, Cristian Estan, Chris Fleizach, Ranjit Jhala, Flavio Junqueira, Erin Kenneally, Justin Ma, John McCullough, David Moore, Vern Paxson (ICSI), Stefan Savage, Colleen Shannon, Sumeet Singh, Alex Snoeren, Stuart Staniford (Nevis), Amin Vahdat, Erik Vandekeift, George Varghese, Michael Vrable, Nick Weaver (ICSI), Qing Zhang
Paradise Lost
Our Goal
Develop the understanding and technology to address large-scale subversion of Internet hosts
Yahoo! and UPF 2
Threat Transformation
Traditional threats
  Attacker manually targets high-value system/resource
  Defender increases cost to compromise high-value systems
  Biggest threat: insider attacker
Modern threats
  Attacker uses automation to target all systems at once (can filter later)
  Defender must defend all systems at once
  Biggest threats: software vulnerabilities & naïve users
Large-Scale Enablers
Unrestricted high-performance connectivity
  Large-scale adoption of IP model for networks & apps
  Internet is high-bandwidth, low-latency
  The Internet succeeded!
Software homogeneity & user naiveté
  Single bug → mass vulnerability in millions of hosts
  Trusting users (“ok”) → mass vulnerability in millions of hosts
Lack of meaningful deterrence
  Little forensic attribution/audit capability
  Effective anonymity
  No deterrence, minimal risk
Driving Economic Forces
Emergence of profit-making payloads
  Spam forwarding (MyDoom.A backdoor, SoBig), credit card theft (Korgo), DDoS extortion, (many) etc.
  “Virtuous” economic cycle transforms nature of threat
Commoditization of compromised hosts
  Fluid third-party exchange market (millions)
  » Going rate for spam proxying: 3-10 cents/host/week
    Seems small, but 25k botnet gets you $40k-130k/yr
  » Raw bots, $0.01+/host; special orders ($50+)
Hosts effectively becoming a criminal platform
  Innovation in both host substrate and its uses
  Sophisticated infection and command/control networks
  DDoS, spam, piracy, phishing, identity theft are all applications
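The "$40k-130k/yr" figure on this slide can be checked with a few lines of arithmetic. The sketch below is only a back-of-the-envelope check: it assumes every host is rented at a flat rate for all 52 weeks of the year, and the function name is made up for illustration.

```python
# Assumption: rates are per host per week, and the whole botnet is rented
# for all 52 weeks of the year. The function name is illustrative only.
WEEKS_PER_YEAR = 52

def annual_revenue_usd(hosts: int, cents_per_host_week: float) -> float:
    """Yearly income from renting every host at a flat weekly rate."""
    return hosts * cents_per_host_week / 100 * WEEKS_PER_YEAR

low = annual_revenue_usd(25_000, 3)    # low end of the quoted rate
high = annual_revenue_usd(25_000, 10)  # high end of the quoted rate
print(f"${low:,.0f} - ${high:,.0f} per year")
```

Under these assumptions the low end lands just under the slide's rounded $40k and the high end at $130k.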
Botnet Spammer Rental Rates
>20-30k always online SOCKs4, url is de-duped and updated every
>10 minutes. 900/weekly, Samples will be sent on request.
>Monthly payments arranged at discount prices.
  3.6 cents per bot week
>$350.00/weekly - $1,000/monthly (USD)
>Always Online: 5,000 - 6,000
>Updated every: 10 minutes
  6 cents per bot week
>$220.00/weekly - $800.00/monthly (USD)
>Always Online: 9,000 - 10,000
>Updated every: 5 minutes
  2.5 cents per bot week
September 2004 postings to SpecialHam.com, Spamforum.biz
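The "cents per bot week" annotations can be roughly reproduced from the three postings. This is a sketch under stated assumptions: the 900/weekly price is taken to be in USD, and the midpoint of each advertised "always online" range is used as the bot count. The slide's own figures are rounded and may use a different point in each range, so the results only need to land in the same ballpark.

```python
def cents_per_bot_week(weekly_price_usd, bots_low, bots_high):
    """Weekly price divided by the midpoint of the advertised bot count (assumption)."""
    midpoint = (bots_low + bots_high) / 2
    return weekly_price_usd * 100 / midpoint

# (weekly price in USD, low bot count, high bot count) for the three postings
postings = [
    (900, 20_000, 30_000),
    (350, 5_000, 6_000),
    (220, 9_000, 10_000),
]
rates = [round(cents_per_bot_week(*p), 1) for p in postings]
print(rates)
```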
Why Worms?
All of these “applications” depend on automated mechanisms for subverting large numbers of hosts
Self-propagating programs continue to be the most effective mechanism for host subversion
Preventing automated subversion would severely undermine phishing, DDoS, extortion, etc.
Our Goal: Develop the understanding and technology to address large-scale subversion of Internet hosts
Today
Worm outbreaks
  What are we up against?
Framing the worm problem…and solutions
  What are our options?
Two worm detection and monitoring techniques
  Fundamental basis for understanding and defending against large-scale Internet attacks
  EarlyBird: High-speed network-based content sifting
  Potemkin: Large-scale high-fidelity honeyfarm
Current projects
Network Telescopes
Idea: Unsolicited packets are evidence of global phenomena
  Backscatter: response packets sent by victims provide insight into global prevalence of DoS attacks (and who is getting attacked)
  Scans: request packets can indicate an infection attempt from a worm (and who is currently infected, growth rate, etc.)
Very scalable: CCIED Telescope monitors 17M+ IP addrs (> 1% of all routable addresses of the Internet)
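One way to see why a telescope reveals global behavior: if a worm picks targets uniformly at random, a telescope covering a fraction f of the address space should see about a fraction f of all scans. The sketch below scales an observed rate back up. The uniform-scanning assumption, the /8-sized monitored block, and the 1,000 scans/sec observation are all illustrative and not taken from the talk.

```python
IPV4_SPACE = 2**32  # total IPv4 addresses; a scanner may skip some ranges

def estimate_global_scan_rate(observed_per_sec, monitored_addrs, space=IPV4_SPACE):
    """Scale the rate seen at the telescope by its share of the scanned space.

    Assumes targets are chosen uniformly at random over `space`.
    """
    coverage = monitored_addrs / space
    return observed_per_sec / coverage

# Hypothetical numbers: a /8-sized telescope (2**24 addresses) seeing 1,000 scans/sec
estimate = estimate_global_scan_rate(1000, 2**24)
print(f"estimated global rate: {estimate:,.0f} scans/sec")
```

Real worms often scan non-uniformly (local preference, buggy random number generators), so an estimate like this is a starting point, not a measurement.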
2001: A DoS Odyssey
Inferring global Internet DoS attacks using backscatter
  4,000 DoS attacks/week, everyone a victim, intense, periodic
Moore et al., Inferring Internet Denial of Service Activity, USENIX Security, 2001
2001: A Worm Odyssey
CodeRed worm released in July 2001
  Exploited buffer overflow in Microsoft IIS
  Infects 360,000 hosts in 14 hours (CRv2)
  » Propagation is limited by latency of TCP handshake
Moore et al., CodeRed: a Case Study on the Spread of an Internet Worm, IMW 2002, and Staniford et al., How to 0wn the Internet in Your Spare Time, USENIX Security 2002
Fast Worms
Slammer/Sapphire released in January 2003
  First ~1 min behaves like classic scanning worm
  » Doubling time of ~8.5 seconds
  >1 min: worm saturates access bandwidth
  » Some hosts issue > 20,000 scans/sec
  » Self-interfering
  Peaks at ~3 min
  » >55 million IP scans/sec
  90% of Internet scanned in <10 mins
Moore et al., The Spread of the Sapphire/Slammer Worm, IEEE Security & Privacy, 1(4), 2003
Was Slammer really fast?
Yes, it was orders of magnitude faster than CodeRed
No, it was poorly written and unsophisticated
Who cares? It is literally an academic point
  The current debate is whether one can get < 500ms
  Bottom line: way faster than people!
Staniford et al., The Top Speed of Flash Worms, ACM WORM, 2004
Understanding Worms
Worms are well modeled as infectious epidemics
  Homogeneous random contacts
Classic SI model
  N: population size
  S(t): susceptible hosts at time t
  I(t): infected hosts at time t
  β: contact rate
  i(t): I(t)/N, s(t): S(t)/N
$$ i(t) = \frac{e^{\beta (t - T)}}{1 + e^{\beta (t - T)}} $$
Staniford, Paxson, Weaver, How to 0wn the Internet in Your Spare Time, USENIX Security 2002
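The logistic curve for i(t) is the closed-form solution of the standard SI rate equation di/dt = β i (1 − i), where T is the time at which half the population is infected (i(T) = 1/2). The sketch below integrates that rate equation with a simple Euler step and compares it to the closed form. The parameter values (β = 1.0 per time unit, 0.1% initially infected) are illustrative assumptions, not fitted to any real worm.

```python
import math

def infected_fraction(t, beta, T):
    """Closed-form SI solution i(t) = e^{beta(t-T)} / (1 + e^{beta(t-T)})."""
    x = beta * (t - T)
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))  # same value, avoids overflow for large x
    e = math.exp(x)
    return e / (1.0 + e)

def simulate_si(beta, i0, dt, steps):
    """Euler integration of di/dt = beta * i * (1 - i), starting from i0."""
    i = i0
    trajectory = [i]
    for _ in range(steps):
        i += beta * i * (1.0 - i) * dt
        trajectory.append(i)
    return trajectory

# Illustrative parameters (assumptions, not measurements)
beta, i0, dt, steps = 1.0, 0.001, 0.01, 1000
T = math.log((1.0 - i0) / i0) / beta  # chosen so the closed form satisfies i(0) = i0
trajectory = simulate_si(beta, i0, dt, steps)

for step in (0, 500, 1000):
    t = step * dt
    print(f"t={t:5.2f}  euler={trajectory[step]:.4f}  closed={infected_fraction(t, beta, T):.4f}")
```

The two curves should track each other closely: slow start, explosive middle, saturation at the end. This shape is why reaction time matters so much for containment.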
What Can We Do?
1) Reduce number of susceptible hosts S(t)
  Prevention
2) Reduce number of infected hosts I(t)
  Treatment
3) Prepare for the inevitable
  Survival
4) Reduce the contact rate β
  Containment
Prevention
Reduce # of susceptible hosts S(t)
Software quality: eliminate vulnerability
  Static/dynamic testing [e.g., Cowan, Wagner, Engler]
  Active research community, taken seriously in industry
  » Security code review alone for Windows Server 2003 ~ $200M
  Traditional problems: soundness, completeness, usability
Software updating: reduce window of vulnerability
  Most worms exploit known vulnerability (10 days to 6 months)
  » Sapphire: Vulnerability & patch July 2002, worm January 2003
  Some activity (Shield [Wang04]), yet critical problem
  Is finding security holes a good idea? [Rescorla04]
Software heterogeneity: reduce impact of vulnerability
  Artificial heterogeneity [Forrest02]
  Exploit existing heterogeneity [Junqueira05]
Treatment
Reduce # of infected hosts I(t)
Disinfection: Remove worm from infected hosts
  Develop specialized “vaccine” in real-time
  Distribute at competitive rate
  » Counter-worm, anti-worm: Code Green, CRclean, Worm vs. Worm [Castaneda04]
  » Exploit vulnerability, patch host, propagate
  Seems tough [Weaver06]
  » Legal issues of using exploits, even if well-intentioned
  » Propagation race problem
Automatically patch vulnerability [Keromytis03], [Sidiroglou05]
  Auto-generate and test patches in sandbox
  Apply within administration domain
  Requires source, targets known exploits (e.g., overflows)
Survival
Prepare for inevitable
  Game of escalation
Approach: Informed replication
  Worms represent large-scale dependent failures
  Model software configurations → model dependent failures
  Replicate data on hosts with disjoint configurations
  Exploit existing software heterogeneity
  Even with software skew, only need 3 replicas
Phoenix
  Cooperative backup system using informed replication
[Junqueira et al., Surviving Internet Catastrophes, USENIX 2005]
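The idea of replicating on hosts with disjoint configurations can be illustrated with a toy greedy selection. This is not the Phoenix algorithm from the paper. It is a minimal sketch that assumes each host is described by a set of software names, with host names and software sets invented for the example.

```python
def pick_disjoint_replicas(owner, hosts, k):
    """Greedily choose up to k hosts whose software shares nothing
    with the owner or with any replica already chosen."""
    used = set(hosts[owner])
    chosen = []
    for name in sorted(hosts):          # deterministic order for the example
        if name == owner:
            continue
        software = hosts[name]
        if used.isdisjoint(software):   # no common OS or application
            chosen.append(name)
            used |= software
            if len(chosen) == k:
                break
    return chosen

# Hypothetical hosts and configurations
hosts = {
    "a": {"windows", "iis"},
    "b": {"linux", "apache"},
    "c": {"windows", "apache"},
    "d": {"freebsd", "nginx"},
}
replicas = pick_disjoint_replicas("a", hosts, k=2)
print(replicas)
```

A worm exploiting any single package here can reach at most one of the owner and its replicas, which is the dependent-failure argument the slide makes.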
Reactive Containment
Reduce contact rate β
Slow worm down
  Throttle connection rate to slow spread [Twycross03]
  Important capability, but worm still spreads…
Quarantine
  Detect and block worm
  How feasible is it?
Defense Requirements
Any reactive defense is defined by:
  Reaction time – how long to detect worm, propagate information, and activate response
  Containment strategy – how malicious behavior is identified
  Deployment scenario – who participates in the system
Given these, what are the engineering requirements for any effective defense?
[Moore et al., Internet Quarantine: Requirements for Containing Self-Propagating Code, Infocom 2003]
Containment Requirements
Universal deployment for Code Red
  Address filtering (blacklists), must respond < 25 mins
  Content filtering (signatures), must respond < 3 hours
For faster worms (Slammer): seconds
  Worse for non-universal deployment…
Bottom line: very challenging (at global scale)
[Figure: reaction time vs. propagation rate (probes/sec)]
Scalable Detection and Monitoring
Detection and monitoring are fundamental for understanding and defending against worms
Lessons from containment
  Need to detect worms in less than a second
  How can we do this?
Know thy enemy
  What does the worm/virus/bot do?
  Who is controlling it?
Signature Inference
Challenge: In less than a second…
  Detect worm probes
  Characterize worm packets with a byte signature
Approach
  Monitor network
  Identify packets with common strings spreading like a worm
  Use signature for content filtering
Content Sifting
Assume unique, invariant string W for all worm probes
  Works today, but not forever
Consequences
  Content prevalence: W more common in worm traffic
  Address dispersion: traffic with W has many distinct src/dests
Content Sifting
  Identify W with high prevalence and high dispersion
  Use W as filter signature in network
[Singh et al., Automated Worm Fingerprinting, OSDI 2004]
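The prevalence and dispersion tests can be sketched with plain dictionaries. This is a naive illustration of the idea, not the EarlyBird implementation: it stores every substring exactly, which a real system could not afford. The window width, the thresholds, and the sample packets are all assumptions made for the example.

```python
from collections import defaultdict

def sift(packets, width=8, prevalence_min=3, dispersion_min=3):
    """packets: iterable of (src, dst, payload_bytes).

    Returns the width-byte substrings that appear in many packets
    (prevalence) and across many distinct sources and destinations
    (dispersion).
    """
    count = defaultdict(int)
    sources = defaultdict(set)
    dests = defaultdict(set)
    for src, dst, payload in packets:
        # Count each distinct substring once per packet
        windows = {payload[i:i + width] for i in range(len(payload) - width + 1)}
        for w in windows:
            count[w] += 1
            sources[w].add(src)
            dests[w].add(dst)
    return {
        w for w in count
        if count[w] >= prevalence_min
        and len(sources[w]) >= dispersion_min
        and len(dests[w]) >= dispersion_min
    }

worm = b"EXPLOIT!"  # stand-in for the invariant string W
packets = [
    ("10.0.0.1", "10.1.0.1", b"aaa" + worm),
    ("10.0.0.2", "10.1.0.2", b"bbbb" + worm),
    ("10.0.0.3", "10.1.0.3", b"cc" + worm),
    # Repeated benign content: prevalent, but only one source and destination
    ("10.0.0.9", "10.1.0.9", b"hello world, benign"),
    ("10.0.0.9", "10.1.0.9", b"hello world, benign"),
    ("10.0.0.9", "10.1.0.9", b"hello world, benign"),
]
signatures = sift(packets)
print(signatures)
```

The benign payload passes the prevalence test but fails dispersion, which is why both conditions are needed.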
Content Sifting in EarlyBird
Challenges: Time and space
  Must touch every byte in all packets (1 Gbps → 12 us/packet)
  Simple algorithm consumes 100 MB/s of memory
Approach: Careful algorithms and data structures
  Incremental hash functions
  Value-based sampling
  Multi-stage filters and multi-resolution and counting bitmaps
  Combined: 60 us/packet in software
Works well in practice
  Deployed at UCSD CSE for 8 months
  Detected every worm outbreak reported on security lists
  Identified unknown worms (Kibvu, Sasser)
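Two of the listed techniques, incremental hashing and value-based sampling, can be sketched briefly. The code below uses a simple polynomial rolling hash as a stand-in, not the fingerprint function EarlyBird uses. The base, modulus, window width, and sampling rate are illustrative assumptions.

```python
BASE = 257
MOD = (1 << 61) - 1  # large prime modulus (assumed for the sketch)

def rolling_hashes(data: bytes, width: int):
    """Yield (offset, hash) for every width-byte window.

    After the first window, each hash is derived from the previous one in
    O(1): drop the outgoing byte, shift, add the incoming byte.
    """
    if len(data) < width:
        return
    top = pow(BASE, width - 1, MOD)  # weight of the outgoing byte
    h = 0
    for b in data[:width]:
        h = (h * BASE + b) % MOD
    yield 0, h
    for i in range(width, len(data)):
        h = (h - data[i - width] * top) % MOD
        h = (h * BASE + data[i]) % MOD
        yield i - width + 1, h

def direct_hash(window: bytes) -> int:
    """Same hash computed from scratch, used only to check the rolling version."""
    h = 0
    for b in window:
        h = (h * BASE + b) % MOD
    return h

def sampled(data: bytes, width: int, keep_one_in: int = 64):
    """Value-based sampling: keep a window only if its hash value falls in a
    fixed residue class, so identical content is always sampled (or always
    skipped) wherever it appears."""
    return [(off, h) for off, h in rolling_hashes(data, width) if h % keep_one_in == 0]

data = bytes(range(256)) * 2
all_windows = list(rolling_hashes(data, 8))
kept = sampled(data, 8)
print(len(all_windows), "windows,", len(kept), "kept after sampling")
```

Sampling on the hash value rather than on packet position is what keeps the scheme consistent: a worm substring that is sampled once is sampled in every packet that carries it.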
Tech Transfer
Content sifting technologies patented by UC and licensed to startup, Netsift Inc.
Netsift significantly improved performance, features
  Hardware implementation, new capabilities
In June 2005, Netsift was acquired by Cisco
Going Further
Network telescopes, content sifting have limitations
  Passive observation, no interaction with malware
  Lexical domain is limited
  » Evasion through polymorphism, protocol framing, encryption
Want to answer deeper questions
  What does a worm/virus/bot do?
  What vulnerabilities are exploited, and how?
  Who is controlling it, how is it controlled?
Alternative: Endpoint monitoring
Scalability/Fidelity Tradeoff
[Figure: spectrum of monitoring approaches, from most scalable to highest fidelity]
  Network Telescopes (passive): most scalable
  Telescopes + Responders (iSink, honeyd, Internet Motion Sensor)
  VM-based Honeynet (e.g., Collapsar)
  Live Honeypot: highest fidelity
Can We Achieve Both?
Naïve approach: one machine per IP address
  1M addresses = 1M hosts = $2B+ investment
  Overkill… most resources will be wasted
In truth, only necessary to maintain the illusion of continuously live honeypot systems
Maintain illusion on the cheap using
  Network multiplexing
  Host multiplexing
Network-Level Multiplexing
Most addresses are idle at any given time
  Late bind honeypots to IP addresses
Most traffic does not cause an infection
  Recycle honeypots if can’t detect anything interesting
  Only maintain honeypots of interest for extended periods
One honeypot for every 100-1000 IP addresses
Host-Level Multiplexing
CPU utilization in each honeypot is quite low (<<1%)
  Use VMM to multiplex honeypots on single machine
  Done in practice, but limited by memory bottleneck
Memory coherence property
  Few memory pages are actually modified while processing input
  Share unmodified pages between VMs copy-on-write
One physical machine for 100-1000 honeypots
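Combining the two multiplexing ratios shows how far the hardware bill shrinks compared with one machine per address. The sketch below simply multiplies out the ranges quoted on the slides for a 1M-address telescope. It ignores burstiness and other real-world factors, and the function name is made up for illustration.

```python
import math

def machines_needed(addresses, addrs_per_honeypot, honeypots_per_machine):
    """Physical machines required after network-level and host-level multiplexing."""
    honeypots = math.ceil(addresses / addrs_per_honeypot)
    return math.ceil(honeypots / honeypots_per_machine)

ADDRESSES = 1_000_000
best = machines_needed(ADDRESSES, 1000, 1000)  # both ratios at their high end
worst = machines_needed(ADDRESSES, 100, 100)   # both ratios at their low end
print(f"{best} to {worst} machines instead of {ADDRESSES:,}")
```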
Potemkin: A High-Fidelity, Large-Scale Honeyfarm
Gateway: Multiplexes traffic onto VM honeypots
Potemkin VMM: Multiplexes VMs on servers
Vrable et al., Scalability, Fidelity, and Containment in the Potemkin Virtual Honeyfarm, SOSP 2005
Potemkin VMM
Modified Xen using shadow translate mode
  Integrated into VT for Windows support
Clone manager instantiates frozen VM image and keeps it resident in physical memory
  Flash cloning: memory instantiated via eager copy of PTE pages and lazy faulting of data pages (no software startup)
  Delta virtualization: copy implemented as copy-on-write (no memory overhead for shared code/data)
Supports hundreds of simultaneous VMs per host
Overhead: currently takes 200-500ms to create new VM
  Imperceptible to human user and under TCP handshake timeout
  Wildly unoptimized (e.g., includes multiple Python invocations)
  » Pre-allocated VMs can be invoked in ~5ms
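The memory saving from delta virtualization can be illustrated with a toy copy-on-write model. This is a conceptual sketch in user-level Python, not how Xen or the Potemkin VMM manages pages. The class names, page counts, and write pattern are all invented for the example.

```python
class ReferenceImage:
    """Frozen VM image whose pages are shared read-only by every clone."""
    def __init__(self, pages):
        self.pages = tuple(pages)

class CloneVM:
    """Clone that stores only the pages it has modified."""
    def __init__(self, image):
        self.image = image
        self.private = {}  # page number -> private copy of that page

    def read(self, n):
        # Private copy if this clone has written the page, else the shared page
        return self.private.get(n, self.image.pages[n])

    def write(self, n, data):
        # First write to a page gives this clone its own copy (copy-on-write)
        self.private[n] = data

    def private_page_count(self):
        return len(self.private)

# Hypothetical sizes: a 1,000-page image and 100 clones that each touch 2 pages
image = ReferenceImage(b"\x00" * 16 for _ in range(1000))
clones = [CloneVM(image) for _ in range(100)]
for idx, vm in enumerate(clones):
    vm.write(0, b"clone-%d" % idx)
    vm.write(7, b"scratch")

pages_with_cow = len(image.pages) + sum(vm.private_page_count() for vm in clones)
pages_full_copy = len(image.pages) * len(clones)
print(pages_with_cow, "pages with sharing vs", pages_full_copy, "with full copies")
```

Because each clone diverges from the reference image on only a few pages, memory grows with the number of modified pages, not with the number of VMs.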
Summary
Internet hosts are highly vulnerable to worm outbreaks
  Millions of hosts can be “taken” before anyone realizes
  Supports vibrant ecosystem of criminal activity
Containment (Quarantine) requires automated response
Prevention is a critical element, but outbreaks inevitable
  Need scalable detection, can also plan to survive (Phoenix)
Different detection strategies, monitoring approaches
  High-speed network-based content sifting (EarlyBird)
  Large-scale high-fidelity honeyfarm (Potemkin)
Smart bad guys still have a huge advantage
  Escalation: Rapid innovation in both problems and solutions
Underground Economy
Acquisition, trade, liquidation of illicit digital goods
  ccard, phishing, bots, malware, scams, …
  Online markets, market enablers, cash out, …
Hypothesis: Understanding the underground economy will help us develop/target technology
  Where are economic bottlenecks? Where is value-chain brittle? Where are participants exposed? Transaction volume/price dynamism?
Data sources
  Spam, IRC feeds/Web forums, phishing drop sites
  Have one spam feed (200K/day), developing relationships for others…but always looking for more data
Spamscatter
Monitor scam sites advertised in spam
  Extract URLs to scams from spam
  Probe, download pages for a week
Identify multiple sites for the same scam
  Image shingling: tolerates ad rotation, etc.
Workload
  Spam from 4-letter TLD (200,000 spams/day)
What do we find?
  2,300 scams/week
  60% scam sites in U.S.
  » (vs. 13% spam relays)
  Only 10% scams “malicious”
  » (vs. pharm, s/w, merchandise, etc.)
  38% sites hosted multiple scams
Other Projects
Self-moderating outbreaks (get 80% and stop) [Ma05]
Prevalence of polymorphism in exploits [Ma06]
Forensics with honeyfarm
  Network dynamics, network and host support
Data-centric attribution and policy enforcement
  These files should not leave the corporate network
  These files always need to be encrypted on disk/network
  And any objects derived from them (emails w/ attachments, cut-and-paste, etc.)
  Use generalized taint mechanisms and virtual machines
Privacy-preserving packet attribution
  Attribution: routers, hosts can verify packet sources
  Privacy: …but not reveal contents
  Attribution one step towards deterrence
For More Info…
http://www.ccied.org