distributed sensor data contextualization for threat intelligence analysis
![Page 1: Distributed Sensor Data Contextualization for Threat Intelligence Analysis](https://reader031.vdocuments.mx/reader031/viewer/2022013013/5878720c1a28ab497b8b65d1/html5/thumbnails/1.jpg)
Distributed Sensor Data Contextualization for Threat Intelligence Analysis
Jason Trost
January 12, 2016
![Page 2: Distributed Sensor Data Contextualization for Threat Intelligence Analysis](https://reader031.vdocuments.mx/reader031/viewer/2022013013/5878720c1a28ab497b8b65d1/html5/thumbnails/2.jpg)
whoami
Jason Trost
• VP of Threat Research @ ThreatStream
• Previously at Sandia, DoD, Booz Allen, Endgame Inc.
• Background in Big Data Analytics, Security Research, and Machine Learning
• Big advocate and contributor to open source:
  • Modern Honey Network, BinaryPig, Honeynet Project
  • Apache Accumulo, Apache Storm, Elasticsearch
• 3rd time participating at FloCon (2013, 2015, 2016)
![Page 3: Distributed Sensor Data Contextualization for Threat Intelligence Analysis](https://reader031.vdocuments.mx/reader031/viewer/2022013013/5878720c1a28ab497b8b65d1/html5/thumbnails/3.jpg)
ThreatStream
• Cyber Security company founded in 2013 and venture backed by Google Ventures, Paladin Capital Group, Institutional Venture Partners, and General Catalyst Partners.
• SaaS based enterprise security software that provides actionable threat intelligence to large enterprises and government agencies.
• Our customers hail from the financial services, healthcare, retail, energy, and technology sectors.
![Page 4: Distributed Sensor Data Contextualization for Threat Intelligence Analysis](https://reader031.vdocuments.mx/reader031/viewer/2022013013/5878720c1a28ab497b8b65d1/html5/thumbnails/4.jpg)
Agenda
• Background
• Sensors
• Enrichment
• Contextualization
• Wrap up
![Page 5: Distributed Sensor Data Contextualization for Threat Intelligence Analysis](https://reader031.vdocuments.mx/reader031/viewer/2022013013/5878720c1a28ab497b8b65d1/html5/thumbnails/5.jpg)
Background
• Huge proliferation of new and old network sensors
  • IDS, Passive Inventory Systems, Malware Sandboxes
  • Honeypots, DNS Sinkholes, Endpoint agents
  • Netflow, Packet logging, etc.
• Many useful data enrichment sources
  • Passive DNS (PDNS), Whois, IP Geolocation
  • Large Malware Metadata Repositories
  • Network Telescopes / Distributed Sensors / Honeypots
  • Port scan and Web crawl data repositories
  • Internal IT Management, Security, and IR Systems
  • Vulnerability Databases
• Huge talent shortage in Security; strong need to make existing analysts better and lower the bar for new analysts
• Lots of opportunities for combining these data sets, interpreting them, and contextualizing events for threat researchers and SOC analysts
• Data overload if not leveraged carefully
• This research started with Honeypots, expanded to other events …
![Page 6: Distributed Sensor Data Contextualization for Threat Intelligence Analysis](https://reader031.vdocuments.mx/reader031/viewer/2022013013/5878720c1a28ab497b8b65d1/html5/thumbnails/6.jpg)
Enrichment
• Datasets that are useful for joining with events
• Both local and external datasets can be useful
• Very useful as features for machine learning models
• Examples:
  • Whois
  • Passive DNS
  • Active probing data repositories (port scan, traceroute, web crawl)
  • Malware Metadata Repositories
  • Threat Intelligence Knowledgebase
  • Rollups, Analytics, Facts from your sensors (e.g. netflow, IDS)
  • Internal IT management, Security, and IR Systems
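As a minimal sketch of what "joining with events" can look like, assume the enrichment datasets are available as in-memory lookup tables keyed by IP address. The field names (`src_ip`, `geo`, `pdns_domains`) and the sample values are hypothetical and not taken from any particular product.

```python
from typing import Any, Dict, List

# Hypothetical enrichment tables keyed by IP address. In practice these
# would be backed by a geolocation database, a PDNS store, and so on.
GEO: Dict[str, str] = {"198.51.100.7": "NL"}
PDNS: Dict[str, List[str]] = {"198.51.100.7": ["a.example", "b.example"]}


def enrich_event(event: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of the event with enrichment fields joined on src_ip."""
    ip = event.get("src_ip")
    enriched = dict(event)
    # Missing keys become None / an empty list so that "no data" can be
    # used as its own feature value by downstream analytics.
    enriched["geo"] = GEO.get(ip)
    enriched["pdns_domains"] = PDNS.get(ip, [])
    return enriched
```

The same flat, enriched record can be shown to an analyst or fed to a classifier as a feature vector.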
![Page 7: Distributed Sensor Data Contextualization for Threat Intelligence Analysis](https://reader031.vdocuments.mx/reader031/viewer/2022013013/5878720c1a28ab497b8b65d1/html5/thumbnails/7.jpg)
Contextualization
• Gather details and related information to make an event or an indicator more actionable
• Guide the analyst towards best practices
• Help analysts work faster/better
• Encode expert knowledge in the analytics and presentation
• Building blocks for more automation, decision support, and features for classifiers
![Page 8: Distributed Sensor Data Contextualization for Threat Intelligence Analysis](https://reader031.vdocuments.mx/reader031/viewer/2022013013/5878720c1a28ab497b8b65d1/html5/thumbnails/8.jpg)
Sensor Combinations
![Page 9: Distributed Sensor Data Contextualization for Threat Intelligence Analysis](https://reader031.vdocuments.mx/reader031/viewer/2022013013/5878720c1a28ab497b8b65d1/html5/thumbnails/9.jpg)
Honeypots
• Software systems designed to mimic vulnerable servers and desktops
• Used as bait to deceive, slow down, or detect hackers, malware, or misbehaving users
• Designed to capture data for research, forensics, and threat intelligence
• Also useful as sinkhole servers when paired with DNS RPZ
![Page 10: Distributed Sensor Data Contextualization for Threat Intelligence Analysis](https://reader031.vdocuments.mx/reader031/viewer/2022013013/5878720c1a28ab497b8b65d1/html5/thumbnails/10.jpg)
Why Honeypots
• Cheapest way to generate threat intelligence feeds around malicious IP addresses at scale
• Internal deployment
  • Behind the firewall
  • Low noise IDS sensors
  • Can be used in conjunction with DNS RPZ as sinkhole webserver
• Local External deployment
  • Who is attacking me?
  • Outside the firewall and on your IP space
• Global External deployment
  • Rented Servers, Cloud Servers, etc.
  • Who is attacking everyone?
  • Global Trends
![Page 11: Distributed Sensor Data Contextualization for Threat Intelligence Analysis](https://reader031.vdocuments.mx/reader031/viewer/2022013013/5878720c1a28ab497b8b65d1/html5/thumbnails/11.jpg)
Modern Honey Network (MHN)
• Open source platform for managing honeypots, collecting and analyzing their data
  • https://github.com/threatstream/mhn
• Makes it very easy to deploy new honeypots and get data flowing
• Leverages some existing open source tools
  • hpfeeds
  • mnemosyne
  • honeymap
  • MongoDB
  • Dionaea, Amun, Conpot, Glastopf
  • Wordpot, Kippo, Elastichoney, Shockpot
  • Snort, Suricata, p0f
![Page 12: Distributed Sensor Data Contextualization for Threat Intelligence Analysis](https://reader031.vdocuments.mx/reader031/viewer/2022013013/5878720c1a28ab497b8b65d1/html5/thumbnails/12.jpg)
Beyond Honeypot Sensors
• Malware Sandboxes
• Sinkholes
• Endpoint Security Products
• Intrusion Detection Systems
• Protocol Analyzers/Decoders
• Passive Device Inventory/Fingerprinting
![Page 13: Distributed Sensor Data Contextualization for Threat Intelligence Analysis](https://reader031.vdocuments.mx/reader031/viewer/2022013013/5878720c1a28ab497b8b65d1/html5/thumbnails/13.jpg)
Malware Sandbox
• Dynamic Execution of Malware to gather IOCs, record execution traces, look for malicious activity
• Deploy IDS on Malware Sandbox (Detonate files or URLs)
  • Signatures identify some types of C2 network traffic
  • Identify Exploit Kit traffic (CVE tagger)
  • Identify sinkhole IPs passively
• Extract indicators, CVEs, Context, make associations
• Any future event regarding these IOCs on your network should be enriched with this context
![Page 14: Distributed Sensor Data Contextualization for Threat Intelligence Analysis](https://reader031.vdocuments.mx/reader031/viewer/2022013013/5878720c1a28ab497b8b65d1/html5/thumbnails/14.jpg)
Sinkholes
• High interaction systems that mimic real services and C2 protocols where possible. Used to identify compromised systems
• Conceptually similar to honeypots, but you drive traffic to them through RPZ
• Use IDS to analyze sinkhole traffic
  • Tag traffic where possible with C2 protocols
• Deploy with p0f to gather host metadata (operating system, uptime, service banners)
• Local Deployment
  • Use RPZ to sinkhole known malicious / suspicious domains
    • Malware C2
    • Dynamic DNS domains
    • Exploit kit domains
  • Identify internal compromised systems
• External Deployment
  • Register expired malicious domains or seize them
  • Identify infected systems across the globe
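To make the local RPZ step concrete, here is a hedged sketch that turns a list of known-bad domains into "local data" records for a BIND-style response policy zone, so that lookups for those names resolve to the sinkhole. The function name `rpz_records` and the sample address are assumptions for illustration; check the record syntax against your resolver's RPZ documentation before use.

```python
from typing import Iterable, List


def rpz_records(domains: Iterable[str], sinkhole_ip: str) -> List[str]:
    """Build RPZ local-data records that answer with the sinkhole IP.

    Owner names are written relative to the policy zone, so they carry
    no trailing dot. Each domain gets an exact and a wildcard entry.
    """
    # Normalize case and trailing dots, and drop duplicates and blanks.
    cleaned = {d.strip().lower().rstrip(".") for d in domains}
    records: List[str] = []
    for domain in sorted(d for d in cleaned if d):
        records.append(f"{domain} IN A {sinkhole_ip}")
        records.append(f"*.{domain} IN A {sinkhole_ip}")
    return records
```

For example, `rpz_records(["bad.example"], "10.0.0.5")` yields one record for the name itself and one wildcard record for its subdomains.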
![Page 15: Distributed Sensor Data Contextualization for Threat Intelligence Analysis](https://reader031.vdocuments.mx/reader031/viewer/2022013013/5878720c1a28ab497b8b65d1/html5/thumbnails/15.jpg)
Automated Incident Response Collection
• Starting Point: Policy Violation, Network IDS Alert, Honeypot Sensor Event, DNS Sinkhole hit, Indicator Match in SIEM, etc.
• Automatically collect host based data, esp. related to the network event
  • Logged in users
  • Running processes
  • DNS cache
  • Open network connections
  • Persistence checks
  • Prefetch files
• Diff the collected data against the previous collection or a “gold image”
• Prepare context for analyst
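The diff step can be sketched as a per-category set comparison. This assumes each collection has already been normalized into a mapping from category name to a set of strings; the category names and the function `diff_collection` are hypothetical.

```python
from typing import Dict, Set

Collection = Dict[str, Set[str]]


def diff_collection(baseline: Collection,
                    current: Collection) -> Dict[str, Dict[str, Set[str]]]:
    """Report what was added or removed in each category since the baseline."""
    report: Dict[str, Dict[str, Set[str]]] = {}
    for category in sorted(set(baseline) | set(current)):
        old = baseline.get(category, set())
        new = current.get(category, set())
        added, removed = new - old, old - new
        # Only categories that actually changed are shown to the analyst.
        if added or removed:
            report[category] = {"added": added, "removed": removed}
    return report
```

The baseline can be either the previous collection from the same host or a gold image; only the changed categories reach the analyst.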
![Page 16: Distributed Sensor Data Contextualization for Threat Intelligence Analysis](https://reader031.vdocuments.mx/reader031/viewer/2022013013/5878720c1a28ab497b8b65d1/html5/thumbnails/16.jpg)
Enrichments
![Page 17: Distributed Sensor Data Contextualization for Threat Intelligence Analysis](https://reader031.vdocuments.mx/reader031/viewer/2022013013/5878720c1a28ab497b8b65d1/html5/thumbnails/17.jpg)
Enrichments: Whois
• Domain registration data
• Query the whois system on-demand (heavily rate limited), query 3rd party providers (pay-per-query), or buy bulk database for offline queries/mining
• Who registered this domain?
• Was this domain registered with a free email provider?
• Was this domain registered with a disposable email provider?
• Privacy protected?
• Is this domain likely sinkholed?
• Registration data congruent?
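Several of these questions reduce to simple lookups once a Whois record has been parsed. The sketch below assumes a record is a dict with `registrant_email` and `registrant_name` keys (hypothetical field names) and uses tiny illustrative provider and keyword lists; real lists would be far longer and maintained separately.

```python
from typing import Dict

# Illustrative samples only, not complete lists.
FREE_EMAIL_DOMAINS = {"gmail.com", "yahoo.com", "outlook.com"}
DISPOSABLE_EMAIL_DOMAINS = {"mailinator.com"}
PRIVACY_KEYWORDS = ("privacy", "whoisguard", "proxy")


def whois_flags(record: Dict[str, str]) -> Dict[str, bool]:
    """Derive yes/no enrichment flags from one parsed Whois record."""
    email = (record.get("registrant_email") or "").strip().lower()
    # Take the part after the last "@"; empty if there is no "@" at all.
    email_domain = email.rpartition("@")[2] if "@" in email else ""
    name = (record.get("registrant_name") or "").lower()
    return {
        "free_email": email_domain in FREE_EMAIL_DOMAINS,
        "disposable_email": email_domain in DISPOSABLE_EMAIL_DOMAINS,
        "privacy_protected": any(k in name for k in PRIVACY_KEYWORDS),
    }
```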
![Page 18: Distributed Sensor Data Contextualization for Threat Intelligence Analysis](https://reader031.vdocuments.mx/reader031/viewer/2022013013/5878720c1a28ab497b8b65d1/html5/thumbnails/18.jpg)
Enrichments: Internal IT, Security, and IR Systems
• Identity Information
• Asset Data
  • Specific Device
  • Owner
  • Device Characteristics
  • Software Inventory
• Asset Discovery Data
• Governance Risk and Compliance (GRC) Systems
• Related Incident Response Tickets
![Page 19: Distributed Sensor Data Contextualization for Threat Intelligence Analysis](https://reader031.vdocuments.mx/reader031/viewer/2022013013/5878720c1a28ab497b8b65d1/html5/thumbnails/19.jpg)
Enrichments: Passive DNS (PDNS)
• What other domains resolved to this IP?
• What other IPs did this domain resolve to?
• Is this domain sinkholed?
• Is this a parking IP?
• Is this domain resolving to an IP using DHCP?
• Fast flux domain?
• Often useful to combine with Whois
  • Common registrant across most domains resolving to single IP? -> Sinkholed
  • Nameserver name contains “sinkhole”, “abused”, “seized”? -> Sinkholed
  • Diverse registrants, common registrar? -> Parking IP (or Shared Hosting)
  • Diverse registrants, uncommon registrar? -> Shared Hosting IP
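The four PDNS + Whois rules above can be sketched as a small classifier. It assumes PDNS has already returned the domains for an IP and that each has been joined with Whois into a dict with `registrant`, `registrar`, and `nameservers` keys. The 0.8 threshold, the minimum of 5 domains, and reading "uncommon registrar" as "no single dominant registrar" are assumptions of this sketch, not values from the talk.

```python
from collections import Counter
from typing import Dict, List

SINKHOLE_NS_KEYWORDS = ("sinkhole", "abused", "seized")


def top_share(values: List[str]) -> float:
    """Fraction of entries taken by the single most common value."""
    if not values:
        return 0.0
    return Counter(values).most_common(1)[0][1] / len(values)


def classify_ip(records: List[Dict], threshold: float = 0.8,
                min_domains: int = 5) -> str:
    """Label an IP from Whois data of the domains that resolve to it."""
    # Rule: a nameserver name containing a sinkhole keyword is decisive.
    for rec in records:
        for ns in rec.get("nameservers", []):
            if any(k in ns.lower() for k in SINKHOLE_NS_KEYWORDS):
                return "sinkhole"
    # Share-based rules are meaningless with only a handful of domains.
    if len(records) < min_domains:
        return "unknown"
    if top_share([r.get("registrant", "") for r in records]) >= threshold:
        return "sinkhole"
    if top_share([r.get("registrar", "") for r in records]) >= threshold:
        return "parking_or_shared_hosting"
    return "shared_hosting"
```

The output is a hint for the analyst rather than a verdict, since a large single-owner hosting setup would also trip the common-registrant rule.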
![Page 20: Distributed Sensor Data Contextualization for Threat Intelligence Analysis](https://reader031.vdocuments.mx/reader031/viewer/2022013013/5878720c1a28ab497b8b65d1/html5/thumbnails/20.jpg)
Enrichments: Active Probing Data
• Internet scale Port scan, Web crawl, traceroute Repositories
• Build your own or leverage 3rd parties
• Host profile
  • Web server?
  • Embedded Device?
  • IOT Device?
  • Router?
  • Workstation?
• C2 Panel?
• Vulnerabilities?
  • Many can be determined unobtrusively
  • Signature Database needed
• Sinkhole?
  • X-Sinkhole header
• SSL Cert Metadata
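The X-Sinkhole check is easy to run over a web crawl repository. A minimal sketch, assuming each crawl result exposes the HTTP response headers as a name-to-value mapping (the function name is hypothetical):

```python
from typing import Mapping


def looks_like_sinkhole(headers: Mapping[str, str]) -> bool:
    """True if a crawled HTTP response carries an X-Sinkhole header."""
    # HTTP header names are case-insensitive, so compare in lowercase.
    return any(name.lower() == "x-sinkhole" for name in headers)
```

A missing header proves nothing, since not every sinkhole operator sets one, so this is best treated as one positive signal among several.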
![Page 21: Distributed Sensor Data Contextualization for Threat Intelligence Analysis](https://reader031.vdocuments.mx/reader031/viewer/2022013013/5878720c1a28ab497b8b65d1/html5/thumbnails/21.jpg)
Contextualization
![Page 22: Distributed Sensor Data Contextualization for Threat Intelligence Analysis](https://reader031.vdocuments.mx/reader031/viewer/2022013013/5878720c1a28ab497b8b65d1/html5/thumbnails/22.jpg)
Honeypot Attacker Profile?
• p0f events?
  • OS?
    • Linux or Windows or other?
  • Uptime?
    • short (less than 1 day)?
    • long (weeks or more)?
  • MTU?
    • Cable?
    • DSL?
    • VPN/tunneled?
• Query PDNS for the IP, filter for recent resolutions
  • Large number of diverse domains? could be a web server
• Query Port scan repository
  • recent port 80/443 open?
• Query threat intelligence knowledge database
  • TOR?
  • I2P?
  • Commercial VPN?
  • Open or Commercial proxy?
Infected Windows workstation?
• home / work
Compromised webserver?
• shared hosting?
• dedicated?
Ephemeral scanning/exploitation server?
Long running scanning server (Shodan, Censys, ZoomEye, TOR nodes)?
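One way to encode this decision flow is a first-match rule list over context that has already been gathered. Everything below is an assumption made for illustration: the input keys (`os`, `uptime_days`, `pdns_recent_domains`, `web_ports_open`, `ti_tags`), the rule order, and the cutoffs of 20 domains and 14 days. The talk does not prescribe specific thresholds.

```python
from typing import Any, Dict

ANONYMIZER_TAGS = {"tor", "i2p", "vpn", "proxy"}
SHARED_HOSTING_MIN_DOMAINS = 20  # assumed cutoff for "many diverse domains"
LONG_UPTIME_DAYS = 14            # assumed cutoff for "weeks or more"


def profile_attacker(ctx: Dict[str, Any]) -> str:
    """Return the first candidate profile whose rule matches the context."""
    tags = {t.lower() for t in ctx.get("ti_tags", ())}
    if tags & ANONYMIZER_TAGS:
        return "anonymization service (TOR/I2P/VPN/proxy)"
    if ctx.get("web_ports_open"):
        if ctx.get("pdns_recent_domains", 0) >= SHARED_HOSTING_MIN_DOMAINS:
            return "compromised webserver (shared hosting)"
        return "compromised webserver (dedicated)"
    if (ctx.get("os") or "").lower().startswith("windows"):
        return "infected Windows workstation"
    uptime = ctx.get("uptime_days")  # from p0f, may be missing
    if uptime is not None and uptime < 1:
        return "ephemeral scanning/exploitation server"
    if uptime is not None and uptime >= LONG_UPTIME_DAYS:
        return "long running scanning server"
    return "unknown"
```

Returning "unknown" instead of guessing lets the presentation layer flag the dead end to the analyst.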
![Page 23: Distributed Sensor Data Contextualization for Threat Intelligence Analysis](https://reader031.vdocuments.mx/reader031/viewer/2022013013/5878720c1a28ab497b8b65d1/html5/thumbnails/23.jpg)
Compromised System – How?
• Attacker using a compromised system?
• Compromised web server?
  • Port scan/Web crawl DB: port 80/443/8080 open?
  • Query PDNS: lots of recent domains, could be shared hosting
• Compromised mail server? Query PDNS
  • Port scan/Web crawl DB: port 25/110/995/143/993 open?
  • domains with mail*, smtp*, pop* subdomains?
• Uptime measurement from p0f?
  • days/weeks/months?
• How did they get in? Query port scan/web crawl data repository
  • Wordpress / Joomla / Drupal?
  • Cpanel / Webmin / Vestacp / Ispconfig / Virtualmin / Ajenti?
  • SSH brute force?
  • IOT device?
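The web and mail server checks can be sketched directly from the port lists and subdomain patterns on this slide. The function name and input shapes (a collection of open ports and a list of PDNS domain names) are assumptions.

```python
from typing import Iterable, Set

WEB_PORTS = {80, 443, 8080}
MAIL_PORTS = {25, 110, 995, 143, 993}
MAIL_PREFIXES = ("mail", "smtp", "pop")


def guess_roles(open_ports: Iterable[int], pdns_domains: Iterable[str]) -> Set[str]:
    """Guess server roles from port scan data and PDNS hostnames."""
    ports = set(open_ports)
    roles: Set[str] = set()
    if ports & WEB_PORTS:
        roles.add("web server")
    # Match the leftmost label against mail*, smtp*, pop* as on the slide.
    mail_names = any(
        d.lower().split(".")[0].startswith(MAIL_PREFIXES) for d in pdns_domains
    )
    if ports & MAIL_PORTS or mail_names:
        roles.add("mail server")
    return roles
```

Prefix matching is deliberately loose (a label such as `popular` also matches `pop*`), so the result is a prompt for the analyst rather than a conclusion.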
![Page 24: Distributed Sensor Data Contextualization for Threat Intelligence Analysis](https://reader031.vdocuments.mx/reader031/viewer/2022013013/5878720c1a28ab497b8b65d1/html5/thumbnails/24.jpg)
Campaign Scope?
• Is this IP attacking just me?
• Are they attacking my vertical?
• Are they attacking everyone?
• Distributed Honeypots or sensors (or data sharing) are key here
  • Query external global deployment
  • Query external local deployment
• Combine Events and summarize
  • first seen / last seen / number of sensors hit / ports involved
  • histogram of activity
  • Summary of exploits used, tools dropped & related C2s
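The "combine events and summarize" step might look like the following rollup for a single attacker IP. The event fields (`timestamp` as an ISO 8601 string, `sensor`, `dest_port`) are a hypothetical schema, and the histogram is bucketed per calendar day as one possible choice.

```python
from collections import Counter
from datetime import datetime
from typing import Any, Dict, List


def summarize_campaign(events: List[Dict[str, Any]]) -> Dict[str, Any]:
    """Roll up all events for one attacker IP across sensors."""
    if not events:
        raise ValueError("need at least one event to summarize")
    times = [datetime.fromisoformat(e["timestamp"]) for e in events]
    return {
        "first_seen": min(times).isoformat(),
        "last_seen": max(times).isoformat(),
        "sensors_hit": len({e["sensor"] for e in events}),
        "ports": sorted({e["dest_port"] for e in events}),
        # Histogram of activity: number of events per calendar day.
        "per_day": dict(Counter(t.date().isoformat() for t in times)),
    }
```

Running the same rollup separately over the local and global deployments gives a quick answer to "just me, or everyone?".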
![Page 25: Distributed Sensor Data Contextualization for Threat Intelligence Analysis](https://reader031.vdocuments.mx/reader031/viewer/2022013013/5878720c1a28ab497b8b65d1/html5/thumbnails/25.jpg)
Attacker Toolkit
• Deploying Honeypots with IDS can assist here
• Snort/Suricata are really useful for adding more context
  • CVE Tagging – roughly 1/3 of the Emerging Threats Snort Rules have CVEs
  • Classify traffic, fingerprint of attack tools?
• Honeypots should collect exploit payloads and commands attempted
• Windows and Linux Malware Sandboxing
  • Execute these commands/scripts (often times wget + execute)
  • Save all payloads
  • Extract host and network IOCs
  • Maintain relationship to original attacker IP
• Query toolsets in VirusTotal
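CVE tagging can be sketched by pulling CVE identifiers out of the `reference:cve,...;` option of the Snort/Suricata rule that fired. The regular expression and function name below are illustrative assumptions, and the rule in the usage note is a made-up example, not an actual Emerging Threats rule.

```python
import re
from typing import List

# Matches e.g. "reference:cve,2014-6271;" with or without a "CVE-" prefix.
CVE_REF = re.compile(
    r"reference\s*:\s*cve\s*,\s*(?:CVE-)?(\d{4}-\d{4,})\s*;", re.IGNORECASE
)


def cves_from_rule(rule: str) -> List[str]:
    """Return the sorted, de-duplicated CVE IDs referenced by one rule."""
    return sorted({f"CVE-{match}" for match in CVE_REF.findall(rule)})
```

For a rule such as `alert tcp any any -> any 80 (msg:"EXAMPLE shellshock attempt"; reference:cve,2014-6271; sid:1000001; rev:1;)`, the alert can then be tagged with the extracted CVE and tied back to the original attacker IP.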
![Page 26: Distributed Sensor Data Contextualization for Threat Intelligence Analysis](https://reader031.vdocuments.mx/reader031/viewer/2022013013/5878720c1a28ab497b8b65d1/html5/thumbnails/26.jpg)
Gotchas
• False positives
• Adversarial manipulation
• Whitelists
• Lots of dead ends, pointing these out to analysts is important
• Rate limiting of enrichments
![Page 27: Distributed Sensor Data Contextualization for Threat Intelligence Analysis](https://reader031.vdocuments.mx/reader031/viewer/2022013013/5878720c1a28ab497b8b65d1/html5/thumbnails/27.jpg)
Conclusion
• Huge proliferation of network sensors and enrichment datasets
• Combining this data is useful
• Lots of opportunity to make security analysts better/faster
  • pre-gather context for user
  • point out gotchas/dead ends
  • guide analyst to best practices
![Page 28: Distributed Sensor Data Contextualization for Threat Intelligence Analysis](https://reader031.vdocuments.mx/reader031/viewer/2022013013/5878720c1a28ab497b8b65d1/html5/thumbnails/28.jpg)
Contact
Jason Trost
• @jason_trost
• jason [dot] trost [AT] threatstream [dot] com
• https://github.com/jt6211
![Page 29: Distributed Sensor Data Contextualization for Threat Intelligence Analysis](https://reader031.vdocuments.mx/reader031/viewer/2022013013/5878720c1a28ab497b8b65d1/html5/thumbnails/29.jpg)
Questions
???