Protecting the World with Big Data Bill Pfeifer Program Manager Microsoft Malware Protection Center September 2014



Protecting the World with Big Data

Bill Pfeifer
Program Manager
Microsoft Malware Protection Center
September 2014


Abstract

We live in a world of viruses, worms, and browser threats that change and adapt on an hourly basis. Learn how Microsoft’s Protection Team, who bring you Microsoft Security Essentials and Windows Defender, has built and maintained a Big Data solution to protect Windows customers. These efforts offer monitoring and tools for release management, cloud protection, automatic signature generation, and malware research.


About Me

• Tlingit from southeast Alaska
• Interested in electronics and security from a young age
• BS from University of Alaska, Fairbanks
• MS from Purdue
• Member of AISES
• Microsoft for 3.5 years

Tlingit totem pole and community house in Totem Bight State Park, Ketchikan, Alaska. Credit: Bob and Ira Spring


MS Malware Protection Center


Offerings

• Microsoft Security Essentials

• Windows Defender

• System Center Endpoint Protection

• Office 365 Protection

• Azure Protection

• Windows Store protection


MMPC main goals

Disrupt malware ecosystem

Help ensure all Microsoft customers are protected

Build a strong and united ecosystem

• Protect the unprotected
• Remain security vendor agnostic
• Publish world-class security content, world security posture
• Remove malware value proposition
• Reduce malware’s reach and life span with cloud protection, machine learning, automation, and faster sample collection
• Identify and work with partners to eliminate malware monetization schemes
• Drive increased malware sample, telemetry, knowledge sharing
• Formalize strategic partner relationships with vendors, CERTs, E-commerce, application vendors and distributors
• Sponsor coordinated malware eradication campaigns


PROTECTION SENSORS
• Windows 8+ Defender: 55M
• Windows 7- MSE: 94M
• Enterprise SCCM, Intune: 6M
• MSRT Monthly cleanup: 1.2B
• Azure, Office 365
• Windows 7- Defender: 309M

DAILY
• 675K new samples
• 250M cloud calls
• 12 sig releases

RESULTS
• 79% protected
• 13% encounters
• 3.6% infected
• 122M unique files

Too many expired AVs on Windows 8+

Malware out-paces sigs

GOALS

Ensure all of Microsoft’s customers are protected
• Measure, push user-not-protected scenarios

Eradicate malware
• Apply new protection techniques
• Amplify researchers with automation
• Block the first time with the cloud

Lead antimalware ecosystem
• Drive appropriate behavior
• Coordinate activities across industry and ecommerce players
• Fix testing perception, testing approach


Inefficiency: Lingering malware infections

The usual suspects
• Malware families rarely die: 466 make up top 99% of infections
• Disruption helps, but most families come back…
• … and when they do, they come back more resilient


Encounters vs. Infections

Data from Microsoft real time protection clients

Heat map shows rate of encounters (Blue->Green->Yellow->Red)

Country color signifies % of customers with infections

Of note
• Flooding works: more encounters mean more infections
• World-wide: 9% encounters, 3% infections
• The trick is to stop the encounters

http://www.microsoft.com/security/sir/threat/default.aspx


Threat Family reports

Family          Encounters  Infections  Industry Misses
Jenxcus          1,804,868     188,540          177,523
OptimizerElite     195,846     157,777                -
Zbot               237,470     143,353           34,107
Brantall           433,403     113,526                -
Wysotot            536,452     105,791          186,541
Rotbrow            508,577     100,354          271,326
Necurs             127,059      96,505                -
Sality             455,342      91,492           50,485
Rovnix              85,228      71,019                -
Kilim              196,326      67,871            1,198
Ramnit             437,128      67,171           30,629
Upatre             110,502      64,375                -
Clikug             441,677      62,438                -
Gamarue            920,694      58,110           26,624
Virut              223,311      57,594            2,516
Filcout          2,791,531      52,555        7,495,272
Spacekito           77,385      51,652           53,379
Napolar            156,702      51,492            1,973
Alureon             72,305      50,762            3,860
Dorkbot            471,114      48,697            8,128
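
One way to read the threat family table is to compare infections with encounters per family. The sketch below does that for four rows copied from the table. The ratio is an illustration only: the slides do not say whether the two columns cover the same machines or time window, and the function and variable names are made up for this example.

```python
# Rows copied from the threat family table: (family, encounters, infections).
rows = [
    ("Jenxcus", 1_804_868, 188_540),
    ("OptimizerElite", 195_846, 157_777),
    ("Necurs", 127_059, 96_505),
    ("Filcout", 2_791_531, 52_555),
]


def infection_ratio(encounters: int, infections: int) -> float:
    """Infections per encounter (an illustrative measure, not an MMPC metric)."""
    return infections / encounters


# Rank families by how often an encounter is matched by an infection.
ranked = sorted(rows, key=lambda r: infection_ratio(r[1], r[2]), reverse=True)

for family, encounters, infections in ranked:
    print(f"{family:15s} {infection_ratio(encounters, infections):6.1%}")
```

Sorted this way, families with few encounters but many infections (OptimizerElite, Necurs) rise above families that are seen often but mostly blocked (Filcout).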


Antimalware automation

[Diagram: Collection, Big Data (samples, telemetry, reputation, determinations), Analysis, Auto-classification, Signature generation, Telemetry response]

Industry
- Samples
- Meta-data
- Reputation
- Determinations

Customers
- Telemetry
- Samples

Collection
- Industry and customers
- Automatic and on demand

Big Data
- Samples
- Map reduce
- Processed/Workflow

Analysis
- Dynamic and Static
- Vendor rescans/determinations
- Human-supplied patterns

Auto-classification
- Combine analysis with reputation
- Assign determination, family
- Feeds sig-gen and cloud protection

Signature Generation
- Best-fit signature
- Static and proactive
- Signature release pipeline

Telemetry Monitoring
- FP detection
- Never unknowns
- Sample requests


Business Intelligence Team

Query Masters
• Dashboards
• Livesite reporting
• PoR meetings
• Researcher tools
• Query Optimizations


Data Infrastructure


Multiple data sources
• Windows Update
• Watson Error Reporting
• Software Quality Metrics
• Telemetry Threat/suspicious reports


Features


Storage & Usage Numbers

Threat Telemetry

Raw data: 2 TB per day, 360 TB for ½ year
Reduced: 200 GB per day, 36 TB for ½ year

More than 200 engineers and researchers on the protection team

2 Cosmos clusters, 3 VC instances each

2 PB stored between clusters

10K jobs/day

1.5K adhoc jobs/day

4.5 PB read/day by adhoc

Queue wait time of 2 minutes
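
The storage and usage figures above can be cross-checked with a little arithmetic. The sketch below assumes decimal units (1 TB = 1,000 GB, 1 PB = 1,000 TB) and reads "1.5K adhoc jobs/day" as 1,500 jobs; the derived values (days per half year, reduction factor, average read per ad hoc job) are not stated on the slide.

```python
# Figures taken from the slide.
raw_tb_per_day = 2
raw_tb_half_year = 360
reduced_gb_per_day = 200
adhoc_jobs_per_day = 1_500
adhoc_pb_read_per_day = 4.5

# Derived values (assumption: decimal units throughout).
half_year_days = raw_tb_half_year / raw_tb_per_day             # days implied by 360 TB at 2 TB/day
reduced_tb_half_year = reduced_gb_per_day * half_year_days / 1_000
reduction_factor = raw_tb_per_day * 1_000 / reduced_gb_per_day
avg_tb_read_per_adhoc_job = adhoc_pb_read_per_day * 1_000 / adhoc_jobs_per_day

print(f"half year        : {half_year_days:.0f} days")
print(f"reduced, ½ year  : {reduced_tb_half_year:.0f} TB")
print(f"raw vs. reduced  : {reduction_factor:.0f}x")
print(f"avg read per job : {avg_tb_read_per_adhoc_job:.0f} TB")
```

The half-year totals are consistent with a 180-day window, and the reduced stream is a tenth the size of the raw one.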


First Impressions


Issues we found

Missing features
• No coding guidelines
• Limited shared libraries
• No Discoverability
• No scheduling

Impact
• Duplication of work
• Duplication of data
• Long execution time


Expensive operations

CROSS APPLY (de-serializing rows)

CLUSTER BY (partitioning the storage)


Data Skews
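
The slide gives only the title, so the following is a generic illustration of data skew rather than anything taken from the talk. It hash-partitions a made-up workload in which one key dominates and compares the largest partition with the average. The key names, partition count, and `skew_ratio` measure are all assumptions for the example.

```python
import zlib
from collections import Counter


def partition_sizes(keys, num_partitions):
    """Count rows per partition when rows are hash-partitioned by key."""
    sizes = Counter()
    for key in keys:
        # crc32 is used instead of hash() so the result is stable across runs.
        sizes[zlib.crc32(key.encode("utf-8")) % num_partitions] += 1
    return [sizes.get(p, 0) for p in range(num_partitions)]


def skew_ratio(sizes):
    """Largest partition divided by the mean partition size (1.0 = balanced)."""
    return max(sizes) / (sum(sizes) / len(sizes))


# Hypothetical workload: one hot key plus many rare keys.
keys = ["hot-family"] * 9_000 + [f"family-{i}" for i in range(1_000)]
sizes = partition_sizes(keys, num_partitions=8)
ratio = skew_ratio(sizes)

print(sizes)
print(f"skew ratio: {ratio:.1f}")
```

All rows for the hot key land in one partition, so the worker handling it does most of the job while the others sit idle.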


Evolution


How things started to evolve

Intermediate outputs for multi-stage jobs
• Rerun against the middle outputs while developing

Documenting reusable data streams (lookup tables & contextual streams)

Caching historical data
• Need to write stream sets to join over date ranges

Creating views over the caches

Creating libraries
• Common processors (strip the Threat Family Name out of the Threat Name)
• Contextual meta data (geolocation)
• Enumerations

Stopgap Scheduling
• Task scheduler
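
As an illustration of the "common processor" idea above, here is a minimal sketch of stripping the family name out of a threat name. It assumes names follow the Type:Platform/Family.Variant!Suffix pattern; the function name and the sample inputs are made up for this example and are not the team's shared library.

```python
def threat_family(threat_name: str) -> str:
    """Return the family part of a threat name such as 'Worm:Win32/Gamarue.A'."""
    # Drop the 'Type:Platform/' prefix if there is one.
    _, _, tail = threat_name.rpartition("/")
    # The family ends at the variant ('.') or suffix ('!') marker, whichever comes first.
    for separator in (".", "!"):
        tail = tail.split(separator, 1)[0]
    return tail


# Illustrative inputs built from family names that appear in this deck.
for name in ["Worm:Win32/Gamarue.A", "PWS:Win32/Zbot!rfn", "Virus:Win32/Sality.AT!dll", "Necurs"]:
    print(f"{name:28s} -> {threat_family(name)}")
```

Keeping a helper like this in one shared place means every job groups by the same family string, which addresses the duplication of work noted earlier.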


Formalizing a Data Model

4 Metrics/KPIs Views, Metrics/KPI Streams
3 Aggregate Views, Aggregate Streams
2 Filter Views, Filter Streams
1 Curated Views, Curated Streams
0 Raw

Lookup Profiles

File Profile
Key: Sha1/Sha256
Provides: First Seen dates, Prevalence, Top sigseqs, Top filenames, etc.

Family Profile
Key: Family Name
Provides: Family owners, Class, Machine Impacts, etc.

Device Profile
Key: Machine GUID
Provides: Heartbeat rate, City, State, Country, Platform, Top Threat IDs, etc.

Filename Profile
Key: Filename
Provides: Ancestors, Top Threat Ids, Top Sigs, First Seen, etc.

Signature Profile
Key: SigSeq
Provides: Last check-in date, author, prevalence, family association, etc.

Sample Source Profile
Key: Source Name
Provides: count of samples, efficacy of source, rate of samples, etc.

URL Profile
Key: URL
Provides: Top Threat Ids, Top Sigs, First Seen, family association, etc.

IP Profile
Key: IP
Provides: Top Threat Ids, Top Sigs, First Seen, family association, etc.
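
To make the lookup profile idea concrete, the sketch below builds a small File Profile keyed by SHA-1 from a handful of made-up reports. The record fields, the definition of prevalence as the number of distinct reporting machines, and all names are assumptions for illustration; the deck only lists what each profile provides.

```python
from collections import Counter, defaultdict
from dataclasses import dataclass
from datetime import date


@dataclass
class FileProfile:
    sha1: str
    first_seen: date
    prevalence: int        # assumed here: number of distinct machines reporting the file
    top_filenames: list


def build_file_profiles(reports, top_n=3):
    """Group curated reports by file hash and summarise each group."""
    by_hash = defaultdict(list)
    for report in reports:
        by_hash[report["sha1"]].append(report)

    profiles = {}
    for sha1, group in by_hash.items():
        names = Counter(r["filename"] for r in group)
        profiles[sha1] = FileProfile(
            sha1=sha1,
            first_seen=min(r["seen"] for r in group),
            prevalence=len({r["machine"] for r in group}),
            top_filenames=[name for name, _ in names.most_common(top_n)],
        )
    return profiles


# Hypothetical curated reports.
reports = [
    {"sha1": "aa11", "filename": "setup.exe", "machine": "m1", "seen": date(2014, 9, 3)},
    {"sha1": "aa11", "filename": "setup.exe", "machine": "m2", "seen": date(2014, 9, 1)},
    {"sha1": "aa11", "filename": "update.exe", "machine": "m2", "seen": date(2014, 9, 2)},
    {"sha1": "bb22", "filename": "invoice.scr", "machine": "m3", "seen": date(2014, 9, 4)},
]
profiles = build_file_profiles(reports)
print(profiles["aa11"])
```

A job that needs first-seen dates or prevalence can then join against the profile instead of rescanning the raw reports.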


Example Metrics: PSL/ESL

4 PSL / ESL KPI
  Lookup Profiles
3 Misses Aggregate, Actives Aggregate, Incorrect Detections Aggregate, Failures Aggregate, Encounters Aggregate
2 Misses view, Actives view, Incorrect Detections view, Failures view, Encounters View**
1 Canonical Telemetry view*
  File Report view, Memory Report view, Boot-removal Report view, Boot Report view, Rootkit Report view
  File Report, Memory Report, Boot-removal Report, Boot Report, Rootkit Report
0 Raw Telemetry

[Legend: the data model layers, 4 Metrics/KPIs, 3 Aggregate, 2 Filter, 1 Curated, 0 Raw, each with views and streams, plus Lookup Profiles]

PDM: Calculating the PSL, ESL based on the Protection Data Model
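
The layering can be sketched in a few functions, one per layer, stopping at the aggregate layer because the deck does not define how PSL and ESL are computed. Everything below is hypothetical: the raw rows, the field names, and the rules that every curated detection counts as an encounter and that an "active" action marks an active threat.

```python
# Layer 0: raw telemetry (made-up rows: report type, machine, family, action).
raw = [
    ("file", "m1", "Zbot", "blocked"),
    ("file", "m2", "Zbot", "active"),
    ("boot", "m2", "Rovnix", "active"),
    ("memory", "m3", "Sality", "blocked"),
    ("file", "m1", "Zbot", "blocked"),   # duplicate report
]


def curated(rows):
    """Layer 1: canonical view with named fields and duplicates removed."""
    seen = set()
    for row in rows:
        if row in seen:
            continue
        seen.add(row)
        report, machine, family, action = row
        yield {"report": report, "machine": machine, "family": family, "action": action}


def encounters_view(rows):
    """Layer 2 filter: here, every curated detection is an encounter (assumption)."""
    return list(rows)


def actives_view(rows):
    """Layer 2 filter: detections where the threat was running (assumption)."""
    return [r for r in rows if r["action"] == "active"]


def aggregate(rows):
    """Layer 3: distinct machines per family."""
    machines = {}
    for r in rows:
        machines.setdefault(r["family"], set()).add(r["machine"])
    return {family: len(ids) for family, ids in machines.items()}


canonical = list(curated(raw))
encounter_counts = aggregate(encounters_view(canonical))
active_counts = aggregate(actives_view(canonical))
print(encounter_counts)
print(active_counts)
```

A layer 4 KPI would then be computed only from aggregates like these, so every metric shares the same curated and filtered inputs.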


Operationalizing

• Monitoring
    • Job execution
    • Output stream creation
    • Dashboard creation
• Automated scheduling
    • Sangam workflows
• Production level libraries
    • Common source
• Production level caches
    • SLA on bug fixes, breaking change notification
• Documentation library
    • MSDN-like docs for reference and discoverability
• Testing framework


What is next?

• AB Testing
    • Increased production job stability
    • Increase agility
• Automatic Dashboard generation
    • Custom views for each researcher
• Rule based query generation
    • Parallel logic across single data set


© 2013 Microsoft. All rights reserved. Microsoft, Windows and other product names are or may be registered trademarks and/or trademarks in the U.S. and/or other countries. The information herein is for informational purposes only and represents the current view of Microsoft Corporation as of the date of this presentation. Because Microsoft must respond to changing market conditions, it should not be interpreted to be a commitment on the part of Microsoft, and Microsoft cannot guarantee the accuracy of any information provided after the date of this presentation. MICROSOFT MAKES NO WARRANTIES, EXPRESS, IMPLIED OR STATUTORY, AS TO THE INFORMATION IN THIS PRESENTATION.