carnegie mellon university notice · building and scaling a malware analysis system 1 sei webinar...

34
1 Building and Scaling a Malware Analysis System SEI Webinar © 2016 Carnegie Mellon University, Distribution Statement A: Approved for Public Release; Distribution is Unlimited Carnegie Mellon University Notice This video and all related information and materials (“materials”) are owned by Carnegie Mellon University. These materials are provided on an “as-is” “as available” basis without any warranties and solely for your personal viewing and use. You agree that Carnegie Mellon is not liable with respect to any materials received by you as a result of viewing the video, or using referenced websites, and/or for any consequences or the use by you of such materials. By viewing, downloading, and/or using this video and related materials, you agree that you have read and agree to our terms of use (www.sei.cmu.edu/legal/ ). Distribution Statement A: Approved for Public Release; Distribution is Unlimited © 2016 Carnegie Mellon University.

Upload: others

Post on 21-Jun-2020

2 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Carnegie Mellon University Notice · Building and Scaling a Malware Analysis System 1 SEI Webinar © 2016 Carnegie Mellon University, Distribution Statement A: Approved for Public

1Building and Scaling a Malware Analysis SystemSEI Webinar© 2016 Carnegie Mellon University, Distribution Statement A: Approved for Public Release; Distribution is Unlimited

Carnegie Mellon University NoticeThis video and all related information and materials (“materials”) are owned by Carnegie Mellon University. These materials are provided on an “as-is” “as available” basis without any warranties and solely for your personal viewing and use.

You agree that Carnegie Mellon is not liable with respect to any materials received by you as a result of viewing the video, or using referenced websites, and/or for any consequences or the use by you of such materials.

By viewing, downloading, and/or using this video and related materials, you agree that you have read and agree to our terms of use (www.sei.cmu.edu/legal/).

Distribution Statement A: Approved for Public Release; Distribution is Unlimited

© 2016 Carnegie Mellon University.

Page 2: Carnegie Mellon University Notice · Building and Scaling a Malware Analysis System 1 SEI Webinar © 2016 Carnegie Mellon University, Distribution Statement A: Approved for Public

2Building and Scaling a Malware Analysis SystemSEI Webinar© 2016 Carnegie Mellon University, Distribution Statement A: Approved for Public Release; Distribution is Unlimited

Copyright 2016 Carnegie Mellon University

This material is based upon work funded and supported by the Department of Defense under Contract No. FA8721-05-C-0003 with Carnegie Mellon University for the operation of the Software Engineering Institute, a federally funded research and development center.

Any opinions, findings and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the United States Department of Defense.

NO WARRANTY. THIS CARNEGIE MELLON UNIVERSITY AND SOFTWARE ENGINEERING INSTITUTE MATERIAL IS FURNISHED ON AN “AS-IS” BASIS. CARNEGIE MELLON UNIVERSITY MAKES NO WARRANTIES OF ANY KIND, EITHER EXPRESSED OR IMPLIED, AS TO ANY MATTER INCLUDING, BUT NOT LIMITED TO, WARRANTY OF FITNESS FOR PURPOSE OR MERCHANTABILITY, EXCLUSIVITY, OR RESULTS OBTAINED FROM USE OF THE MATERIAL. CARNEGIE MELLON UNIVERSITY DOES NOT MAKE ANY WARRANTY OF ANY KIND WITH RESPECT TO FREEDOM FROM PATENT, TRADEMARK, OR COPYRIGHT INFRINGEMENT.

[Distribution Statement A] This material has been approved for public release and unlimited distribution. Please see Copyright notice for non-US Government use and distribution.

This material may be reproduced in its entirety, without modification, and freely distributed in written or electronic form without requesting formal permission. Permission is required for any other use. Requests for permission should be directed to the Software Engineering Institute at [email protected].

CERT® is a registered mark of Carnegie Mellon University.

DM-0003991

Page 3: Carnegie Mellon University Notice · Building and Scaling a Malware Analysis System 1 SEI Webinar © 2016 Carnegie Mellon University, Distribution Statement A: Approved for Public

3Building and Scaling a Malware Analysis SystemSEI Webinar© 2016 Carnegie Mellon University, Distribution Statement A: Approved for Public Release; Distribution is Unlimited

Software Engineering InstituteCarnegie Mellon UniversityPittsburgh, PA 15213

Building and Scaling a Malware Analysis SystemSEI Webinar© 2016 Carnegie Mellon University, Distribution Statement A: Approved for Public Release; Distribution is Unlimited

Building and Scaling a Malware Analysis SystemBrent Frye

Page 4: Carnegie Mellon University Notice · Building and Scaling a Malware Analysis System 1 SEI Webinar © 2016 Carnegie Mellon University, Distribution Statement A: Approved for Public

4Building and Scaling a Malware Analysis SystemSEI Webinar© 2016 Carnegie Mellon University, Distribution Statement A: Approved for Public Release; Distribution is Unlimited

Malware is Malicious Software

“Malware, also known as malicious code and malicious software, refers to a program that is inserted into a system, usually covertly, with the intent of compromising the confidentiality, integrity, or availability of the victim’s data, applications, or operating system or otherwise annoying or disrupting the victim.”• NIST SP800-83

Page 5: Carnegie Mellon University Notice · Building and Scaling a Malware Analysis System 1 SEI Webinar © 2016 Carnegie Mellon University, Distribution Statement A: Approved for Public

5Building and Scaling a Malware Analysis SystemSEI Webinar© 2016 Carnegie Mellon University, Distribution Statement A: Approved for Public Release; Distribution is Unlimited

PE File Collection in Artifact Catalog, pre June 2005

• Less than 4000 PE files in collection• Manual collections• Manual analysis• Linear growth?

0

500

1000

1500

2000

2500

3000

3500

400020

0111

2002

0220

0205

2002

0820

0211

2003

0220

0305

2003

0820

0311

2004

0220

0405

2004

0820

0411

2005

0220

0505

Total PE Files

Page 6: Carnegie Mellon University Notice · Building and Scaling a Malware Analysis System 1 SEI Webinar © 2016 Carnegie Mellon University, Distribution Statement A: Approved for Public

6Building and Scaling a Malware Analysis SystemSEI Webinar© 2016 Carnegie Mellon University, Distribution Statement A: Approved for Public Release; Distribution is Unlimited

PE File Collection in Artifact Catalog

0

10000000

20000000

30000000

40000000

50000000

60000000

7000000020

0111

2002

0220

0205

2002

0820

0211

2003

0220

0305

2003

0820

0311

2004

0220

0405

2004

0820

0411

2005

0220

0505

2005

0820

0511

2006

0220

0605

2006

0820

0611

2007

0220

0705

2007

0820

0711

2008

0220

0805

2008

0820

0811

2009

0220

0905

2009

0820

0911

2010

0220

1005

2010

0820

1011

2011

0220

1105

2011

0820

1111

2012

0220

1205

2012

0820

1211

2013

0220

1305

2013

0820

1311

Total PE Files

Page 7: Carnegie Mellon University Notice · Building and Scaling a Malware Analysis System 1 SEI Webinar © 2016 Carnegie Mellon University, Distribution Statement A: Approved for Public

7Building and Scaling a Malware Analysis SystemSEI Webinar© 2016 Carnegie Mellon University, Distribution Statement A: Approved for Public Release; Distribution is Unlimited

PE File Collection in Artifact Catalog, log scale

1.00E+00

1.00E+01

1.00E+02

1.00E+03

1.00E+04

1.00E+05

1.00E+06

1.00E+07

1.00E+0820

0111

2002

0220

0205

2002

0820

0211

2003

0220

0305

2003

0820

0311

2004

0220

0405

2004

0820

0411

2005

0220

0505

2005

0820

0511

2006

0220

0605

2006

0820

0611

2007

0220

0705

2007

0820

0711

2008

0220

0805

2008

0820

0811

2009

0220

0905

2009

0820

0911

2010

0220

1005

2010

0820

1011

2011

0220

1105

2011

0820

1111

2012

0220

1205

2012

0820

1211

2013

0220

1305

2013

0820

1311

Total PE Files

Page 8: Carnegie Mellon University Notice · Building and Scaling a Malware Analysis System 1 SEI Webinar © 2016 Carnegie Mellon University, Distribution Statement A: Approved for Public

8Building and Scaling a Malware Analysis SystemSEI Webinar© 2016 Carnegie Mellon University, Distribution Statement A: Approved for Public Release; Distribution is Unlimited

Overview of a Malware Processing System

Acquisition

Storage

Processing

Indexing and Searching

Internet

Page 9: Carnegie Mellon University Notice · Building and Scaling a Malware Analysis System 1 SEI Webinar © 2016 Carnegie Mellon University, Distribution Statement A: Approved for Public

9Building and Scaling a Malware Analysis SystemSEI Webinar© 2016 Carnegie Mellon University, Distribution Statement A: Approved for Public Release; Distribution is Unlimited

Polling Question #1

How automated is your current malware collections and analysis processing?1) I don’t collect malware or perform malware analysis2) Manual analysis only; no queues or scheduling3) Partially automated collections and/or analysis, some manual effort4) Fully automated collections and analysis5) We outsource all malware analysis to another team or organization

Page 10: Carnegie Mellon University Notice · Building and Scaling a Malware Analysis System 1 SEI Webinar © 2016 Carnegie Mellon University, Distribution Statement A: Approved for Public

10Building and Scaling a Malware Analysis SystemSEI Webinar© 2016 Carnegie Mellon University, Distribution Statement A: Approved for Public Release; Distribution is Unlimited

Acquisition Methods

First, you need to determine how you will acquire malware samples:• Mine honeynets• Export AV quarantine• Host a submission form• Request email submissions• Malware hosting services

Service

Email

Honeypots

App

Quarantine

Page 11: Carnegie Mellon University Notice · Building and Scaling a Malware Analysis System 1 SEI Webinar © 2016 Carnegie Mellon University, Distribution Statement A: Approved for Public

11Building and Scaling a Malware Analysis SystemSEI Webinar© 2016 Carnegie Mellon University, Distribution Statement A: Approved for Public Release; Distribution is Unlimited

Policies Matter

Be mindful of relevant policies• Organizational policies

• Email services• Downloading files

• Hosting providers• Legal authorities

Page 12: Carnegie Mellon University Notice · Building and Scaling a Malware Analysis System 1 SEI Webinar © 2016 Carnegie Mellon University, Distribution Statement A: Approved for Public

12Building and Scaling a Malware Analysis SystemSEI Webinar© 2016 Carnegie Mellon University, Distribution Statement A: Approved for Public Release; Distribution is Unlimited

Site#3

Acquisition – Network and Storage Considerations

Scaling the network• Increase bandwidth at central

site• Multiple sites• Cloud services

Scaling storage• NAS/SAN• Distributed filestore• Cloud storage

Parallelize wherever possible to improve bandwidth utilization/throughput

Internet

Internet

Internet

Site#2Site#1

Page 13: Carnegie Mellon University Notice · Building and Scaling a Malware Analysis System 1 SEI Webinar © 2016 Carnegie Mellon University, Distribution Statement A: Approved for Public

13Building and Scaling a Malware Analysis SystemSEI Webinar© 2016 Carnegie Mellon University, Distribution Statement A: Approved for Public Release; Distribution is Unlimited

Machine Rooms

MachineRoom

Considerations

Floor Space

Page 14: Carnegie Mellon University Notice · Building and Scaling a Malware Analysis System 1 SEI Webinar © 2016 Carnegie Mellon University, Distribution Statement A: Approved for Public

14Building and Scaling a Malware Analysis SystemSEI Webinar© 2016 Carnegie Mellon University, Distribution Statement A: Approved for Public Release; Distribution is Unlimited

Consider User Access Methods

Direct Access• Users access database or

filestore directly• Possibly a variant of API

API• REST• JSON or similar format

Application• Command-line• Web UI

Application

API

Page 15: Carnegie Mellon University Notice · Building and Scaling a Malware Analysis System 1 SEI Webinar © 2016 Carnegie Mellon University, Distribution Statement A: Approved for Public

15Building and Scaling a Malware Analysis SystemSEI Webinar© 2016 Carnegie Mellon University, Distribution Statement A: Approved for Public Release; Distribution is Unlimited

Acquisition Metadata

Who?Who gave it? Who looked at it?

What?What is it?

When?When did we get it?

Where?service, website

How?email, web, service

How Many?Keep all? First from source? First?

If you don’t collect this information as part of the acquisition process, there is generally no way to re-acquire this metadata

Page 16: Carnegie Mellon University Notice · Building and Scaling a Malware Analysis System 1 SEI Webinar © 2016 Carnegie Mellon University, Distribution Statement A: Approved for Public

16Building and Scaling a Malware Analysis SystemSEI Webinar© 2016 Carnegie Mellon University, Distribution Statement A: Approved for Public Release; Distribution is Unlimited

Storage

Flat filesRDMS

• Postgresql• MySQL

NoSQL• CouchDB• MongoDB• Cassandra• Hadoop

Name files so they can be easy to locate and so different files won’t overwrite one another

Page 17: Carnegie Mellon University Notice · Building and Scaling a Malware Analysis System 1 SEI Webinar © 2016 Carnegie Mellon University, Distribution Statement A: Approved for Public

17Building and Scaling a Malware Analysis SystemSEI Webinar© 2016 Carnegie Mellon University, Distribution Statement A: Approved for Public Release; Distribution is Unlimited

Polling Question #2

How much malware does your organization collect (or soon hope to collect) each day?1) Less than 100 per day2) Up to 1000 per day3) Up to 10,000 per day4) Up to 100,000 per day5) Over 100,000 per day

Page 18: Carnegie Mellon University Notice · Building and Scaling a Malware Analysis System 1 SEI Webinar © 2016 Carnegie Mellon University, Distribution Statement A: Approved for Public

18Building and Scaling a Malware Analysis SystemSEI Webinar© 2016 Carnegie Mellon University, Distribution Statement A: Approved for Public Release; Distribution is Unlimited

Processing Malware

Surface Analysis

Reverse Engineering

Runtime Analysis

Unpack

Antivirus Scan

Static Analysis

Page 19: Carnegie Mellon University Notice · Building and Scaling a Malware Analysis System 1 SEI Webinar © 2016 Carnegie Mellon University, Distribution Statement A: Approved for Public

19Building and Scaling a Malware Analysis SystemSEI Webinar© 2016 Carnegie Mellon University, Distribution Statement A: Approved for Public Release; Distribution is Unlimited

Malware Analysis Goals

Before building a system, determine what you hope to achieve with the system.• This is not an enterprise protection system• If you just want indicators, you don’t need to reverse engineer everything

in the code

Page 20: Carnegie Mellon University Notice · Building and Scaling a Malware Analysis System 1 SEI Webinar © 2016 Carnegie Mellon University, Distribution Statement A: Approved for Public

20Building and Scaling a Malware Analysis SystemSEI Webinar© 2016 Carnegie Mellon University, Distribution Statement A: Approved for Public Release; Distribution is Unlimited

How Do We Get Here?

We can either begin our malware processing from the Acquisition step or the Storage step.Acquisition

• Push all new files onto a queue• Possibly the first thing the queue does is to insert the file into

datastoreStorage

• Store the data first• Query the datastore to determine “new” files

Page 21: Carnegie Mellon University Notice · Building and Scaling a Malware Analysis System 1 SEI Webinar © 2016 Carnegie Mellon University, Distribution Statement A: Approved for Public

21Building and Scaling a Malware Analysis SystemSEI Webinar© 2016 Carnegie Mellon University, Distribution Statement A: Approved for Public Release; Distribution is Unlimited

Malware Processing - Surface Analysis

What information do you want from the files that does not require running them?• Filesize, filetype, mimetype• Hashes: md5, sha1, sha256,

ssdeep

SURFACE

ANALYSIS

?

d14deadbeef…e2f00blahblahblah

Page 22: Carnegie Mellon University Notice · Building and Scaling a Malware Analysis System 1 SEI Webinar © 2016 Carnegie Mellon University, Distribution Statement A: Approved for Public

22Building and Scaling a Malware Analysis SystemSEI Webinar© 2016 Carnegie Mellon University, Distribution Statement A: Approved for Public Release; Distribution is Unlimited

Malware Processing – Antivirus Scan

Running each file through AV scanning can provide some information about the file• Use multiple AV engines• Names may not be accurate• Some files may not be found using AV

• Maybe false negative• Maybe the file isn’t malicious

• Beware of false positives

Page 23: Carnegie Mellon University Notice · Building and Scaling a Malware Analysis System 1 SEI Webinar © 2016 Carnegie Mellon University, Distribution Statement A: Approved for Public

23Building and Scaling a Malware Analysis SystemSEI Webinar© 2016 Carnegie Mellon University, Distribution Statement A: Approved for Public Release; Distribution is Unlimited

Malware Processing – Runtime Analysis

Execute the malware in a virtual environment and see what it does:• Network access• File create, update, delete• Mutexes• Service start, stop• Memory dump• Screenshots• Often allowed to run for 5 minutes, additional post-processing time

neededCERT Anexa, Cuckoo Sandbox, Cisco AMP Threat Grid

Page 24: Carnegie Mellon University Notice · Building and Scaling a Malware Analysis System 1 SEI Webinar © 2016 Carnegie Mellon University, Distribution Statement A: Approved for Public

24Building and Scaling a Malware Analysis SystemSEI Webinar© 2016 Carnegie Mellon University, Distribution Statement A: Approved for Public Release; Distribution is Unlimited

Typical Runtime Analysis Results

How will users access this data?

How much extra storage is needed?

Should we also process the dropped files?

• Any output of a malware run may be malicious

• May cause deep recursion loops

Analysis report (XML or JSON)PCAP fileDropped FilesFiles created or altered

Indicators of CompromiseIP addresses, hostnames, mutexes, filenames, registry entries, service changes

Log files

Page 25: Carnegie Mellon University Notice · Building and Scaling a Malware Analysis System 1 SEI Webinar © 2016 Carnegie Mellon University, Distribution Statement A: Approved for Public

25Building and Scaling a Malware Analysis SystemSEI Webinar© 2016 Carnegie Mellon University, Distribution Statement A: Approved for Public Release; Distribution is Unlimited

Malware Processing - Unpack

packer 1. Software that compresses other software. 2. A malicious tool that compresses and obfuscates software in order to defeat anti-reversing.* The MAL: A Malware Analysis Lexicon

If a malware sample is packed, then it needs to be unpacked to be able to reverse engineer or to obtain useful static analysis results.Non-malicious examples: gzip, pkzipCommon malware examples: UPX

Page 26: Carnegie Mellon University Notice · Building and Scaling a Malware Analysis System 1 SEI Webinar © 2016 Carnegie Mellon University, Distribution Statement A: Approved for Public

26Building and Scaling a Malware Analysis SystemSEI Webinar© 2016 Carnegie Mellon University, Distribution Statement A: Approved for Public Release; Distribution is Unlimited

Polling Question #3

How recent does malware analysis have to be in order to still be useful?1) I don’t have any use for malware analysis2) I need complete results in 30 minutes or less3) I need complete results in 48 hours or less4) As long as it’s done in a month or so, that’s OK with me5) It doesn’t matter how old the analysis is, I’ll still find a use for it

Page 27: Carnegie Mellon University Notice · Building and Scaling a Malware Analysis System 1 SEI Webinar © 2016 Carnegie Mellon University, Distribution Statement A: Approved for Public

27Building and Scaling a Malware Analysis SystemSEI Webinar© 2016 Carnegie Mellon University, Distribution Statement A: Approved for Public Release; Distribution is Unlimited

Malware Processing – Static Analysis

Process file without executing it• Strings• Objdump• PE file sections

• Section Name• Section Size• Section Hash• Other filetypes (e.g., PDF, Office) have internal sections as well

• Extract functions

Page 28: Carnegie Mellon University Notice · Building and Scaling a Malware Analysis System 1 SEI Webinar © 2016 Carnegie Mellon University, Distribution Statement A: Approved for Public

28Building and Scaling a Malware Analysis SystemSEI Webinar© 2016 Carnegie Mellon University, Distribution Statement A: Approved for Public Release; Distribution is Unlimited

Malware Processing – Reverse Engineering

Deconstruct malicious code to understand how the malware behaves on a system at a binary levelDissassemble and Debuggers• IDA Pro• OllyDbgMemory Dumper• OllyDumpEx• Volitility

Page 29: Carnegie Mellon University Notice · Building and Scaling a Malware Analysis System 1 SEI Webinar © 2016 Carnegie Mellon University, Distribution Statement A: Approved for Public

29Building and Scaling a Malware Analysis SystemSEI Webinar© 2016 Carnegie Mellon University, Distribution Statement A: Approved for Public Release; Distribution is Unlimited

Malware Processing Concept – Batching Files

Some processes have large overhead for startup and/or teardown

• AV Scanning• Unpacking

Rather than processing single files, put a group together and process the group

• Must be able to extract individual results

• Entire group may fail due to one bad file

ANALYSIS

{…}

BatchedFiles

{…}

Page 30: Carnegie Mellon University Notice · Building and Scaling a Malware Analysis System 1 SEI Webinar © 2016 Carnegie Mellon University, Distribution Statement A: Approved for Public

30Building and Scaling a Malware Analysis SystemSEI Webinar© 2016 Carnegie Mellon University, Distribution Statement A: Approved for Public Release; Distribution is Unlimited

MORGUEPulseless/Non-breathing

IMMEDIATELife threatening injury

DELAYEDSerious, Non Life threatening

MINORWalking wounded

Malware Processing Concept – Triage

Tri*age – verb – assign degrees of urgency to (wounded or ill patients)

Generally, this comes before time-consuming dynamic analysis• All malware is created differently,

some less different than others• For example, do we really need to

analyze every one of the 800,000+ “Allaple” samples we have?

• Run on all candidate samples to deprioritize (or skip entirely) known malware

Page 31: Carnegie Mellon University Notice · Building and Scaling a Malware Analysis System 1 SEI Webinar © 2016 Carnegie Mellon University, Distribution Statement A: Approved for Public

31Building and Scaling a Malware Analysis SystemSEI Webinar© 2016 Carnegie Mellon University, Distribution Statement A: Approved for Public Release; Distribution is Unlimited

Malware Processing Concept – Triage-2

What can be used for triage?• Results from static analysis• Searching for specific strings• Other information provided by

source or third partyTRIAGE

AV Results Strings Static Results

Filetype-specific

Malware Family-Specific Priority

Page 32: Carnegie Mellon University Notice · Building and Scaling a Malware Analysis System 1 SEI Webinar © 2016 Carnegie Mellon University, Distribution Statement A: Approved for Public

32Building and Scaling a Malware Analysis SystemSEI Webinar© 2016 Carnegie Mellon University, Distribution Statement A: Approved for Public Release; Distribution is Unlimited

Indexing and Searching

Specific file(s)Metadata (Acquisition, Surface Analysis)

• Few rows per record• Traditional RDBMS may be sufficient

Analysis Results• Possibly 1000s rows per record• Hadoop or other NoSQL may be better to handle variety of structures

Full-text search• Search possibly hundreds of terabytes• CERT BigGrep

Page 33: Carnegie Mellon University Notice · Building and Scaling a Malware Analysis System 1 SEI Webinar © 2016 Carnegie Mellon University, Distribution Statement A: Approved for Public

33Building and Scaling a Malware Analysis SystemSEI Webinar© 2016 Carnegie Mellon University, Distribution Statement A: Approved for Public Release; Distribution is Unlimited

Workflow

Surface Analysis

Reverse EngineeringRuntime Analysis

Unpack

Antivirus Scan

Static Analysis

Search System(s)

Triage

File Access

Page 34: Carnegie Mellon University Notice · Building and Scaling a Malware Analysis System 1 SEI Webinar © 2016 Carnegie Mellon University, Distribution Statement A: Approved for Public

34Building and Scaling a Malware Analysis SystemSEI Webinar© 2016 Carnegie Mellon University, Distribution Statement A: Approved for Public Release; Distribution is Unlimited

Questions?