corsello re paper spring 2009


Corsello Research Foundation

Software Tampering

The purpose, methods and potential safeguards to prevent the reverse engineering of software

michael.corsello

1/24/2009



Abstract

Computer security is implemented at many levels, from the physical network, to the physical machines, to the software in any device. Today most of the emphasis is placed on preventing malicious logic from ever getting into a device where it can do harm. Comparatively little effort goes into protecting software and systems from being directly hacked in the first place. Current operating system and software architectures are extremely vulnerable to exploitation via the manipulation of executable code. One main reason that actual exploits remain limited is the lack of understanding of how these exploits can be performed.



Michael Corsello Term Paper CSci 287 Computer Network Defense


Introduction

Software is arguably the most complex thing man has ever invented. Modern software applications can

be composed of many million lines of source code that are executing on processors running at several

gigahertz. This software performs the operations specified by the developers of the software, nothing

more, nothing less. Given this basic premise, it would seem that software should be perfectly safe in

that it should only be capable of operating as programmed. However, the underlying system, the

hardware and specifically the CPU only understands a primitive, basic set of instructions. This set of

instructions forms the instruction set architecture (ISA) of the CPU. These ISA instructions are quite

primitive operations such as add, subtract, multiply, divide, read, write, compare and jump. All

applications are developed as aggregations of this simpler form of instruction to form a set of

abstractions to perform what we know as an application.
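As a rough illustration of how higher-level behavior is aggregated from primitive operations, the sketch below runs a tiny made-up instruction set. It is not a real ISA: the opcode names, the `run` helper and the `SUM_TO_N` program are all assumptions for illustration. The program computes the sum 1 + 2 + ... + n using only move, add, subtract, compare and jump.

```python
# Toy register machine: a hypothetical instruction set, not any real ISA.
def run(program, registers):
    """Execute a list of (opcode, operands...) tuples until HALT."""
    pc = 0          # program counter: index of the next instruction
    zero = False    # flag set by CMP and read by JNZ
    while True:
        op, *args = program[pc]
        pc += 1
        if op == "MOV":            # MOV dst, constant
            registers[args[0]] = args[1]
        elif op == "ADD":          # ADD dst, src  (dst = dst + src)
            registers[args[0]] += registers[args[1]]
        elif op == "SUB":          # SUB dst, constant
            registers[args[0]] -= args[1]
        elif op == "CMP":          # CMP reg, constant (sets the zero flag)
            zero = registers[args[0]] == args[1]
        elif op == "JNZ":          # jump if the zero flag is clear
            if not zero:
                pc = args[0]
        elif op == "HALT":
            return registers

# High-level intent: total = sum(range(1, n + 1)), for n >= 1
SUM_TO_N = [
    ("MOV", "total", 0),    # 0
    ("ADD", "total", "n"),  # 1  loop body
    ("SUB", "n", 1),        # 2
    ("CMP", "n", 0),        # 3
    ("JNZ", 1),             # 4  repeat while n != 0
    ("HALT",),              # 5
]

result = run(SUM_TO_N, {"n": 4})
print(result["total"])
```

A one-line loop in a high-level language becomes several primitive steps plus a conditional jump, which is the level at which a CPU (and a person tampering with an executable) sees the program.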

Software applications are generally written today using general purpose programming languages that

are already highly abstracted from the underlying ISA of the machine. This abstraction provides a great

many benefits in that developers do not need to understand what the machine is actually doing at the

ISA level when they write this high-level code. Unfortunately, this also means that few developers ever

learn what a line of code in these high-level, general purpose languages actually is compiled into at that

lower level. This means that most developers will never understand what vulnerabilities they are

actually creating in their code.

Software Architecture

The architecture of a software application is based upon levels of architectures for the underlying

components an application will use. In this manner, any application is subject to any benefits and

limitations of the underlying architectures it will reside upon. At the lowest level, this is the ISA and overall hardware architecture of the platform the software will run upon. This is largely static and will

not be addressed in this paper. Even so, there are many places within the hardware architectures of

both computers and networks that could be re-designed to enhance capabilities, performance and

security.

Operating Systems

Above the hardware, all software is hosted by an operating system that directly runs upon the hardware

platform. This operating system provides a hardware abstraction layer (HAL) and core software based

services that all applications need. This is generally in the form of libraries and a primary process that

can initiate other user mode application processes (our applications). The operating system abstracts the interaction with hardware devices through the use of software drivers that the operating system

loads and manages. Interaction between the hardware devices and software (generally at the driver

level) is performed via interrupts that manage the synchronization of hardware operations and data

flows within the system.

The operating system kernel is the portion of the operating system that manages memory and interrupts and coordinates the operation of the system as a whole. The most important aspect of the kernel with respect to applications is that the kernel initiates and manages the creation of application processes and their memory allocations in coordination with the CPU.

Within an operating system, processes are started and managed to perform work. Each process can be started on behalf of a user (user mode) or some level of the operating system itself (system mode or kernel mode). In general, privilege levels can be divided into rings from level 0 to some upper bound level. The level 0 ring is the operating system kernel itself and must be the most secured area from intercession. Any exploit at this ring can be completely catastrophic to the system, as code at this level runs without restriction. Higher-numbered rings are trusted less and are therefore granted fewer capabilities. Where intermediate rings are used, drivers operate in the rings between the kernel and user applications. The number of rings varies by architecture and operating system: x86 defines four (0 through 3), and mainstream operating systems such as Windows and Linux use only ring 0 for kernel code and ring 3 for user code.

It is the operating system and these rings of trust that eventually open up into the user mode

applications. Any poorly written or vulnerable code at the lower numbered rings will affect every

application above that level even if it does not directly use the vulnerable code. It is for this very reason

that system mode code bases must be evaluated and should always be signed to prevent or at least limit

tampering.

Programming Languages and Libraries

The user mode applications we use to perform our work are still subject to any underlying vulnerabilities

in the operating system. Additionally, our applications generally use third party libraries that provide

some set of abstracted functionality. Each of these libraries may contain vulnerabilities that may be

exploited. Further, our applications written in a high-level language must be compiled into some

executable format that can be run within the operating system. This compilation process may produce

vulnerable code.

Each high-level language has a core set of keywords and operators that are recognized in textual form

that can then be mapped into a lower level set of instructions. In native code languages (such as

assembly, C, C++, etc) the source code is compiled directly into a machine language that can be run on

the host hardware and leverage the operating system provided services. In byte-code compiled

languages (such as Java, the .NET languages, Python, etc), the source code is compiled into some intermediate format that cannot be run directly on the hardware, but is instead either dynamically compiled to native machine code by a just-in-time (JIT) compiler or executed by a byte-code interpreter. These languages all provide a form of protection to the underlying system in that their code cannot be run directly on the hardware platform without the intercession of a virtual machine or interpreter. Due to the high-level nature and inherent safety added in byte-code languages, many consider native code languages dangerous and recommend that they be used only for lower level rings, such as operating system and driver development.
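Python is one of the byte-code languages named above, so its standard `dis` module gives a quick way to see the intermediate format for a single line of source. The function `scale` below is a made-up example, and the exact instruction names printed depend on the Python version in use.

```python
import dis

def scale(value, factor):
    return value * factor + 1

# Each entry is one byte-code instruction for the Python virtual machine,
# not a native CPU instruction.
opnames = [ins.opname for ins in dis.get_instructions(scale)]
print(opnames)
```

Even this one-line function expands into several virtual machine instructions (loads, arithmetic operations and a return), none of which can run on the hardware without the interpreter.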

Applications

Applications are in general loaded from some permanent storage device (e.g. a disk). A user's access to the storage medium on which applications are held is the primary vehicle for tampering with applications, both directly by users and by malicious applications that mutate the application executable files. The code that is run on a computer is stored in a file with a specific format that the runtime (for byte-code languages) or the operating system (for native-code languages) can execute. As an example, the Microsoft Windows operating system on the Intel-based 32-bit x86 hardware platform uses the portable executable (PE) file format.

Portable Executable Format

The PE file format is used for executables, object code, type libraries, dynamic link libraries (DLLs), static

libraries and drivers in both 32 and 64 bit versions of Microsoft Windows operating systems. The PE

format was derived from the UNIX COFF file format originally and is occasionally still known as the

PE/COFF format.

The file is laid out with multiple headers and sections that define the memory mapping strategy taken by the operating system when loading the file. The PE file is loaded into memory with static offsets to code addresses relative to a base address. This means execution takes fewer dynamic address resolution steps, but it also makes hacking easier, as all function entry points are at fixed offsets from this base address. For a DLL, the load process is based upon a preferred base address, which, if available, allows a single loading of the library to serve all processes using this library. If the preferred address is unavailable, then the library is loaded at an available base address and becomes process specific and cannot be shared between processes. While sharing can increase overall memory efficiency and make inter-process communication easier, it also adds complexity to the overall architecture and provides additional points for external attack.
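The relationship between the base address and fixed offsets can be sketched by reading two header fields. The code below builds a synthetic, non-runnable PE32 header in memory and then parses it. The field offsets follow the published PE/COFF layout, but the helper names `build_minimal_pe32` and `parse_pe32` and the example values (image base 0x00400000, entry offset 0x1000) are assumptions for illustration, not taken from any file discussed in this paper.

```python
import struct

def build_minimal_pe32(image_base, entry_rva):
    """Build a synthetic, non-runnable PE32 header for demonstration only."""
    dos = bytearray(64)
    dos[0:2] = b"MZ"                         # DOS magic
    struct.pack_into("<I", dos, 0x3C, 64)    # e_lfanew: file offset of "PE\0\0"
    coff = struct.pack("<HHIIIHH",
                       0x014C,   # Machine: Intel 386
                       0,        # NumberOfSections
                       0, 0, 0,  # TimeDateStamp, symbol table pointer and count
                       96,       # SizeOfOptionalHeader (no data directories)
                       0x0102)   # Characteristics: executable image, 32-bit machine
    optional = bytearray(96)
    struct.pack_into("<H", optional, 0, 0x10B)        # Magic: PE32
    struct.pack_into("<I", optional, 16, entry_rva)   # AddressOfEntryPoint
    struct.pack_into("<I", optional, 28, image_base)  # ImageBase
    return bytes(dos) + b"PE\0\0" + coff + bytes(optional)

def parse_pe32(data):
    """Return (image_base, entry_rva) from a PE32 image."""
    if data[0:2] != b"MZ":
        raise ValueError("missing DOS header")
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
    if data[pe_offset:pe_offset + 4] != b"PE\0\0":
        raise ValueError("missing PE signature")
    opt = pe_offset + 4 + 20   # optional header follows the 20-byte COFF header
    (magic,) = struct.unpack_from("<H", data, opt)
    if magic != 0x10B:
        raise ValueError("not a PE32 image")
    (entry_rva,) = struct.unpack_from("<I", data, opt + 16)
    (image_base,) = struct.unpack_from("<I", data, opt + 28)
    return image_base, entry_rva

image = build_minimal_pe32(image_base=0x00400000, entry_rva=0x1000)
base, rva = parse_pe32(image)
print(hex(base + rva))   # in-memory address = base address + fixed offset
```

Because the offset is stored in the file and the base address is usually predictable, anyone who can read the file can compute where each piece of code will sit in memory.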

For the newer dynamic runtime provided by .NET (and Mono under Linux), Microsoft wanted executable

compatibility. Therefore, all .NET executables are actually PE files with an additional common language

runtime (CLR) header and data sections. When run, the .NET PE file will bootstrap and hand over

execution to the CLR libraries which read these sections and transition to execution of the contained CLR

managed byte-code.

Due to the nature of any executable, there are sections of code that can be bypassed or altered without compromising the overall functionality of the application. These forms of manipulation are collectively known as tampering and are generally performed via the process of reverse engineering.

Details of the PE file format can be downloaded from several sites, including directly from Microsoft at: http://download.microsoft.com/download/e/b/a/eba1050f-a31d-436b-9281-92cdfeae4b45/pecoff.doc

Reverse Engineering

Reverse engineering is the process of analyzing an existing product and working backwards to

determine how it functions, what components it is made of, and ultimately how to circumvent or

replicate the product in question. In software, reverse engineering comes in several forms and is a specialized subset of the larger category of hacking. In many cases, the act of reverse engineering software is actually legal and is protected in a limited scope under the fair use provisions of U.S. copyright law. Unfortunately, this provides enough freedom that the act of reverse engineering software to create malicious products such as spyware, viruses and worms is generally legal. It is generally only the use of such malicious software that is illegal.

Cracking

Cracking is a primary reverse engineering activity in software whose aim is to bypass protection mechanisms. Examples of cracking include altering installers to not ask for a serial number or key code, altering software applications to bypass activation (primarily Microsoft products, such as Office 2003/2007), removing trial software expiration and checking routines, and removing nag screens.

The art of cracking software is quite widespread, and many active "crackerz" (note the use of the "z") take a great deal of pride in their ability to defeat any software protection or licensing scheme. In many cases, a new software protection scheme will be cracked and an active patch will be posted to file sharing sites within days of the protected software release. In the U.S., cracking is illegal under the Digital Millennium Copyright Act (DMCA), which prohibits circumventing technological measures that protect copyrighted works.

Tamper Resistance

The addition of mechanisms to a product to increase the difficulty of reverse engineering is known as tamper resistance. In software, tamper resistance mechanisms come in many forms, the most common of which is the code obfuscator. Unfortunately, for compiled native code, these tools often obfuscate only the source code and object code, and a disassembler can still generate quite tractable assembly code, which can be tampered with and re-assembled into a new executable.

The discipline of trusted computing is the set of activities that provide protection from tampering with

software (and hardware) at various levels from high-level user mode application code (via mechanisms

such as digital signatures of code by third party code reviewers) to the operating system architecture

itself in limiting access to resources thus preventing the possibility of tampering with applications.

Mechanisms used in tamper resistance for software include:

Digital signatures (digests) of executable files

Increasing cyclomatic complexity of code

Obfuscation of source code

Dynamic code injection (polymorphic executables)

Installer encryption (executables stored as encrypted and decrypted into memory only)

Address randomization (virtual addresses and lookups)
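The first mechanism in the list, a digest of the executable, can be sketched with the standard `hashlib` module. The byte strings below are stand-ins rather than a real program, and the helper names `file_digest` and `is_untampered` are hypothetical. Note that a bare digest only helps if the recorded value is stored where an attacker cannot also change it; a full digital signature additionally signs the digest with a private key, which is not shown here.

```python
import hashlib
import hmac

def file_digest(data: bytes) -> str:
    """SHA-256 digest of the executable's bytes, as a hex string."""
    return hashlib.sha256(data).hexdigest()

def is_untampered(data: bytes, expected_digest: str) -> bool:
    """Compare the current digest against the one recorded at release time."""
    return hmac.compare_digest(file_digest(data), expected_digest)

original = b"\x55\x8b\xec\x75\x05\xc3"          # stand-in for executable bytes
recorded = file_digest(original)                # recorded when the file was trusted
patched = original.replace(b"\x75", b"\x74")    # a single changed byte

print(is_untampered(original, recorded), is_untampered(patched, recorded))
```

Changing even one byte produces a different digest, so this kind of check detects the in-place edits described later in this paper, provided the check itself is not patched out.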

Disassemblers

A primary means of tampering with an executable is to disassemble that executable into an assembly language file. Assembly language is a low-level programming language that has a one-to-one correspondence with the public ISA of the hardware architecture. Due to this correspondence, there is no way to prevent the disassembly of an executable file if it is accessible.
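The one-to-one correspondence can be sketched with a table-driven decoder. The example below recognizes only six x86 opcodes (NOP, RET, INT3 and the short forms of JZ, JNZ and JMP) and prints everything else as raw data, so it is far from a real disassembler. The function name `disassemble` and the sample bytes are assumptions for illustration.

```python
# A handful of x86 opcodes; every other byte is shown as raw data (DB).
NO_OPERAND = {0x90: "NOP", 0xC3: "RET", 0xCC: "INT3"}
REL8_JUMPS = {0x74: "JZ", 0x75: "JNZ", 0xEB: "JMP"}

def disassemble(code: bytes, base: int = 0):
    """Return one text line per decoded instruction."""
    lines = []
    i = 0
    while i < len(code):
        opcode = code[i]
        if opcode in NO_OPERAND:
            lines.append(f"{base + i:08X}  {NO_OPERAND[opcode]}")
            i += 1
        elif opcode in REL8_JUMPS and i + 1 < len(code):
            # rel8 is a signed displacement from the end of the 2-byte instruction
            rel = int.from_bytes(code[i + 1:i + 2], "little", signed=True)
            target = base + i + 2 + rel
            lines.append(f"{base + i:08X}  {REL8_JUMPS[opcode]} {target:08X}")
            i += 2
        else:
            lines.append(f"{base + i:08X}  DB {opcode:02X}")
            i += 1
    return lines

listing = disassemble(bytes([0x90, 0x75, 0x02, 0xCC, 0xCC, 0xC3]), base=0x00401000)
print("\n".join(listing))
```

Since each byte pattern maps to exactly one mnemonic, the mapping can always be run in reverse, which is why an accessible executable can always be disassembled.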

There are several free and commercial disassembler applications available. The two used in this paper are:

OllyDbg (http://www.ollydbg.de/)

IDA Pro (http://www.hex-rays.com/idapro/)

OllyDbg is a fully free application available for download, whereas IDA Pro is a commercial product that also has a publicly available free version.

Tampering With a PE File

In order to tamper with a PE file, the file is opened in a disassembler, in this case OllyDbg. The executable we are tampering with in this example is available at the Hacker Challenge website at: http://www.dareyourmind.net/Challenge/Variable.rar. Once unpacked and executed, the program presents a challenge to the user:

We are to crack this program and change the value of some variable "vari" to the integer value 1302. Once we believe we have succeeded in this task, we should be able to run the program, type "check" and be greeted with success.

If we make no changes to the program and enter "check", we see the following:

As a first action, we will open the program in OllyDbg and see the program as the CPU does. The OllyDbg application shows all aspects of the program and allows us to step through the program statement by statement and see how the registers change and which line of assembly will be executed next.

We can also examine the executable file itself in a hex format:





There are multiple means of solving this problem: we can search for the value presented (31333031), which we will not find (it may be computed and never be mentioned in code); we can search for the label "vari" (which we also will not find); or we can search for the solution.

In this case, we know that there will be a check comparison that will verify our value is equal to 1302.

We can look for the message we see in the original program and backtrack from there:

Now, we can track to where this (004015D4) is called from:

We can now clearly see that we are close by noticing the ASCII text "You got it!! The password is" two lines below our call to the existing wrong answer.

Looking at the call to our output message, we can see the JNZ instruction, which is a jump if not zero. Looking one line above this statement we see a CMP (compare) statement which compares the variable at location 441000 to the static value 516 (HEX), which is our goal value of 1302 (DEC). Therefore, we can crack this application in one of two ways: either change the static value to the current value (change 516 to 1DE1AA7, which is HEX for 31333031) or change the value stored at 441000 to 516.
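The conversions in this step can be checked directly. The short sketch below also shows how the goal value would appear as a little-endian 32-bit quantity in memory, assuming the variable is stored as a 32-bit integer (the paper does not state the storage width). As a side observation that the paper itself does not make, the digits 31333031 read as four hexadecimal bytes are the ASCII characters "1301".

```python
import struct

goal = 0x516
print(goal)                            # the compared constant in decimal
print(hex(31333031))                   # 31333031 read as a decimal integer
print(bytes.fromhex("31333031"))       # the same digits read as four hex bytes
print(struct.pack("<I", 1302).hex())   # 1302 as little-endian 32-bit bytes
```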

Looking at the current value in address 441000 we see:

Therefore, the current value of vari is actually 1301. So, we change this to 1302 and re-assemble this

file to an executable and run.
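The edit itself amounts to overwriting a few bytes in place. A minimal sketch follows, assuming the variable is a little-endian 32-bit integer. The buffer is a synthetic stand-in for a data section, offset 8 is arbitrary (the real file offset of the variable is not given in this paper), and `patch_u32` is a hypothetical helper name.

```python
import struct

def patch_u32(image: bytearray, offset: int, expected: int, new: int) -> None:
    """Overwrite a little-endian 32-bit value in place, only if it holds `expected`."""
    (current,) = struct.unpack_from("<I", image, offset)
    if current != expected:
        raise ValueError(f"offset {offset:#x} holds {current}, not {expected}")
    struct.pack_into("<I", image, offset, new)

# Synthetic data section: 8 filler bytes, the value 1301, then 4 filler bytes.
data_section = bytearray(b"\x00" * 8 + struct.pack("<I", 1301) + b"\x00" * 4)
patch_u32(data_section, 8, expected=1301, new=1302)
print(struct.unpack_from("<I", data_section, 8)[0])
```

Checking the expected value before writing guards against patching the wrong offset. The size of the buffer is unchanged, so no other address in the file moves.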





Then, after re-assembly and running, we see the same prompt:

But now when we enter check we see a new result:

This indicates that we successfully cracked this application.

General Concepts

When looking at a PE file there are several aspects that can be tampered with without impacting any addresses. First, any string variables can be edited directly, with no impact of any kind, as long as their lengths are not changed. Second, numeric constants can be edited directly, with no impact, as long as their length is maintained. Third, embedded icons can be used as free space within the file. In general, many PE files will have multiple icons embedded within the file. The first icon can be used as free space in which to place executable code, as long as the code is smaller than the total space of the icon. Then all references to the location of that first icon can be re-mapped to a different icon within the file with no impact to the PE. This is a means of embedding malicious code in that icon area of the file at a known address (the same address the icon had). The end of that malicious code can then call back to the origin point to continue execution.
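The first rule, that a string can be edited freely as long as its length is preserved, can be sketched as a guarded in-place replacement. The bytes below are made up and not taken from any real file, and `replace_same_length` is a hypothetical helper name.

```python
def replace_same_length(image: bytes, old: bytes, new: bytes) -> bytes:
    """Replace the first occurrence of `old` with `new`, refusing any size change."""
    if len(new) != len(old):
        raise ValueError("replacement must be exactly as long as the original")
    index = image.find(old)
    if index < 0:
        raise ValueError("original string not found")
    return image[:index] + new + image[index + len(old):]

blob = b"\x01\x02Game\x00Help\x00\x03"   # made-up bytes with two embedded strings
edited = replace_same_length(blob, b"Game", b"Play")
print(edited)
```

Because every byte after the edited string stays at the same offset, no header field, jump target or resource reference needs to be updated.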

Figure 1. Original Program Structure





Figure 2. Altered Program Structure

Example

The winmine.exe PE file is the Minesweeper game in Windows operating systems. Opening the PE in

OllyDbg, the text labels for menu items and related content are easy to see.

Edits made to these text entries within this PE file are displayed directly in the user interface of the application as shown below:





Further, the winmine.exe file contains several icons for use in the game. These icons can also be clearly

seen in OllyDbg:

Altering these icon areas can yield graphical updates to the application as shown below:





Or, more maliciously, some icons can be replaced with malicious code and simply be remapped to other

icons that are graphically similar.

Abstract Architecture For a Secure OS

A mechanism for creating a more secure operating system involves several aspects:

Increasing isolation from the hardware

Increasing isolation from the operating system

Reducing application access to resources

Reducing access to application files

Virtualization of resources

Use of virtual machines for all process hosting

High process isolation

Virtualization of operating environment

A basic architecture for accomplishing this is to build upon a concept already present within UNIX operating systems: there are resources that are unavailable to user mode operations. A prime example of this is the swap partition. The operating system handles all I/O directly, with no application access to the partition. To applications, it in general appears not to exist. The same concept can be used to secure an operating system: make all unneeded resources invisible to the application.

Memory

In this OS architecture, memory is abstracted to a pool much like in existing OSes. Each application gets

an isolated view of memory and therefore owns that memory image. However, this memory image is

fragmented into frames that each have base addresses that cannot be spanned. In this manner, OS level

resources and shared libraries are loaded into fixed, read-only frames that are thereby safe for use.

Note that these are read-only, and therefore do not contain application editable data. In the context of

.NET, this is similar to the use of Application Domains.





Disk

The security of mass storage devices will continue to be an issue even in this architecture. The primary

additional security in this architecture is the isolation of executable code from the user mode accessible

storage devices. In this architecture, the physical non-removable storage is virtualized to slices that

are characterized by basic performance metrics indicating relative performance for read, write,

overwrite operations of the underlying physical hardware. A virtual slice may span any number of

physical devices in a manner similar to that of current RAID solutions. The underlying physical media

may still be on a RAID array or SAN/NAS devices.

The storage slices are used to form mount points that are of one of the following classes:

System (Swap and OS are such volumes)

Protected (Drivers and applications are on such a volume)

Limited (Configuration data is on such a volume)

User (Data is on such a volume)

A system volume is completely invisible to the system outside of the ring 0 operating system code base.

The volume is fully accessible to the OS alone.

A protected volume is read-only accessible at a virtual mount point. A protected volume is essentially

hierarchical with the root level of access being defined by the owner of the virtual root. For example

company ABC would have a virtual ABC root that is defined securely. Within that root, all programs

produced by company ABC would reside. ABC has the sole control over the structure, content and

accessibility within that virtual root. All files are read-only except by the OS which performs single

writes (installs) and single deletes (uninstalls) to this volume.

The limited volumes are read/write accessible and built upon the same structure as the protected volume. This is the primary type of volume in which a program may store configuration parameters for use

by the application. Also, each user has a mirrored structure for user-level profile parameters. This

volume is read/write to the program that owns the root and is not visible in user mode to any user.

Overall, the system, protected and limited class volumes within a system are not visible to the user at all

in any form or fashion. For a user to be able to execute an application that resides in these volumes, the

application is registered with the OS and placed into the volume by the OS. The OS exposes a virtual link

that permits the executable to be run, but it is only runnable and in no way readable as a file.

The final class of volume, the user volume, is a semi-traditional full access user mode storage location. Since the volume is data only, there are no files in these volumes that the OS will recognize as runnable. In this manner, downloaded software cannot be run until it is installed into the system. Therefore, any tampering with files within the user volumes will not affect the applications in a traditional sense (although scripting or macro files that are interpreted from a data file by an installed program remain a vulnerable vector).
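The four volume classes described above can be summarized as an access table. The sketch below is one possible reading of the text, not a specification: the actor names (`os`, `owner_program`, `user`), the operation names and the `is_allowed` helper are all hypothetical, and only permissions the text states are listed.

```python
from enum import Enum

class Volume(Enum):
    SYSTEM = "system"
    PROTECTED = "protected"
    LIMITED = "limited"
    USER = "user"

# Operations each actor may perform on each volume class (one reading of the text).
POLICY = {
    # Fully accessible to the OS alone; invisible to everything else.
    Volume.SYSTEM: {"os": {"read", "write"}},
    # Read-only except for OS installs and uninstalls; a user can only run
    # an application through the virtual link the OS exposes.
    Volume.PROTECTED: {"os": {"read", "install", "uninstall"},
                       "owner_program": {"read"},
                       "user": {"execute"}},
    # Read/write for the program that owns the root; not visible to users.
    Volume.LIMITED: {"owner_program": {"read", "write"}},
    # Full access user storage, but data only: nothing here is runnable.
    Volume.USER: {"user": {"read", "write"}},
}

def is_allowed(volume: Volume, actor: str, operation: str) -> bool:
    """Anything not explicitly granted is denied."""
    return operation in POLICY[volume].get(actor, set())

print(is_allowed(Volume.PROTECTED, "user", "execute"))
print(is_allowed(Volume.PROTECTED, "user", "read"))
```

The default-deny lookup mirrors the central idea of the architecture: a resource that is not explicitly exposed to an actor behaves as if it does not exist.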

The primary difference between the user volume in this OS architecture and a traditional data volume is that the user volume is further sliced by user to be virtualized as multiple overlapping volumes. Each user gets a custom perspective view of the logically virtualized volumes based upon their access rights. The security in this volume is a combination of role-based (RBAC), policy-based (PBAC), mandatory (MAC) and user-discretionary controls. The algorithms used to dynamically merge the volumes for performance are a key distinction from existing OSes.

Devices

All hardware devices that are installed on a system must be made available to users. The control of these devices, their drivers and access constraints is a key area for both security and system attack. This OS architecture will protect against external vectors via a combination of device locking (at the kernel level) and disk-level access controls (preventing driver installation) that govern the addition of new devices. Additionally, keyed devices can be optionally allowed or disallowed by user, by device type and by device id via a security device labeling engine (a form of MAC). Because of the disk access controls, installing any driver requires at minimum a soft restart of the OS.

Applications

All software must be installed into the protected and limited volumes (executables/libraries and configurations respectively) to be made available to users. The installation process for all programs and drivers will require a physical acknowledgement from a user with appropriate access to execute an installation package. The installer itself only places the content in a virtual volume to be copied and installed upon restart of the OS. In this manner, the installation process is two-phase and requires full acknowledgement. Upon restart, the OS identifies this virtual volume, installs the verified content to the secured volumes and makes the application available to the users. An OS soft restart does not require a full system (hardware) reboot, only an environment unload and reload, much like a current virtual machine restart on virtualized systems.

This installation process permits several actions during installation:

Each file can be checked against a digital signature and verified online if OS policy is enabled to do so

Files are copied into virtual space where they can be scanned and checked by a registered security application granted access to do so

The multi-stage process thwarts many malicious code practices by not permitting automatic start of the malicious code
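The two-phase installation described above can be sketched as a stage step followed by a verify-and-commit step at soft restart. This is a simplified model, not an implementation of the proposed OS: the `Installer` class and its method names are hypothetical, dictionaries stand in for the virtual and protected volumes, and a SHA-256 digest stands in for full digital signature verification.

```python
import hashlib

class Installer:
    """Stage-then-commit install sketch; all names here are hypothetical."""

    def __init__(self):
        self.staging = {}     # virtual volume: name -> (content, expected digest)
        self.protected = {}   # secured volume: name -> content

    def stage(self, name, content, expected_sha256, user_acknowledged):
        # Phase 1: requires explicit acknowledgement; nothing is runnable yet.
        if not user_acknowledged:
            raise PermissionError("installation requires user acknowledgement")
        self.staging[name] = (content, expected_sha256)

    def soft_restart(self):
        # Phase 2: verify each staged file and install only verified content.
        rejected = []
        for name, (content, expected) in self.staging.items():
            if hashlib.sha256(content).hexdigest() == expected:
                self.protected[name] = content
            else:
                rejected.append(name)
        self.staging.clear()
        return rejected

installer = Installer()
good = b"trusted program"
installer.stage("app.exe", good, hashlib.sha256(good).hexdigest(), True)
installer.stage("bad.exe", b"tampered", hashlib.sha256(b"original").hexdigest(), True)
rejected = installer.soft_restart()
print(rejected, list(installer.protected))
```

Staged content never runs before verification, so code that relies on starting itself automatically during installation gets no opportunity to execute.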





Conclusions

Current software architectures are inherently insecure by the nature of the applications themselves and

their underlying system architectures. To build truly secure applications that perform well, a new

approach must be taken. The operating system itself is a key target for redesign that can ameliorate

many problems including network level exploitations by reducing runtime access to resources.

References

Bishop, M. (2003). Computer Security. Boston: Addison Wesley.

Wikipedia contributors. (2009, April 4). Operating system. Retrieved April 4, 2009, from Wikipedia, The Free Encyclopedia: http://en.wikipedia.org/w/index.php?title=Operating_system&oldid=281660249

Wikipedia contributors. (2009, April 3). Portable Executable. Retrieved April 4, 2009, from Wikipedia, The Free Encyclopedia: http://en.wikipedia.org/w/index.php?title=Portable_Executable&oldid=281445980

Wikipedia contributors. (2009, April 1). Reverse engineering. Retrieved April 4, 2009, from Wikipedia, The Free Encyclopedia: http://en.wikipedia.org/w/index.php?title=Reverse_engineering&oldid=281088125

Wikipedia contributors. (2009, March 25). Software cracking. Retrieved April 4, 2009, from Wikipedia, The Free Encyclopedia: http://en.wikipedia.org/w/index.php?title=Software_cracking&oldid=279631247

Wikipedia contributors. (2009, March 11). Tamper resistance. Retrieved April 4, 2009, from Wikipedia, The Free Encyclopedia: http://en.wikipedia.org/w/index.php?title=Tamper_resistance&oldid=276490869
