
    Corsello Research Foundation

    Software Tampering

    The purpose, methods and potential safeguards to prevent the reverse engineering of software

    michael.corsello

    1/24/2009

    Abstract

    Computer security is implemented at many levels, from the physical network, to the physical machines, to software in any device. Today we place most of our emphasis on preventing malicious logic from ever getting into a device where it can do harm. Little effort is put into protecting software and systems from being directly hacked in the first place. Current operating system and software architectures are extremely vulnerable to exploitation via the manipulation of executable code. One main reason for the limited nature of actual exploits is the lack of understanding of how these exploits can be performed.

    Michael Corsello Term Paper CSci 287 Computer Network Defense

    Introduction

    Software is arguably the most complex thing man has ever invented. Modern software applications can

    be composed of many million lines of source code that are executing on processors running at several

    gigahertz. This software performs the operations specified by the developers of the software, nothing

    more, nothing less. Given this basic premise, it would seem that software should be perfectly safe in

    that it should only be capable of operating as programmed. However, the underlying system, the

    hardware and specifically the CPU only understands a primitive, basic set of instructions. This set of

    instructions forms the instruction set architecture (ISA) of the CPU. These ISA instructions are quite

    primitive operations such as add, subtract, multiply, divide, read, write, compare and jump. All

    applications are developed as aggregations of this simpler form of instruction to form a set of

    abstractions to perform what we know as an application.

    Software applications are generally written today using general purpose programming languages that

    are already highly abstracted from the underlying ISA of the machine. This abstraction provides a great

    many benefits in that developers do not need to understand what the machine is actually doing at the

    ISA level when they write this high-level code. Unfortunately, this also means that few developers ever

    learn what a line of code in these high-level, general purpose languages actually is compiled into at that

    lower level. This means that most developers will never understand what vulnerabilities they are

    actually creating in their code.
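To make this gap concrete, the sketch below uses Python's standard `dis` module to list the lower-level instructions behind a single high-level line. Python byte-code is not a CPU ISA, so this is only an analogy for what a native compiler does, and the function name `scale` is a hypothetical example.

```python
import dis


def scale(a, b):
    # One high-level line of arithmetic.
    return a * b + 1


# Collect the names of the byte-code instructions generated for scale().
ops = [ins.opname for ins in dis.get_instructions(scale)]

if __name__ == "__main__":
    for name in ops:
        print(name)
```

The exact instruction names vary between Python versions, but the single source line always expands into several loads, arithmetic operations and a return.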

    Software Architecture

    The architecture of a software application is based upon levels of architectures for the underlying

    components an application will use. In this manner, any application is subject to any benefits and

    limitations of the underlying architectures it will reside upon. At the lowest level, this is the ISA and overall hardware architecture of the platform the software will run upon. This is largely static and will

    not be addressed in this paper. Even so, there are many places within the hardware architectures of

    both computers and networks that could be re-designed to enhance capabilities, performance and

    security.

    Operating Systems

    Above the hardware, all software is hosted by an operating system that directly runs upon the hardware

    platform. This operating system provides a hardware abstraction layer (HAL) and core software based

    services that all applications need. This is generally in the form of libraries and a primary process that

    can initiate other user mode application processes (our applications). The operating system abstracts the interaction with hardware devices through the use of software drivers that the operating system

    loads and manages. Interaction between the hardware devices and software (generally at the driver

    level) is performed via interrupts that manage the synchronization of hardware operations and data

    flows within the system.

    The operating system kernel is the portion of the operating system that manages the memory and

    interrupts and overall coordinates the operation of the system as a whole. The most important aspect

    of the kernel with respect to applications is that the kernel initiates and manages the creation of

    application processes and their memory allocations in coordination with the CPU.

    Within an operating system, processes are started and managed to perform work. Each process can be

    started on the behalf of a user (user mode) or some level of the operating system itself (system mode or

    kernel mode). In general, the system mode processes can be divided into rings from level 0 to some upper bound level. The level 0 ring is the operating system kernel itself and must be the most secured area from intercession. Any exploit at this ring can be completely catastrophic to the system, as code at this level runs with no further restriction. In the higher numbered rings, code is less trusted and is therefore granted fewer privileges. On x86 hardware, which defines four rings (0 through 3), rings 1 and 2 were intended for drivers and ring 3 for user mode, although most current operating systems use only rings 0 and 3 and run drivers in ring 0 (the number of rings varies by architecture and operating system).

    It is the operating system and these rings of trust that eventually open up into the user mode

    applications. Any poorly written or vulnerable code at the lower numbered rings will affect every

    application above that level even if it does not directly use the vulnerable code. It is for this very reason

    that system mode code bases must be evaluated and should always be signed to prevent or at least limit

    tampering.

    Programming Languages and Libraries

    The user mode applications we use to perform our work are still subject to any underlying vulnerabilities

    in the operating system. Additionally, our applications generally use third party libraries that provide

    some set of abstracted functionality. Each of these libraries may contain vulnerabilities that may be

    exploited. Further, our applications written in a high-level language must be compiled into some

    executable format that can be run within the operating system. This compilation process may produce

    vulnerable code.

    Each high-level language has a core set of keywords and operators that are recognized in textual form

    that can then be mapped into a lower level set of instructions. In native code languages (such as

    assembly, C, C++, etc) the source code is compiled directly into a machine language that can be run on

    the host hardware and leverage the operating system provided services. In byte-code compiled languages (such as Java, the .NET languages, Python, etc), the source code is compiled into some intermediate format that cannot be run directly on the hardware, but is instead either dynamically compiled to native machine code by a just-in-time (JIT) compiler or executed by a byte code interpreter. These languages all provide a form of protection to the underlying system in that their code cannot be run directly on the hardware platform without the intercession of a virtual machine or interpreter. Due to the high-level nature and inherent safety added in byte-code languages, many consider native code languages dangerous and recommend that they be used only for lower level rings, such as operating system and driver development.

    Applications

    Applications are in general loaded from some permanent storage device (e.g. a disk). The access of a user

    to the storage medium in which applications are held is the primary vehicle for tampering with

    applications, both directly by users and by malicious applications that mutate the application executable

    files. The code that is run on a computer is stored in a file with a specific format that the runtime (for

    byte-code languages) or the operating system (native-code languages) can execute. As an example, the

    Microsoft Windows operating system on the Intel-based x86 32-bit hardware platform uses a portable

    executable (PE) file format.

    Portable Executable Format

    The PE file format is used for executables, object code, type libraries, dynamic link libraries (DLLs), static

    libraries and drivers in both 32 and 64 bit versions of Microsoft Windows operating systems. The PE

    format was derived from the UNIX COFF file format originally and is occasionally still known as the

    PE/COFF format.
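As a rough illustration of the PE/COFF container, the sketch below builds a tiny header in memory and reads it back with Python's `struct` module. The field offsets (the `MZ` signature, the pointer at 0x3C, the `PE\0\0` signature and the 20-byte COFF header) follow the published format; the specific values and the helper names are hypothetical, and the bytes are not a loadable program.

```python
import struct


def build_minimal_pe_header():
    # Hypothetical, hand-built header bytes for illustration only.
    dos = bytearray(64)
    dos[0:2] = b"MZ"                        # DOS signature
    struct.pack_into("<I", dos, 0x3C, 64)   # e_lfanew: file offset of the PE signature
    coff = struct.pack(
        "<HHIIIHH",
        0x014C,   # Machine: Intel 386 (32-bit x86)
        3,        # NumberOfSections
        0, 0, 0,  # TimeDateStamp, PointerToSymbolTable, NumberOfSymbols
        0xE0,     # SizeOfOptionalHeader
        0x0102,   # Characteristics: executable image, 32-bit machine
    )
    return bytes(dos) + b"PE\x00\x00" + coff


def parse_pe_header(data):
    if data[0:2] != b"MZ":
        raise ValueError("missing DOS signature")
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
    if data[pe_offset:pe_offset + 4] != b"PE\x00\x00":
        raise ValueError("missing PE signature")
    # The COFF file header immediately follows the four signature bytes.
    machine, sections, _, _, _, opt_size, flags = struct.unpack_from(
        "<HHIIIHH", data, pe_offset + 4
    )
    return {
        "machine": machine,
        "sections": sections,
        "optional_header_size": opt_size,
        "characteristics": flags,
    }


header = parse_pe_header(build_minimal_pe_header())
```

Because every field sits at a documented offset, any tool (or attacker) with read access to the file can locate the headers in the same way.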

    The file is laid out with multiple headers and sections used to define the memory mapping strategy

    taken by the operating system when loading the library. The PE file is loaded into memory with static

    offsets to code addresses based upon a relative base address. This makes the execution take fewer

    dynamic address resolution steps, but also makes hacking easier as all function points are at fixed offsets

    from this base address. For a DLL, the load process is based upon a preferred base address, which if

    available, allows a single loading of the library to serve all processes using this library. If the preferred

    address is unavailable, then the library is loaded to an available base address and then becomes process

    specific and cannot be shared between processes. While this can increase overall memory efficiency

    and make inter-process communication easier it also adds complexity to the overall architecture and

    provides additional points for external attack.
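The fixed-offset addressing described above amounts to simple arithmetic, sketched here under assumed values. The base addresses and the function offset are hypothetical numbers chosen for illustration, not taken from any real library.

```python
def virtual_address(image_base, rva):
    # A relative virtual address (RVA) is an offset from wherever the image was loaded.
    return image_base + rva


PREFERRED_BASE = 0x10000000  # hypothetical preferred base address of a DLL
ACTUAL_BASE = 0x6A000000     # hypothetical base used when the preferred one is taken
FUNCTION_RVA = 0x15D4        # hypothetical offset of a function inside the image

at_preferred = virtual_address(PREFERRED_BASE, FUNCTION_RVA)
at_actual = virtual_address(ACTUAL_BASE, FUNCTION_RVA)

# When the image is relocated, every absolute address shifts by the same delta,
# so the offset of the function from the base never changes.
delta = ACTUAL_BASE - PREFERRED_BASE
```

An attacker who learns the base address therefore knows the address of every function in the image, which is the weakness the paragraph above points out.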

    For the newer dynamic runtime provided by .NET (and Mono under Linux), Microsoft wanted executable

    compatibility. Therefore, all .NET executables are actually PE files with an additional common language

    runtime (CLR) header and data sections. When run, the .NET PE file will bootstrap and hand over

    execution to the CLR libraries which read these sections and transition to execution of the contained CLR

    managed byte-code.

    Due to the nature of any executable, there are sections of code that can be bypassed or altered without

    compromising the overall functionality of the application. These forms of manipulation are collectively known as tampering and are generally performed via the process of reverse

    engineering.

    Details of the PE file format can be downloaded from several sites, including directly from Microsoft at: http://download.microsoft.com/download/e/b/a/eba1050f-a31d-436b-9281-92cdfeae4b45/pecoff.doc

    Reverse Engineering

    Reverse engineering is the process of analyzing an existing product and working backwards to

    determine how it functions, what components it is made of, and ultimately how to circumvent or

    replicate the product in question. In software, reverse engineering comes in several forms and is a specialized subset of the larger category of hacking. In many cases, the act of reverse engineering software is actually legal and is protected in a limited scope under the fair use provisions of U.S. copyright law. Unfortunately, this provides enough freedom that the act of reverse engineering software to create malicious products such as spyware, viruses, worms, etc. is generally legal. It is generally only the use of such malicious software that is illegal.

    Cracking

    Cracking applications is one primary reverse engineering activity in software that intends to provide a mechanism to bypass protection mechanisms. Examples of cracking would include altering installers to

    not ask for a serial number / key code, altering software applications to bypass activation (primarily

    Microsoft, such as in Office 2003/2007), removing trial software expiration and checking routines,

    removing nag screens, etc.

    The art of cracking software is quite widespread and many active crackerz (note the use of the z)

    tend to take a great deal of pride in their ability to defeat any software protection / licensing schemes.

    In many cases, a new software protection scheme will be cracked and an active patch will be posted to

    file sharing sites within days of the protected software release. In the U.S., cracking is generally illegal under the Digital Millennium Copyright Act (DMCA), which prohibits circumventing technological measures that control access to copyrighted works, subject to limited exemptions.

    Tamper Resistance

    The addition of mechanisms to a product to increase the difficulty of reverse engineering is known as

    tamper resistance. In software, tamper resistance mechanisms come in many forms, the most common

    of which is code obfuscators. Unfortunately, in compiled native code, these tools often only obfuscate

    the source code and object code, but allow a disassembler to generate quite tractable assembly code

    which can be tampered with and re-assembled into a new executable.

    The discipline of trusted computing is the set of activities that provide protection from tampering with

    software (and hardware) at various levels from high-level user mode application code (via mechanisms

    such as digital signatures of code by third party code reviewers) to the operating system architecture

    itself in limiting access to resources thus preventing the possibility of tampering with applications.

    Mechanisms used in tamper resistance for software include:

    Digital signatures (digests) of executable files

    Increasing cyclomatic complexity of code

    Obfuscation of source code

    Dynamic code injection (polymorphic executables)

    Installer encryption (executables stored as encrypted and decrypted into memory only)

    Address randomization (virtual addresses and lookups)
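The first mechanism in the list above can be sketched with Python's standard `hashlib`. This is a minimal illustration that assumes SHA-256 as the digest and uses made-up bytes in place of a real executable; a bare digest only detects modification, whereas a full digital signature would also sign the digest with a private key, which is not shown here.

```python
import hashlib
import hmac


def digest(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


def is_untampered(data: bytes, expected: str) -> bool:
    # compare_digest performs a constant-time comparison of the two hex strings.
    return hmac.compare_digest(digest(data), expected)


# Hypothetical bytes standing in for an executable file's contents.
original = b"\x55\x8b\xec\x81\x3d\x00\x10\x44\x00\x16\x05\x00\x00"
published = digest(original)

# Changing a single byte, as a cracker might, changes the digest.
patched = original.replace(b"\x16\x05", b"\x15\x05")
```

Calling `is_untampered(original, published)` succeeds, while `is_untampered(patched, published)` fails, provided the published digest itself is stored somewhere the attacker cannot rewrite.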

    Disassemblers

    A primary means of tampering with an executable is to disassemble that executable into an assembly

    language file. Assembly language is a low-level programming language that has a one-to-one

    correspondence with the public ISA of the hardware architecture. Due to this correspondence, there is

    no way to prevent the disassembly of an executable file if it is accessible.

    There are several free and commercial disassembler applications available; the two used in this paper are:

    OllyDbg (http://www.ollydbg.de/)

    IDA Pro (http://www.hex-rays.com/idapro/)

    OllyDbg is a fully free application available for download, whereas IDA Pro is a commercial product that also has a free version.

    Tampering With a PE File

    In order to tamper with a PE file, the file is disassembled in a disassembler, in this case, OllyDbg. The

    executable we are tampering with in this example is available at the Hacker Challenge website at:

    http://www.dareyourmind.net/Challenge/Variable.rar. Once unpacked and executed, the program

    presents a challenge to the user:

    We are to crack this program and change the value of some variable "vari" to the integer value 1302. Once we believe we have succeeded in this task, we should be able to run the program, type "check" and be greeted with success.

    If we make no changes to the program and enter check, we see the following:

    As a first action, we will open the program in OllyDbg and see the program as the CPU does. The

    OllyDbg application shows all aspects of the program and allows us to step through the program

    statement by statement and see how memory and the registers change and which line of assembly will be

    executed next.

    We can also examine the executable file itself in a hex format:

    There are multiple means of solving this problem: we can search for the value presented (31333031), which we will not find (it may be computed and never be mentioned in code); we can search for the label "vari" (which we also will not find); or we can search for the solution.

    In this case, we know that there will be a check comparison that will verify our value is equal to 1302.

    We can look for the message we see in the original program and backtrack from there:

    Now, we can track to where this (004015D4) is called from:

    We can now clearly see that we are close by noticing the ASCII text "You got it!! The password is" two lines below our call to the existing wrong answer.

    Looking at the call to our output message, we can see the JNZ instruction, which is a jump if not zero (taken when the preceding comparison found the two values unequal). Looking one line above this statement we see a CMP (compare) statement which compares the variable at location 441000 to the static value 516 (HEX), which is our goal value of 1302 (DEC). Therefore, we can crack this application in one of two ways: either change the static value to the current value (change 516 to 1DE1AA7, HEX for 31333031) or change the value stored at 441000 to 516.
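The number conversions and the compare-and-branch logic described above can be checked with a short sketch. The function `check` and the failure string are hypothetical stand-ins for the program's real code, and the branch direction is an assumption based on the description of the disassembly.

```python
GOAL = 0x516       # the static value in the CMP instruction
SHOWN = 31333031   # the value the program presents to the user


def check(vari):
    # CMP sets the zero flag when its operands are equal; JNZ then jumps
    # past the success branch whenever the flag is clear (values differ).
    zero_flag = (vari - GOAL) == 0
    if not zero_flag:
        return "wrong"  # hypothetical stand-in for the failure message
    return "You got it!!"


goal_decimal = GOAL               # 0x516 in decimal
shown_hex = format(SHOWN, "X")    # 31333031 in hexadecimal
```

Here `goal_decimal` is 1302 and `shown_hex` is `1DE1AA7`, matching the two values quoted in the text.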

    Looking at the current value in address 441000 we see:

    Therefore, the current value of vari is actually 1301. So, we change this to 1302 and re-assemble this

    file to an executable and run.
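A minimal sketch of this kind of in-place edit follows, assuming the variable is stored as a 32-bit little-endian integer, as is usual on x86. The buffer, the offset `VARI_OFFSET` and the helper `patch_dword` are hypothetical; a real patch would be applied at the file offset that corresponds to address 441000.

```python
import struct

# Hypothetical data section: padding, then vari as a 32-bit little-endian integer.
VARI_OFFSET = 8
image = bytearray(16)
struct.pack_into("<I", image, VARI_OFFSET, 1301)


def patch_dword(buf, offset, value):
    # Overwrite exactly four bytes so nothing after the value moves.
    struct.pack_into("<I", buf, offset, value)


before = bytes(image[VARI_OFFSET:VARI_OFFSET + 4])  # 15 05 00 00
patch_dword(image, VARI_OFFSET, 1302)
after = bytes(image[VARI_OFFSET:VARI_OFFSET + 4])   # 16 05 00 00

# Read as ASCII byte codes, the digits 31 33 30 31 spell the text "1301".
ascii_reading = bytes.fromhex("31333031").decode("ascii")
```

The last line offers one possible explanation, not stated in the paper, for why the program presents 31333031 while the stored value is 1301: the presented digits may simply be the hexadecimal ASCII codes of the characters "1301".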

    Then, after re-assembly and running, we see the same prompt:

    But now when we enter check we see a new result:

    This indicates that we successfully cracked this application.

    General Concepts

    When looking at a PE file there are several aspects that can be tampered with without impacting any addresses. First, any string variables can be edited directly, with no effect on the layout of the file, if their lengths are not changed. Second, numeric constants can be edited directly if their length is maintained. Third, embedded icons can be used as free space within the file. In general, many PE files will have multiple icons embedded within the file. The first icon can be used as free space to place executable code into, as long as the code is smaller than the total space of the icon. Then all references to the location of that first icon can be re-mapped to a different icon within the file with no impact to the PE. This is a means of causing malicious code to be embedded in that icon area of the file with a known address point (it stays the same as for the icon). Then, the end of that malicious code can jump back to the origin point to continue execution.
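The first point, editing a string without disturbing any offsets, can be sketched as below. This assumes single-byte, NUL-terminated strings, which is a simplification (Windows resources are often stored as UTF-16), and the byte fragment and the function `patch_string` are hypothetical.

```python
def patch_string(blob: bytes, old: bytes, new: bytes) -> bytes:
    if len(new) > len(old):
        raise ValueError("replacement is longer than the original string")
    index = blob.find(old)
    if index < 0:
        raise ValueError("string not found")
    # Pad with NULs so the total length, and every later offset, is unchanged.
    padded = new.ljust(len(old), b"\x00")
    return blob[:index] + padded + blob[index + len(old):]


# Hypothetical fragment: a label followed by other data whose offset must not move.
blob = b"\x01\x02Beginner\x00\xAA\xBB"
patched = patch_string(blob, b"Beginner", b"Novice")
```

The replacement is refused when it is longer than the original, because growing the string would shift everything that follows it and break the fixed offsets the loader relies on.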

    Figure 1. Original Program Structure

    Figure 2. Altered Program Structure

    Example

    The winmine.exe PE file is the Minesweeper game in Windows operating systems. Opening the PE in

    OllyDbg, the text labels for menu items and related content are easy to see.

    Making edits to these text entries within this PE file will be directly displayed in the user interface of the

    application as shown below:

    Further, the winmine.exe file contains several icons for use in the game. These icons can also be clearly

    seen in OllyDbg:

    Altering these icon areas can yield graphical updates to the application as shown below:

    Or, more maliciously, some icons can be replaced with malicious code and simply be remapped to other

    icons that are graphically similar.

    Abstract Architecture For a Secure OS

    A mechanism for creating a more secure operating system involves several aspects:

    Increasing isolation from the hardware

    Increasing isolation from the operating system

    Reducing application access to resources

    Reducing access to application files

    Virtualization of resources

    Use of virtual machines for all process hosting

    High process isolation

    Virtualization of operating environment

    A basic architecture for accomplishing this is to build upon a primary concept already present within UNIX operating systems: there are resources that are unavailable to user mode operations. A prime example of this is the swap partition. The operating system handles all I/O directly with no application access to the partition. To applications, it in general appears not to exist. The same concept can be used to secure an operating system: make all unneeded resources invisible to the application.

    Memory

    In this OS architecture, memory is abstracted to a pool much like in existing OSes. Each application gets

    an isolated view of memory and therefore owns that memory image. However, this memory image is

    fragmented into frames that each have base addresses that cannot be spanned. In this manner, OS level

    resources and shared libraries are loaded into fixed, read-only frames that are thereby safe for use.

    Note that these are read-only, and therefore do not contain application editable data. In the context of

    .NET, this is similar to the use of Application Domains.

    Disk

    The security of mass storage devices will continue to be an issue even in this architecture. The primary

    additional security in this architecture is the isolation of executable code from the user mode accessible

    storage devices. In this architecture, the physical non-removable storage is virtualized to slices that

    are characterized by basic performance metrics indicating relative performance for read, write,

    overwrite operations of the underlying physical hardware. A virtual slice may span any number of

    physical devices in a manner similar to that of current RAID solutions. The underlying physical media

    may still be on a RAID array or SAN/NAS devices.

    The storage slices are used to form mount points that are of one of the following classes:

    System (Swap and OS are such volumes)

    Protected (Drivers and applications are on such a volume)

    Limited (Configuration data is on such a volume)

    User (Data is such a volume)

    A system volume is completely invisible to the system outside of the ring 0 operating system code base.

    The volume is fully accessible to the OS alone.

    A protected volume is read-only accessible at a virtual mount point. A protected volume is essentially

    hierarchical with the root level of access being defined by the owner of the virtual root. For example

    company ABC would have a virtual ABC root that is defined securely. Within that root, all programs

    produced by company ABC would reside. ABC has the sole control over the structure, content and

    accessibility within that virtual root. All files are read-only except by the OS which performs single

    writes (installs) and single deletes (uninstalls) to this volume.

    The limited volumes are read/write accessible and built upon the same structure as the protected volume. This is the primary type of volume in which a program may store configuration parameters for use

    by the application. Also, each user has a mirrored structure for user-level profile parameters. This

    volume is read/write to the program that owns the root and is not visible in user mode to any user.

    Overall, the system, protected and limited class volumes within a system are not visible to the user at all

    in any form or fashion. For a user to be able to execute an application that resides in these volumes, the

    application is registered with the OS and placed into the volume by the OS. The OS exposes a virtual link

    that permits the executable to be run, but it is only runnable and in no way readable as a file.

    The final class of volume, the user volume is a semi-traditional full access user mode storage location. In

    this manner, downloaded software cannot be run until it is installed into the system. Since the volume

    is data only, there are no files in these volumes that the OS will recognize as runnable. Therefore, any

    tampering to files within the user volumes will not affect the applications in a traditional sense (although scripting or macro files that are interpreted from a data file by an installed program remain a vulnerable vector).
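The four volume classes described above can be summarized as a small access table. This is only a sketch of one reading of the text: the enum and function names are hypothetical, and any combination the paper does not describe is treated as denied, which is a simplifying assumption.

```python
from enum import Enum


class VolumeClass(Enum):
    SYSTEM = "system"
    PROTECTED = "protected"
    LIMITED = "limited"
    USER = "user"


class Actor(Enum):
    OS = "os"
    OWNING_PROGRAM = "owning_program"
    USER = "user"


# Allowed operations per (volume class, actor), read off the descriptions above.
POLICY = {
    (VolumeClass.SYSTEM, Actor.OS): {"read", "write"},
    (VolumeClass.PROTECTED, Actor.OS): {"read", "write"},   # installs and uninstalls
    (VolumeClass.PROTECTED, Actor.OWNING_PROGRAM): {"read"},
    (VolumeClass.PROTECTED, Actor.USER): {"execute"},        # via the OS-exposed link only
    (VolumeClass.LIMITED, Actor.OWNING_PROGRAM): {"read", "write"},
    (VolumeClass.USER, Actor.USER): {"read", "write"},       # data only, never runnable
}


def allowed(volume, actor, operation):
    # Anything not listed in the table is denied (an assumption of this sketch).
    return operation in POLICY.get((volume, actor), set())
```

For example, `allowed(VolumeClass.PROTECTED, Actor.USER, "execute")` is true while the same call with `"read"` is false, which captures the idea that an installed program is runnable but never readable as a file.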

    The primary difference between the user volume in this OS architecture and a traditional data volume is

    that the user volume is further sliced by user to be virtualized as multiple overlapping volumes. Each

    user gets a custom perspective view of the logically virtualized volumes based upon their access rights.

    The security in this volume is a combination of role-based (RBAC), policy-based (PBAC), mandatory

    (MAC) and user-discretionary controls. The algorithms used to dynamically merge the volumes for performance are a key distinction from existing OSes.

    Devices

    All hardware devices that are installed on a system must be made available to users. The control of

    these devices, their drivers and access constraints are a key area for both security and system attack.

    This OS architecture will protect against external vectors via a combination of device locking (at the

    kernel level) and disk access at the disk level (preventing driver installation) that controls the addition of

    new devices. Additionally, keyed devices can be optionally allowed or disallowed by user, by device type

    and by device id via a security device labeling engine (a form of MAC). Via the disk access controls,

    drivers all require a minimum soft restart of the OS to be installed.

    Applications

    All software must be installed into the protected and limited volumes (executable/libraries and configurations respectively) to be made available to users. The installation process for all programs and

    drivers will require a physical acknowledgement from the user with appropriate access to execute an

    installation package. The installer itself only places the content in a virtual volume to be copied and

    installed upon restart of the OS. In this manner, the installation process is two-phase and requires full

    acknowledgement. Upon restart, the OS identifies this virtual volume and installs the verified content to

    the secured volumes and makes the application available to the users. An OS soft restart does not

    require a full system (hardware) reboot, only an environment unload and reload much like a current

    virtual machine restart on virtualized systems.

    This installation process permits several actions during installation:

    Each file can be checked against a digital signature and verified online if OS policy is enabled to

    do so

    Files are copied into virtual space where they can be scanned and checked by a registered

    security application granted access to do so

    Multi-stage process thwarts many malicious code practices by not permitting automatic start of

    the malicious code
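The two-phase installation described above might be modeled as follows. This is a sketch under stated assumptions: the class and method names are hypothetical, in-memory dictionaries stand in for the virtual and protected volumes, and a SHA-256 digest comparison stands in for full digital signature verification.

```python
import hashlib


class TwoPhaseInstaller:
    def __init__(self):
        self.staging = {}    # virtual volume holding packages awaiting restart
        self.protected = {}  # secured volume holding installed executables

    def stage(self, name, content, expected_sha256, acknowledged):
        # Phase 1: nothing is staged without an explicit user acknowledgement.
        if not acknowledged:
            raise PermissionError("installation requires user acknowledgement")
        self.staging[name] = (content, expected_sha256)

    def soft_restart(self):
        # Phase 2: verify each staged file, install the good ones, discard the rest.
        rejected = []
        for name, (content, expected) in self.staging.items():
            if hashlib.sha256(content).hexdigest() == expected:
                self.protected[name] = content
            else:
                rejected.append(name)
        self.staging.clear()
        return rejected


installer = TwoPhaseInstaller()
good = b"program bytes"
installer.stage("editor", good, hashlib.sha256(good).hexdigest(), acknowledged=True)
# The second package's content does not match the digest it claims.
installer.stage("trojan", b"altered bytes",
                hashlib.sha256(b"original bytes").hexdigest(), acknowledged=True)
rejected = installer.soft_restart()
```

Nothing staged in the first phase can run until the restart has verified it, which is the property the multi-stage process relies on to stop malicious code from starting automatically.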

    Conclusions

    Current software architectures are inherently insecure by the nature of the applications themselves and

    their underlying system architectures. To build truly secure applications that perform well, a new

    approach must be taken. The operating system itself is a key target for redesign that can ameliorate

    many problems including network level exploitations by reducing runtime access to resources.
