Digital Evidence Concepts

Some text is drawn from Forensics on the Windows Platform, Part Two: http://www.securityfocus.com/infocus/1665

Learning Objectives

• In this module we will discuss:
– Computer foundations
– Number systems
– Data storage
– Hard disk structure and addressing
– Data acquisition

Forensic examination

• While there are many devices from which electronic evidence may be extracted we focus here on the computer hard disk

Imaging

• The first step in the forensic examination of a computer hard drive is the creation of a "bit level" copy, or image, which includes all the information on the disk regardless of whether or not it is part of an existing file system.

• The creation of this image serves to provide a platform that can be subjected to in-depth analysis without fear of altering the original evidence.

• A number of tools are commonly used by investigators to perform forensic imaging

Tools - EnCase

• EnCase: EnCase is a fully featured commercial software package that enables an investigator to image and examine data from hard disks, removable media, and some PDAs. This image can be analyzed in a variety of ways using the EnCase program, common examples of which might include searching the data for keywords, viewing picture files, or examining deleted files. Many law enforcement groups throughout the world use EnCase; this may be an important factor for investigators to consider if there is a possibility that an investigation may be handed over to the police or used in a court of law.

Tools - dd

• Data dumper (dd): Imaging a computer's hard disk can be a lengthy process. dd is a freely available utility for Unix systems that can make exact copies of disks that are suitable for forensic analysis. It is a command-line tool and requires a sound knowledge of the command syntax to be used properly. Modified versions of dd intended specifically for forensic use are also available, as are Windows versions (dd.exe).

Tools – ProDiscover Basic, Helix etc.

• ProDiscover Basic – in a real-life investigation you would use this to make your disk image.

• Helix – this has a huge variety of Windows- and Unix-based forensic tools.

• Other pointers will be given in the assignment specification to be released (Assignment 2).

Tools – hex editor

• We can also use a hex editor, such as WinHex, to create images and to carry out investigations.

• In real life, and in a case where there was any doubt as to the nature of the evidence, several different tools might be used to investigate.

• Next we cover some more foundational issues

Hashing an image

• Once an image has been made, we and the court need to know that it was made correctly (no inadvertent or deliberate falsification)

• How can we be sure that the copy is exactly the same as the original?

• The answer lies with the algorithm called MD5. This procedure results in the creation of a large number called a "message digest", the exact value of which is determined by the layout of data found on a disk (MD5 can also be used to create message digests for files).

• Crucially, if the disk contents were to be altered in any way, through deleting or changing a file for example, running the MD5 algorithm would result in a radically different message digest.

• This is true regardless of the extent of the alterations made; even a change to one bit of information on a large drive packed with data would result in a new message digest.

md5sum

• md5sum is a freely available utility for creating MD5 message digests. By comparing the message digest of an original disk with that of its copy, an examiner can confirm that the image is an exact replica of the original.

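• MD5 is a cryptographic hash designed to satisfy the “avalanche criterion”, so that small changes in the input result in large changes in the output

• It was also designed to be “collision-resistant”, so that it is computationally infeasible to discover another file with the same hash (and hence be able to substitute it). Practical collision attacks on MD5 have been demonstrated since 2004, so a stronger hash such as SHA-256 is often recorded alongside it

• BUT uniqueness is probabilistic, not guaranteed: there are more possible inputs than there are 128-bit digests, so by the pigeonhole principle some different inputs must share a digest

• Other cryptographic hashes include SHA-0 and SHA-1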
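The avalanche behaviour described above can be seen in a few lines of Python. This is a minimal sketch using the standard `hashlib` module; the 4096 zero bytes standing in for disk contents and the helper names `md5_hex` and `differing_bits` are assumptions made for illustration only.

```python
import hashlib


def md5_hex(data: bytes) -> str:
    """Return the MD5 message digest of data as 32 hexadecimal characters."""
    return hashlib.md5(data).hexdigest()


def differing_bits(hex_a: str, hex_b: str) -> int:
    """Count the bit positions at which two equal-length hex digests differ."""
    return bin(int(hex_a, 16) ^ int(hex_b, 16)).count("1")


# Hypothetical stand-in for disk contents: 4096 zero bytes.
original = bytes(4096)
# Flip a single bit in the first byte to simulate a one-bit alteration.
altered = bytes([original[0] ^ 0x01]) + original[1:]

digest_original = md5_hex(original)
digest_altered = md5_hex(altered)

print(digest_original)
print(digest_altered)
print("bits changed:", differing_bits(digest_original, digest_altered), "of 128")
```

The two digests differ even though only one input bit changed, which is the property that lets a digest comparison reveal any alteration to an image.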

Search

• Once an image has been made, the task of searching for evidence can begin.

• Most commercial tools that provide imaging capabilities also provide comprehensive image analysis facilities.

• There are freely available open source alternatives; TASK (The @stake Sleuth Kit, now The Sleuth Kit) and Autopsy are recommended.

• PyFlag is another tool, an Australian one

• These should come with Helix

Where do we look for evidence?

• When a user logs on to a Windows system for the first time a whole directory structure is created to hold that individual user's files and settings.

• This structure has a root directory that is given the same name as the username that was used to log on (which in itself can be useful forensic evidence) and contains a number of folders and files of interest to the forensic investigator.

• For example, the file NTUSER.DAT (which holds configuration information specific to the user and is located in the root of the user's directory structure) is updated when the user logs out, thus enabling an investigator to pinpoint this logout time by examining the file's "last written" attribute.
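As a rough illustration of reading a "last written" attribute, the sketch below uses Python's `os.stat` to report a file's modification time in UTC. The throwaway file named NTUSER.DAT, the timestamp value and the helper `last_written_utc` are hypothetical; in a real examination the file would be read from a write-protected image, never from the original disk.

```python
import os
import tempfile
from datetime import datetime, timezone


def last_written_utc(path: str) -> datetime:
    """Return a file's last-modified ("last written") time as an aware UTC datetime."""
    return datetime.fromtimestamp(os.stat(path).st_mtime, tz=timezone.utc)


# Demonstration on a throwaway file standing in for NTUSER.DAT.
with tempfile.TemporaryDirectory() as workdir:
    sample = os.path.join(workdir, "NTUSER.DAT")
    with open(sample, "wb") as handle:
        handle.write(b"placeholder")
    # Hypothetical logout time, in seconds since the Unix epoch.
    known_logout = 1_100_000_000
    os.utime(sample, (known_logout, known_logout))
    observed = last_written_utc(sample)
    print(observed.isoformat())
```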

Cookies

• The Cookies folder, which is used to hold data files stored by Internet sites that have been visited by the user, is yet another potential source of information for the investigator.

• Together with the temporary Internet history files described below, a fairly detailed picture of a user's Web surfing activities can usually be developed.

• A number of small utilities are available that can help display the contents of a cookie in an easily readable form; one to try is CookieView – this (or very similar) is on Helix

• Even browsers store cookie files, often indefinitely

• Browsers also store a quantity of other user-specific data which many users never clear

Windows artefacts

• Files created by Windows operating systems to give quick access to applications, files or devices, along with certain files the operating system creates for a range of other purposes, are often termed Windows artefacts.

• These can be another important source of information for investigators attempting to recreate a user's activities (partly because they are often overlooked by those attempting to cover their own tracks).

• Common examples are the .lnk files created in the Windows Desktop, Recent, Send To and Start Menu folders. These files act as handy shortcuts that enable the user to access frequently used files or applications with a minimum of effort.

Deleted Files

• Many computer users believe that deleting a file, from within Windows Explorer for example, is enough to prevent that file from being accessed by others at a later date (especially if the "Recycle Bin" is also emptied).

• Fortunately for forensics investigators, the act of deleting a file in this fashion can still leave the data open to recovery.

• This is because when a file is deleted, the data itself is not removed from the system. Instead, the operating system simply marks the file as deleted and the area of the disk occupied by the file becomes available for storing other data.

• Until this area is overwritten, the data belonging to the deleted file remains on the disk. Using the appropriate forensic tools and methodology, this data can be recovered.

• Even in cases where the area of disk in question has been used to store data from another file, if the area occupied by the original file has not been completely overwritten it may still be possible to recover part of the deleted file.

The Recycle Bin

• The Recycle Bin acts as a kind of halfway house for user-deleted files from which they may be undeleted by the user if required.

• The Recycle Bin is of great interest to forensic investigators because of a special file called INFO on Windows 95, or INFO2 on Windows 98 and later versions up to Windows XP, which is used by the operating system to record details of files moved into the Recycle Bin.

• Amongst other details, the original location of files before they were deleted and the date and time of deletion are recorded in this file.

• When the Recycle Bin is emptied, this file is deleted along with the other files but, in exactly the same way as described above, it may still be possible to recover the file's contents if they have not been overwritten.

Other basics

• Numbers, hardware, imaging and verifying

• Ack: Brian Carrier’s File System Forensic Analysis and Nelson et al, Computer Forensics and Investigations here

Binary

• A computer number system that consists of 2 numerals, 0 and 1. It is sometimes called base 2.

• Since computers do not have 10 fingers as humans do, all the counting within the computer itself is done using only 2 numerals: 0 and 1 (or "off" and "on", or "false" and "true").

The hexadecimal system

• The hexadecimal system (hex for short) uses sixteen digits, representing the values 0 to 15.

• It starts off like the decimal system: 0, 1, 2, 3, 4, 5, 6, 7, 8 and 9 but then comes A which equals 10 and then B, C, D, E and F (which of course equals 15). The next number is 10 which is actually 16 in decimal and so on....

• Because it can be impossible to distinguish between a hex and a decimal number it is customary to put a lower case h after each hex number.
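The relationship between decimal, binary and hex described above can be checked with a short Python sketch. The helper names `to_binary`, `to_hex` and `from_hex` are made up for this illustration; the trailing "h" follows the convention mentioned in the last bullet.

```python
def to_binary(number: int) -> str:
    """Return the base 2 representation of a non-negative integer."""
    return format(number, "b")


def to_hex(number: int) -> str:
    """Return the base 16 representation with the customary trailing 'h'."""
    return format(number, "X") + "h"


def from_hex(text: str) -> int:
    """Convert a hex string such as '1Fh' (trailing 'h' optional) to an integer."""
    return int(text.rstrip("hH"), 16)


# Print a small conversion table: decimal, binary, hex.
for value in (9, 10, 15, 16, 255):
    print(value, to_binary(value), to_hex(value))
```

For instance, `from_hex("10h")` gives 16, matching the statement that hex 10 is 16 in decimal.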

ASCII

• ASCII stands for American Standard Code for Information Interchange.

– first published as a standard in 1963 to allow computers to exchange information, regardless of the manufacturer.

• The ASCII character set consists of 128 decimal numbers, ranging from 0 through 127, assigned to letters, numbers, punctuation marks and the most common special characters.

– number 65 represents a capital A.

• The character set is called 7-bit ASCII.

– Most computer systems also use numbers 128 through 255 to represent additional characters, but this extended range is not really standardized

• The competing standard which has virtually disappeared is EBCDIC, developed by IBM.

• Because 256 characters are not sufficient to represent all characters used in Asian languages, a newer standard has emerged. The "Unicode" character set now contains well over 100,000 characters.

ASCII conversion
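A small sketch of ASCII conversion in Python, using the built-in `ord` function to map a character to its code. The helper names `ascii_row` and `is_seven_bit` are assumptions for this example, not part of any forensic tool.

```python
def ascii_row(character: str) -> tuple:
    """Return (character, decimal code, hex code, 7-bit binary code)."""
    code = ord(character)
    return (character, code, format(code, "02X"), format(code, "07b"))


def is_seven_bit(text: str) -> bool:
    """True if every character falls in the 7-bit ASCII range 0 through 127."""
    return all(ord(character) < 128 for character in text)


# Print a few rows of a conversion table.
for character in ("A", "a", "0", " "):
    print(ascii_row(character))

print(is_seven_bit("Evidence"))
print(is_seven_bit("café"))
```

The row for a capital A shows decimal 65, hex 41 and binary 1000001, which is why the byte 41h in a hex editor is displayed as "A".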

Hard Disk Drive Overview

• Head – Device that reads and writes data to the drive.

• Tracks – Individual circles on a disk platter where data is located.

• Cylinder – Column of tracks on two or more disk platters.

• Sector – Individual section on a track.

Hard disks

• Hard disk drives are organised as a stack of disks, or platters, mounted on a common spindle

• Each platter has 2 surfaces

More on disk structure

• Data is stored on the concentric circles known as tracks

• Corresponding tracks on all platter surfaces make up a cylinder

• The cylinder can be written to without any movement of the head assembly

• Numbering starts with 0 at the outermost cylinder

Hard disk structure

• Rigid disk surface

• Head-disk assembly must be sealed and micro-filtered

• Multiple platters provide more storage without proportional increase in cost

Major components of disk drive

• A typical disk stores 512 bytes per sector

• The manufacturer specifies how many sectors there are per track

• Capacity in bytes is determined by multiplying the number of cylinders by the number of heads (one per platter surface, so in effect the number of tracks per cylinder), by the number of sectors per track, and by the number of bytes per sector (512 or more)

• This is known as CHS (cylinders, heads, sectors) addressing

• The more modern method is LBA mode, which identifies each sector by a single sector number

LBA – Logical Block Addressing

• Large IDE disks (Integrated Drive Electronics, using the ATA interface) will return c=16383, h=16, s=63 (16,514,064 sectors, about 8.4 GB or 7.8 GiB) but give their true size as an LBA capacity

• The BIOS must know it has to use the LBA value to calculate the actual size of the drive in accessible sectors

• E.g. if the LBA value of a disk is 156,301,488 sectors, it has a capacity of 156,301,488 × 512 bytes ≈ 80 GB
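The capacity arithmetic above can be reproduced with a short Python sketch. The figures are the ones quoted on the slide; the function names `chs_sectors` and `capacity_bytes` and the fixed 512-byte sector size are assumptions for illustration.

```python
SECTOR_SIZE = 512  # bytes per sector, as assumed throughout this module


def chs_sectors(cylinders: int, heads: int, sectors_per_track: int) -> int:
    """Total number of sectors described by a CHS geometry."""
    return cylinders * heads * sectors_per_track


def capacity_bytes(total_sectors: int, sector_size: int = SECTOR_SIZE) -> int:
    """Capacity in bytes for a given sector count."""
    return total_sectors * sector_size


# Geometry that large IDE disks report through CHS.
chs_total = chs_sectors(16383, 16, 63)
chs_capacity = capacity_bytes(chs_total)
print(chs_total, "sectors")
print(round(chs_capacity / 10**9, 2), "GB (decimal)")
print(round(chs_capacity / 2**30, 2), "GiB (binary)")

# Size reported through LBA for the example 80 GB disk.
lba_capacity = capacity_bytes(156_301_488)
print(round(lba_capacity / 10**9, 2), "GB (decimal)")
```

The gap between the two results is why a tool that trusts the CHS geometry instead of the LBA value would see only a fraction of the disk.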

More on CHS

• Most Intel motherboards use an ATA (Advanced Technology Attachment) interface which connects to the hard disk

• The BIOS will read the disk’s cylinders, heads and sectors through the ATA interface and will use the CHS values and sector size to determine the size of the disk and how it should be accessed

• Older BIOSes used 24-bit CHS addressing (10 bits for the cylinder, 8 for the head, 6 for the sector), which could only address up to about 8.4 GB; the theoretical ceiling for 24 bits is 2^24 × 512 bytes, about 8.6 GB

• Newer ones support 64-bit addressing, which allows about a trillion (2^40) times as many sectors as 24-bit addressing
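The addressing limits above follow from simple arithmetic, sketched here in Python. The 10/8/6 bit split is the usual layout of the BIOS INT13h CHS interface, and the observation that sector numbers start at 1 (leaving 63 usable values in 6 bits) is stated here as an assumption of this example.

```python
SECTOR_SIZE = 512

# Usual BIOS INT13h layout: 10 bits cylinder, 8 bits head, 6 bits sector.
CYLINDER_BITS, HEAD_BITS, SECTOR_BITS = 10, 8, 6

max_cylinders = 2 ** CYLINDER_BITS      # 1024
max_heads = 2 ** HEAD_BITS              # 256
max_sectors = 2 ** SECTOR_BITS - 1      # 63, because sector numbering starts at 1

chs_limit_bytes = max_cylinders * max_heads * max_sectors * SECTOR_SIZE
upper_bound_bytes = 2 ** 24 * SECTOR_SIZE

print(chs_limit_bytes)      # practical 24-bit CHS limit in bytes
print(upper_bound_bytes)    # theoretical bound if all 2^24 addresses were usable

# How many times more sectors a 64-bit address can name than a 24-bit one.
ratio = 2 ** 64 // 2 ** 24
print(ratio)
```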

Other issues

• Zoned Bit Recording

• How manufacturers deal with the fact that the inner tracks of a platter are physically shorter than the outer tracks. Tracks are grouped into zones, and tracks in the outer zones hold more sectors than tracks in the inner zones.

• Track Density

• The space between tracks on a disk. The smaller the space between the tracks, the more tracks on a disk. Older drives with wider track densities allow wandering.

Other issues

• Areal Density

• The number of bits per square inch on a platter.

Partitioning (more later)

• Partition – A logical drive on a disk. It can be the entire disk or a portion thereof.

• Inter-Partition Gap – Unused space, or a void, between the primary partition and the first logical partition.

• Volume – a collection of partitions a user can write to or read from

Hard Disk Data Acquisition

• We have not yet begun to look at specific operating systems – see later

• We do need to consider how we might begin to analyse data found on a hard disk

• Most common to do this as ‘dead’ analysis

• Therefore we need to understand how hard disk data can be acquired

• We also need a sound scientific basis

General Procedure

• It is logical to think of copying one byte at a time from source to destination, like copying a document by hand one letter at a time

• In practice we mostly copy whole words, since we can read and remember them

• Computers do the same thing and copy data in chunks that are multiples of 512 bytes

• If the copying tool finds an error on the source it may write zeros to the destination
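The copying procedure above can be sketched as a loop that reads one sector at a time and substitutes zeros when a read fails. This is a simplified model rather than a real imaging tool: `acquire` and the `FlakySource` test double (which simulates an unreadable sector) are hypothetical names, and in-memory buffers stand in for disks.

```python
import io

SECTOR_SIZE = 512


class FlakySource(io.BytesIO):
    """Hypothetical test double: raises OSError when a chosen sector is read."""

    def __init__(self, data: bytes, bad_sector: int):
        super().__init__(data)
        self.bad_sector = bad_sector

    def read(self, size=-1):
        if self.tell() // SECTOR_SIZE == self.bad_sector:
            raise OSError("simulated unreadable sector")
        return super().read(size)


def acquire(source, destination, total_sectors: int) -> list:
    """Copy sector by sector; log unreadable sectors and write zeros in their place."""
    bad_sectors = []
    for number in range(total_sectors):
        try:
            source.seek(number * SECTOR_SIZE)
            data = source.read(SECTOR_SIZE)
        except OSError:
            bad_sectors.append(number)
            data = b""
        # Pad failed (or short) reads with zeros so later offsets stay aligned.
        destination.write(data.ljust(SECTOR_SIZE, b"\x00"))
    return bad_sectors


disk = b"A" * SECTOR_SIZE + b"B" * SECTOR_SIZE + b"C" * SECTOR_SIZE
src = FlakySource(disk, bad_sector=1)
dst = io.BytesIO()
bad = acquire(src, dst, total_sectors=3)
image = dst.getvalue()
print("bad sectors:", bad, "image size:", len(image))
```

Writing zeros rather than skipping the sector keeps every later sector at the same offset in the image as on the source, and the returned list gives the examiner a record of what could not be read.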

Data Acquisition and Tools

• Layers of abstraction of data
– Disk
– Volume
– File
– Application

• Acquire data at lowest level of abstraction where there may be evidence

• Acquire every sector of disk

• NIST has major project on tools and testing

• Much work in Australia

• http://www.cftt.nist.gov/disk_imaging.htm

Examples

• If we acquire at volume level, we make a copy of every sector in every partition
– Cannot acquire data that is in sectors that are not allocated to partitions
– May lose hidden data

• If we use a backup utility and copy allocated files
– Lose deleted files, temporal data
– Lose data hidden in partitions

• If we copy only the files relevant to an event (e.g. IDS log files, if not compromised), we cannot get lower-level evidence

General Acquisition Theory for source data

• Why a general theory?

• Scientific method?

• Quality (Proof or Truth) = Validity + Generalisation + Reliability + Objectivity

• Two methods to access data on disk

1. OS or acquisition software accesses hard disk directly

2. OS or acquisition software accesses the hard disk via the BIOS – CHS and LBA issues mean the last GB may be missed – the BIOS may be wrongly configured

General Acquisition Theory for source data

• Dead versus live acquisition
– ‘dead’ refers to the state of the OS: the suspect OS is not running
– ‘live’ means the suspect OS is still running
– In a live acquisition the suspect may have modified the OS, or it may contain a rootkit

• Error handling
– Bad sectors should be logged, and zeros written for data that cannot be read

Write Blocking

• In data acquisition it is important not to damage source data by writing over it

• Use of hardware write blocker by law enforcement

• Read commands are passed to hard disk but write commands are not

• Software write blockers exist and work by modifying the interrupt table for a given BIOS service – INT13h is read or write to disk

• When INT13h is called write blocker code is executed
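The idea of passing reads through while intercepting writes can be modelled with a small wrapper class. This is only a toy model in Python: real write blockers sit at the hardware interface or hook the BIOS interrupt, and the class name `WriteBlockedDevice` and its methods are invented for this sketch.

```python
import io

SECTOR_SIZE = 512


class WriteBlockedDevice:
    """Toy model of a write blocker: reads pass through, writes are refused."""

    def __init__(self, device):
        self._device = device      # any seekable binary file-like object
        self.blocked_writes = 0    # how many write attempts were intercepted

    def read_sector(self, lba: int) -> bytes:
        # Read commands are passed to the underlying device unchanged.
        self._device.seek(lba * SECTOR_SIZE)
        return self._device.read(SECTOR_SIZE)

    def write_sector(self, lba: int, data: bytes) -> bool:
        # Write commands never reach the device; they are only counted.
        self.blocked_writes += 1
        return False


evidence = io.BytesIO(b"\xAA" * SECTOR_SIZE * 4)
before = evidence.getvalue()
blocked = WriteBlockedDevice(evidence)

first_sector = blocked.read_sector(0)
accepted = blocked.write_sector(0, b"\x00" * SECTOR_SIZE)
print(accepted, blocked.blocked_writes, evidence.getvalue() == before)
```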

Writing Output data from Source

• Destination most commonly is hard disk or CD ROM

• File is called an “image”

• Some image formats are proprietary

• Image files may be compressed and then worked on with specific analysis tools

Integrity hashes

• When source data has been copied to destination a cryptographic hash must be calculated to show later that the data has not been changed

• A cryptographic hash based on a known algorithm such as MD5 is a formula that generates a very large number from the input data.

• If any bit of the input data is changed the output data is changed dramatically

Integrity hashes

• A hash stored with an image will not ensure image has not been modified, as the hash may have been tampered with too

• The hash must be stored separately from the image, ideally in an auditable fashion, perhaps digitally signed or threshold-encrypted, so that it is difficult or impossible to falsify

• Hashes can also show accuracy of acquisition process and that source was not modified
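A minimal sketch of how an integrity hash might be computed at acquisition time and checked later, using Python's standard `hashlib`. The in-memory `BytesIO` objects stand in for a source disk and its image, and the function names `hash_stream` and `verify` are assumptions for this example; the recorded digest would in practice be kept separately from the image, as noted above.

```python
import hashlib
import io


def hash_stream(stream, algorithm: str = "md5", chunk_size: int = 1024 * 1024) -> str:
    """Hash a binary stream in chunks so large images need not fit in memory."""
    digest = hashlib.new(algorithm)
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        digest.update(chunk)
    return digest.hexdigest()


def verify(stream, recorded_digest: str, algorithm: str = "md5") -> bool:
    """Recompute the hash and compare it with the digest recorded at acquisition."""
    return hash_stream(stream, algorithm) == recorded_digest.lower()


# Stand-ins for the source disk and the acquired image.
source_data = b"sector data " * 1000
recorded = hash_stream(io.BytesIO(source_data))

image_ok = verify(io.BytesIO(source_data), recorded)
image_tampered = verify(io.BytesIO(source_data + b"!"), recorded)
print(recorded, image_ok, image_tampered)
```

Passing a different `algorithm` name, such as "sha256", produces a second digest that can be recorded alongside the MD5 value.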

How shall we acquire data?

• Start with basics

• dd in Linux

• Copies a chunk of data from one file to another

# dd if=file1.dat of=file2.dat bs=512

2+0 records in

2+0 records out

• if = input file, of = output file, bs = block size in bytes

• 2 full blocks were read and written

• If the final block were not full, the count would be 2+1 instead of 2+0
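The "full+partial" record counts that dd reports can be predicted from the file size and block size. The Python sketch below shows the arithmetic; `dd_record_counts` is a name made up for this example and does not call dd itself.

```python
def dd_record_counts(total_bytes: int, block_size: int = 512) -> str:
    """Return the 'full+partial' record count dd would report for this size."""
    full_blocks, remainder = divmod(total_bytes, block_size)
    partial_blocks = 1 if remainder else 0
    return f"{full_blocks}+{partial_blocks}"


# A 1024-byte file copied with bs=512 is exactly two full blocks.
print(dd_record_counts(1024))
# A 1500-byte file leaves a partial final block of 476 bytes.
print(dd_record_counts(1500))
```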

Cryptographic hash in Linux

• Either we need to use another utility such as md5sum

• Or else use a version of dd that can calculate hashes of the file being copied – dcfldd, available at http://sourceforge.net/projects/biatchux

dcfldd

• If you want to use this to image a Linux hard disk and calculate hashes for every 1 MB you would use

• # dcfldd if=/dev/hda of=/mnt/hda.dd bs=2k hashwindow=1M hashlog=/mnt/hda.hashes

• The output in the hashlog gives
– a hash value for each range of bytes (each hash window)
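The hash window idea can be sketched in Python: the image is read one window at a time and a separate digest is recorded for each byte range. This illustrates the concept only and does not reproduce the hashlog format of any tool; `windowed_hashes` is a hypothetical helper, and it assumes each read returns a full window until the end of the stream, which holds for regular files and in-memory buffers.

```python
import hashlib
import io


def windowed_hashes(stream, window: int = 1024 * 1024, algorithm: str = "md5") -> list:
    """Return (start offset, end offset, hex digest) for each window of the stream."""
    results = []
    offset = 0
    while True:
        chunk = stream.read(window)
        if not chunk:
            break
        digest = hashlib.new(algorithm, chunk).hexdigest()
        results.append((offset, offset + len(chunk), digest))
        offset += len(chunk)
    return results


# Stand-in image: 2.5 MiB of zeros, giving two full windows and one partial window.
image = io.BytesIO(bytes(2 * 1024 * 1024 + 512 * 1024))
hash_log = windowed_hashes(image)
for start, end, digest in hash_log:
    print(f"{start} - {end}: {digest}")
```

If part of an image is later found to be damaged, per-window hashes show which ranges still match the original instead of invalidating the whole image.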

DD for Windows

• Available in Cygwin

• Or in the Forensic Acquisition Utilities – George M Garner Jr
– Dd.exe
– Md5lib.dll
– Md5sum.exe
– Wipe.exe

Use of dd.exe and md5sum.exe

• dd.exe if=\\.\PhysicalDrive0 of=d:\images\PhysicalDrive0.img --md5sum --verifymd5 --md5out=d:\images\PhysicalDrive0.img.md5

• dd if=\\?\Volume{87c34910-d826-11d4-987c-00a0b6741049} of=d:\images\e_drive.img --md5sum --verifymd5 --md5out=d:\images\PhysicalDrive0.img.md5

• dd.exe if=\\.\PhysicalMemory of=d:\images\PhysicalMemory.img bs=4096 --md5sum --verifymd5 --md5out=d:\images\PhysicalMemory.img.md5

• dd.exe if=\\.\D: of=d:\images\d_drive.img conv=noerror --sparse --md5sum --verifymd5 --md5out=d:\images\d_drive.img.md5 --log=d:\images\d_drive.log

• dd.exe if=myfile.txt.gz of=d:\images\myfile.txt conv=noerror,decomp --md5sum --verifymd5 --md5out=d:\images\myfile.txt.img.md5 --log=d:\images\myfile.txt.log

• dd.exe if=\\.\D: of=d:\images\d_drive.img.gz conv=noerror,comp --md5sum --verifymd5 --md5out=d:\images\d_drive.img.md5 --log=d:\images\d_drive.log

• md5sum.exe -o d_drive.md5 \\.\D:

• md5sum.exe -c d_drive.img.md5

• md5sum.exe -d zlib -c d_drive.img.gz.md5