toward a general collection methodology for android...
TRANSCRIPT
-
Toward a general collection methodology for Android devices
Timothy Vidas a,*, Chengye Zhang b, Nicolas Christin b
aCarnegie Mellon ECE/CyLab, USAbCarnegie Mellon INI/CyLab, USA
Keywords:
Android framework
Mobile devices
Digital forensics
Collection
Acquisition
a b s t r a c t
The Android platform has been deployed across a wide range of devices, predominately
mobile phones, bringing unprecedented common software features to a diverse set of
devices independent of carrier and manufacturer. Modern digital forensics processes
differentiate collection and analysis, with collection ideally only occurring once and the
subsequent analysis relying upon proper collection. After exploring special device boot
modes and Android’s partitioning schema we detail the composition of an Android boot-
able image and discuss the creation of such an image designed for forensic collection. The
major contribution of this paper is a general process for data collection of Android devices
and related results of experiments carried out on several specific devices.
ª 2011 Vidas, Zhang & Christin. Published by Elsevier Ltd. All rights reserved.
1. Introduction
Traditionally, mobile forensics requires procedures that are
very specific to device manufacturer and/or model for both
collection and analysis. Not only do mobile phones employ
a diversity of cables, interfaces and form factors, but the
devices also have unique software, memory layouts and
storage techniques. This amazing diversity has led to the
digital forensics practitioner being assaulted with complex
kits containing a plethora of cables and data collection tech-
niques (Yates, 2010). Mobile devices, mobile phones in
particular, are ever present in today’s society containing
a wealth of information for the analyst.
At the end of 2010, 31% of U.S mobile consumers owned
smartphones, with all of the top ten selling phones being
smartphones (Factsheet). Android, the Google-backed mobile
software framework, is enjoying a larger market share and
growth factor than Apple’s iPhone (Android, 2010; Meeker
et al., 2010), quickly making Android a major player in the
mobile market. Many existing manufactures produce Android
based devices and all major carriers sell said devices. Indeed,
Android can now be found in the living room powering
GoogleTV, in upcoming Chevy and Ford vehicles and Sony’s
new “PlayStation Phone:” Xperia Play. Smartphone prices are
declining, shrinking the market share of the less powerful
“feature phones” which may also eventually be powered by
the free Android software.
The ubiquity of devices utilizing the Android framework
facilitates exploiting common properties to minimize the
diversity required of digital forensics tools while simulta-
neously maximizing the potential for sound data collection.
Manufactures and carriers tend to preserve competitive
advantage by adding additional features to and providing
additional services throughmobile devices, yet Android based
devices share a common framework that we utilize to perform
collection (sometimes called “acquisition”). To our knowledge,
this paper represents the first work done toward a general
collection method for Android based devices.
2. Related work
While the adoption of smartphones is rising, mobile phones
themselves are not a particularly new technology. The array of
* Corresponding author.E-mail addresses: [email protected] (T. Vidas), [email protected] (C. Zhang), [email protected] (N. Christin).
ava i lab le a t www.sc iencedi rec t .com
journa l homepage : www.e lsev ie r . com/ loca te /d i in
d i g i t a l i n v e s t i g a t i o n 8 ( 2 0 1 1 ) S 1 4eS 2 4
1742-2876/$ e see front matter ª 2011 Vidas, Zhang & Christin. Published by Elsevier Ltd. All rights reserved.doi:10.1016/j.diin.2011.05.003
-
mobile phones and pricing plans available today is indicative
of the wide desires and needs of the user base, and must be
copedwith by digital forensics practitioners. In this sectionwe
present a progression of work related to digital forensics on
mobile devices from what would today be considered a cheap
“feature phone” to other work particularly targeting modern
smartphones.
Willassen (2003) outlines many items of interest for GSM
based mobile phone analysis: location, SMS, contacts, etc.
(refer to Appendix I for a brief overview of this data as it
pertains to Android). In 2006 the same author explored
collection methods for commodity mobile phones, in partic-
ular the use of physical access to the circuit board interface
(e.g., JTAG port) or physically removingmemory chips for later
data collection via a chip programmer. Also in 2006, Casadei
et al. (2006) introduced a live collection technique, dubbed
SIMbrush, facilitating entire file system collection (for
unprotected files) particular to SIM devices.
In 2007, Al-Zarouni (2007) investigated the use of mobile
phone flashing tools in regard to digital forensics. The author
concluded that while forensically sound, the primary use of
the tool was to write to, not read from, the device. Read
capability varied widely across device brands and models and
consequently flashing tools may prove problematic as
a means of collection.
Also in 2007, Mokhonoana and Olivier (2007) detailed
a collection method for Symbian OSv7 devices, and Dis-
tenfano in 2008 explored a SymbianOSv8 collectionmethod in
(Distefano and Me, 2008). Indeed, due to the vast diversity in
the mobile space, it is very common for research to be scoped
to particular operating systems or hardware platforms such as
Blackberry (Fairbanks et al., 2009), CDMA (Murphy, 2009),
iPhone 3Gs (Bader and Baggili, 2010), Nokia (Williamson et al.,
2006), etc.
The demand for mobile forensics combined with the
diversity of the mobile device market has led to a myriad of
mobile forensics tools. In 2006, Ayers et al. (2007) compared
existing tools according to their acquisition, examination and
reporting functions concluding that typical mobile phone
information such as the IMEI and SMS/MMS could be discov-
ered by existing tools. The same year Williamson et al. (2006)
studied the performance of mobile forensic tools particular to
Nokia phones. In 2007, Jansen and Ayers (2007) again
compared existing tools on contemporary mobile phones and
collected their work into a NIST report. Later, in 2010, Yates
(2010) notes the diversity of the mobile device market and
the associated complexity presented to a practitioner
attempting to select the appropriate digital forensics tool. Just
the comparison papers mentioned here cover: Cell Seizure,
GSM.XRY, MOBILedit! Forensic, TULP 2G, Forensic Card
Reader, ForensicSIM, SIMCon, SIMIS and Oxygen Phone
Manager.
Specifically addressing Android devices, in 2009 Hoog
(2009) discussed Android forensics including collection
methods for Androidmobile phones such as retrieving files on
an active phone using an application debugging feature
provided in the Android SDK, commercial tools such as Par-
aben’s Device Seizure and “rooting” the HTC G1. Rooting is
common vernacular for gaining administrative (root) access to
a device where the user is intended to only have unprivileged
access, such as amobile phone.We explore the rooting further
in Section 5.2.
In 2010, Thing et al. (2010) explore livememory forensics on
Android to collect process memory. Thing utilized the PTRA-
CE_ATTACH method of invoking ptrace to trace existing
processes. This technique is often used for debugging (Sarma
and Vaddagiri, 2002) but often repurposed for malware anal-
ysis (Burdach, 2006).
3. Background
Built upon a Linux kernel, Android uses operating system
primitives (such as processes and user IDs) as well as a Java
Virtual Machine (Dalvik) to isolate apps providing a safety
sandbox (Shin et al., 2009).
Android applications are typically written in JAVA and
interact with the Android framework through a well defined
API. For performance reasons, developers can create software
that runs natively on the hardware avoiding overhead
induced by the JAVA virtual machine. An SDK (Software
Development Kit) and NDK (Native Development Kit) are
made available for application development (not intended for
lower level development such as kernel modifications). The
SDK makes high level software development very approach-
able by providing features such as a full featured emulator, an
Eclipse (the preferred Integrated Development Environment)
add-on specific to Android and a special Android Debug Bridge
(adb) to enable debugging information from an emulator
instance or USB connected physical device (Android
developers).
The Android Debug Bridge consists of software compo-
nents on the device (or emulator instance) and on the devel-
oper’s machine connected via USB or TCP. Using the feature,
the developer can not only observe debug information, but
also perform an assortment of other actions such as installing
software (bypassing theMarket application) reboot the device,
and even open an interactive remote shell. Note that adb is
typically not enabled in production devices and must be
enabled by the user.1 For low level experimentation, Android
is open source, the code can be easily obtained and compiled
(Android Source).
Some vendors, such the manufacturer of the G1 phone:
HTC, embrace developers with a rich website providing not
only documentation but also pre-built tools and even phone
images (HTC developer center). Contrarily, other manufac-
tures never release such information to the public and protect
phone images closely as coveted intellectual property.
For the rest of this paper we assume a more strict case
when the device is “obstructed” via a screen lock mechanism
as defined in Jansen and Ayers (2007). While the technique
described in the paper will work equally as well on an
“unobstructed” device and is capable of a more comprehen-
sive collection, it may be deemed simpler to simply enable adb
on an unobstructed device and perform an unprivileged,
logical collection. Such a collection would strictly contain less
information due to the inaccessibility of unallocated storage
1 via Settings e Applications e Development e USB Debugging,or similar.
d i g i t a l i n v e s t i g a t i o n 8 ( 2 0 1 1 ) S 1 4eS 2 4 S15
-
locations and permissions enforced by the running Android
system (the adb process does not executewith root privileges).
4. Collection objectives
Since our data collection is intended for forensics purposes,
we must consider several constraints and desirable qualities
on the technique. Ideally, an “exact copy” of the device and it’s
data can be obtained, even though by simply interacting with
the device we alter it’s state somewhat2 we strive to obtain as
close to an exact copy as possible (Dobell). Here we enumerate
desirable criteria for the collection process.
� Data Preservation. Data storage locations that house userdata are of primary importance, as this type of data is likely
to be more valuable to investigation than system files
common to many devices. This view does not completely
discount the collection of storage less likely to house user
data, it only places more value on user data. The ability to
capture system information may, in fact, be very useful in
tandem with user data. Though quite likely less common,
particular cases may dictate that cache information, firm-
ware and kernel crash information be critical. Again, ideally,
everything can be collected as an “exact copy.”
� Atomic Collection. Data should be collected as atomically aspossible with respect to the device being collected. If
a device is currently executing instructions and manipu-
lating storage the collected state may not be a valid.
Consider a disk that is defragmenting while an external
process is copying the same disk. Given that the copy will
take some amount of time, it’s perfectly reasonable that
a file is moved by defragmentation to a cluster that has
already been copied by the external process. In this case, the
copywill not contain the file! A simple solution is to copy the
disk while no other actions are being performed to the disk.
Similar situationsmay arise relating to the validity of the file
systemwhenmeta information such as allocation tables are
altered during the copy process.
� Correctness. In relation to the atomicity criteria mentionedabove, there is an obvious need for correctness. Even given
the ability to copy atomically with respect to the device, the
data must be copied correctly. Software must truly copy the
data from source to destination and integrity must be
preserved in transit.
� Determinism. The process must be repeatable so that thepractitioner has an expectation that the process will indeed
collect the data in question. Subsequent collections on the
same device, in the same state, should ideally produce
identical results.
� Usability. The processmust be usable, and occur in a feasibleamount of time.
Depending on the hardware characteristics of the device,
other precautions may be required. For example, somehow
jamming or otherwise preventing a mobile phone from
receiving network communication may be desirable as
incoming data will certainly change the state of the phone,
potentially deleting valuable data.
5. Collection process
Our technique repurposes the recovery partition and associ-
ated recovery mode of an Android device for collection
purposes. For any device, collection is going to be a multi-step
process that requires a collection recovery image. An outline on
how to create such an image is outlined in Section 5.3.
Once a recovery image has been obtained, it is flashed to
the device using the device specific instructions such as those
outlined in Section 6 below. After a collection recovery image
is loaded on the device, the device is rebooted into recovery
mode and connected to a computer that has adb (from the
Android SDK) installed. The adb program can then be used to
verify that the device is connected./adb devices), and remotely
execute programs from the recovery image now executing on
the device (./adb shell).
At this point the recommended procedure for collection is
to port forward TCP ports from the device using adb, start
a receiving process on the computer and transfer data from
the storage devices to the local device TCP port using a data
dumping utility and a simple program that writes the output
of the data dumping utility to a socket. The TCP transfer
software written by the authors also calculates an integrity
hash as data are written to the socket. The display of an
integrity hash allows for the verification of correct transfer by
checking the hash displayed in the adb shell with one calcu-
lated independently on the collected image on the computer.
The data dumping utility employed depends somewhat on
the characteristics of the device. Many Android based devices
utilize Memory Technology Devices (MTD). The MTD system is
“an abstraction layer for raw flash devices” (MTD) that allows
software to utilize a single interface to access a variety of flash
technologies. For MTD devices, nand dump3 can be used to
collect NAND data independent of the higher-level filesystem
deployed on the memory. For devices that do not employ MTD
other collection techniquesmustbeemployed. Forexample, the
dd utility can be used to copy data. It is also important to note
that not all data are necessarily stored in onboardmemory. It is
very common for Android devices to support one or more
SDcards. Not only can the user elect to store some applications
and data on such a device, somemanufacturers may choose to
store the entire user data partition on this media.
5.1. Android partitioning
Android devices typically consist of several partitions typi-
cally mapped to MTD devices. Exact partitioning schema
depends upon vendor implementation, but a typical scheme
can be found in Table 1. There are typically six partitions
found on Android devices the most common being system,
user data, cache, boot, and recovery. As also seen in the table,
many Android based devices utilize the YAFFS2 (Yet Another2 In Farmer and Venema (2005) this is dubbed the “Heisenberg
principle of data gathering in systems” after the famous physicistand associated principle.
3 A NAND dump utility written in 2000 by David Woodhouseand Steven Hill.
d i g i t a l i n v e s t i g a t i o n 8 ( 2 0 1 1 ) S 1 4eS 2 4S16
-
Flash File System 2) file systemwhich was designed for use on
flashmemory. Newer devicesmay be found to utilize the EXT4
file system (Paul, 2010). As one may expect, the “bootimg”
denoted in Table 1 is a “bootable image” which is detailed
further in Section 5.3. SD card locations, sometimes marketed
as “internal” or “external” are both typically identified by/dev/
block/mmcblkXpYwhere X is the card ID and Y is the partition
ID on the card. The SD card device is typically mounted to/
sdcard or/mnt/sdcard.
Those interested in exploring the forensic analysis on
Android devices will likely be most interested in the user data
and system partitions. It is important to note that during
normal operation, no user data is stored in the recovery
partition, so corruption or overwriting of data in this partition
is unlikely to change content on the device that may subse-
quently be relied upon in court. The recovery partition, and
associated recovery boot mode, are critical to the collection
technique explored in this paper.
5.2. To root or not to root
Some have suggested via presentations (Hoog, 2009), blogs
posts (Computer forensics), or mobile phone “modding”
forums (Android Rooting) methods of data collection that
require “rooting” a device. Rooting a device typically involves
exploiting a security vulnerability (which is typically device
and software version dependent) with the intention of
installing unsupported software on the device. The motiva-
tion for rooting a device ranges from ideological desire to have
control over the device that the user owns, to circumventing
carrier specific controls preventing the use of particular soft-
ware, to upgrading to a more recent version of Android than
the carrier currently supports (some carriers have very long
update cycles), and many other reasons not mentioned here.
“Rooting” a device for the purpose of forensic collection is not
exemplary for several reasons, among them:
� Rooting a device typically leverages a software flaw oftenparticular to specific model and software versioning on
a device. If the device is locked, an investigator may not be
able to verify software versions running on the device. The
inability to verify software versioning decreases the
changes of successfully rooting the device and increases the
chances of inflicting damage upon the device and/or the
data to be collected.
� Rooting the device alters portions of the device that maystore user data. If avoidable, collection methods should not
change the content on the device that “may subsequently be
relied upon in court (Jansen and Ayers, 2007)”. In some areas
of digital forensics this is unavoidable, for example collec-
tion of memory from a running machine (Vidas, 2006). In
instances where collection can be performed without
modifying the data to be collected, collection should be done
in such a manner (Farmer and Venema, 2005).
� Rooting a device undermines Android’s security model. Arooteddeviceoftenpermits easyescalationofprivilege.With
the device in a normal operating mode, the combination of
easy privilege escalation with typical app execution and
general network access can lead to malicious remote code
executionas shownwith the iPhone in (Reisinger,Nov. 2009).
5.3. Recovery partition
Personal computers commonly allow users to configure BIOS
boot time password passwords or even stronger security with
Trusted Platform Modules (TPM) now available on an esti-
mated 250 million systems (T.C. Group, 2009). The Android
framework is commonly deployed on small devices such as
mobile phones, tablets, and televisions which do not enjoy
any kind of protected boot. Android devices ship with a parti-
tioning scheme similar to those in Table 1 including a recovery
partition. The recovery partition has special properties
ostensibly used for recovery purposes. By booting a device into
“recovery mode,” the normal boot process is circumvented
and the boot target is the bootimg currently loaded in the
recovery partition. As seen Fig. 1, common features found in
a manufacturer installed recovery image include, wiping user
data and updating the device. Similar to the normal operating
mode, the factory recovery mode image commonly does not
support adb, but also does not commonly enable the RF
component of a device. Here we begin to craft a special
collection oriented recovery bootimg. Such a bootimg will
have unfettered access to memory not inhibited by access
control as in Kim et al. (2007) or varied memory access capa-
bilities as in Al-Zarouni (2007).
An Android bootimg consists of header, a kernel, ram disk
(initrd) and optional secondary image each page aligned. The
bootimg header, defined in bootimg.h found in the Android
Source, contains the magic signature “ANDROID!”, an ID field4
and meta information about the size and memory locations in
which to load the kernel, ram disk and secondary image.
The ram disk portion of the bootimg is a compressed (gzip
or lmza) cpio file containing an initial ram disk (initrd) direc-
tory structure for the kernel. This directory structure can be
Table 1 e Partition information typical of an Androiddevice.
Path Name FileSystem
Mountpoint
Description
/dev/mtd/mtd0 pds yaffs2 /config Configuration data
/dev/mtd/mtd1 misc e N/A Memory
Partitioning data
/dev/mtd/mtd2 boot bootimg N/A Bootable
(typical boot)
/dev/mtd/mtd3 recovery bootimg N/A Bootable
(recovery mode)
/dev/mtd/mtd4 system yaffs2 /system System files,
Applications,
Vendor additions,
Read-Only,
/dev/mtd/mtd5 cache yaffs2 /cache Cache Files
/dev/mtd/mtd6 user data yaffs2 /data User data
(Applications)
/dev/mtd/mtd7 kpanic e N/A Crash Log
4 The ID field is typically a SHA1SUM of the kernel, kernel size,ram disk, ram disk size, secondary image, and secondary imagesize used to uniquely identify the bootimg based solely on thefirst page of the bootimg.
d i g i t a l i n v e s t i g a t i o n 8 ( 2 0 1 1 ) S 1 4eS 2 4 S17
-
amended to include additional binary programs in order to
modify the behavior of the bootimg. In addition to adding the
desired binary programs, some other modifications from
a standard bootimg are necessary.
A practitioner, using an existing bootimg as a collection
tool, may never intend to create a bootimg from scratch. In
this case it is still useful to understand the file structure of
a bootimg in order to verify the operation of the collection tool.
Using common Linux commands a bootimg may be split into
its respective pieces and the ram disk may be ungzipped and
“unCPIOed” for inspection. Once a custom, collection oriented
recovery image has been created or obtained it must be
“flashed” onto the device.
Proof of concept bootimgs created by the authors include
modifying the ram disk by modifying the default.prop proper-
ties file, init.rc file and addingadbd, su, nanddump, and custom
transfer binaries. The primary modifications to the properties
file include enabling ro.debuggable and persist.service.adb.en-
able to enable the use of adb. The init.rc file is used to start the
adbd service (on property:persist.service.adb.enable¼1 startadbd) andsetpermissivepermissionson theaddedbinaries and
MTD devices to be collected.
When possible, we prefer the use of nand dump over other
data duplication software since nand dump was designed
specifically to dump the contents of flash devices. This allows
for as close to a physical duplication as possible, containing
more information than a file system level copy. There are
currently no procedures for garnering valuable information
from this extra data, but it is prudent to collect such data in
anticipation of future techniques. Having this extra data
causes no harm as subsequent analysis of the collected data
will still provide at least as much information as a file system
level copy would have provided.
It is desirable to create a small set of collection recovery
images that together support a wide range of devices. To this
end, the compilation process for a recovery image should target
the older processor types so that the resulting applications will
correctly execute on all backward compatible devices.
6. Device specifics
While the Android framework does provide some desirable
common qualities, it is unlikely that a single all-
encompassing bootimg that properly handles all devices can
be created. Even though the Android framework provides
a common interface at the application level and presents
a familiar UI to the user, the devices are still unique at the
hardware level employing different connectors, processors,
etc. For this reason bootimg’s that support several, similar
devices can be created, but the creation of a universal bootimg
is improbable.
Devices may be booted into different modes which are
invoked via hardware key combinations during boot. These
modesallowspecial functionality suchas theability toclean the
phone of user data or flash new software onto the device. Each
device has unique physical characteristics: number of physical
keys, touchscreen,SDcardslot,physicalkeyboard,etc.which in
turn cause the key combinations required for special modes to
not be uniform across devices. Table 2 shows key combinations
and their associated modes for a small set of devices.
Fig. 1 e Recovery Mode Motorola Droid booted to a typical
recovery image.
Table 2 e Boot modes for select Android devices.
Device Mode Keycombination
Description
Motorola Droid Flash D-Pad UP
þ powerMode that allows
flashing via
RSD Lite
Motorola Droid Flash camera
þ powerMode that allows
flashing via
RSD Lite
Motorola Droid Recovery power þ x Boot to recoverypartition (then
camera þ volup todisplay menu)
HTC G1 Flash power þ back Fastboot modeHTC G1 Flash power þ camera Boot mode
(switch to
fastboot via ‘back’)
HTC G1 Recovery power þ home Boot to recoverypartition
Samsung
Captivate
Flash volup þ voldn(then insert USB)
Boot to Samsung
“force download”
mode
Samsung
Captivate
Recovery power þ volupþ voldn
Boot to recovery
partition
Samsung
Galaxy Tab
Flash power þ voldn Boot to Samsung“force download”
mode
Samsung
Galaxy Tab
Recovery power þ volup Boot to recoverypartition
d i g i t a l i n v e s t i g a t i o n 8 ( 2 0 1 1 ) S 1 4eS 2 4S18
-
The method of “flashing” the recovery partition will likely
always be somewhat unique to different devices, though
manufactures tendto re-usesimilar sequencesacrossaproduct
line. A lab that regularly encounters Android devicesmay wish
tocompileacomprehensive listofkeycombinationsandmodes
similar to existing lists for computer BIOS’ (Backdoor). Flashing
adevice isadestructiveprocess, thestoragearea that is “flashed
to” will be irreversibly overwritten. The remainder of this
section explores “flashing”methods for several devices, in each
case only the recovery partition is overwritten with all other
existing partition and data within remaining intact.
The low level boot loader and software that powers
flashing modes is designed for specific uses typically reserved
for the carrier ormanufacturer, not the end user. Thesemodes
are not particularly robust in features and the user should
exercise caution when accessing a device in this manner.
Improper use can easily result in hindering data collection.
Similarly, software features like USB negotiation are not as
robust as other devices with which the user may be familiar.
Using flashing software in VMWare, while possibly desirable,
is not possible due to the device not properly negotiating with
virtual machine’s virtual USB controller.
6.1. Example: Motorola Droid
The Motorola Droid is a Verizon device with screen that slides
to the right revealing a full QWERTY keyboard and D-Pad
(Directional Pad). In addition to the keyboard the Droid has
volume up/down, power and a camera button around the
outside edge of the device. The Droid has a build in microSD
card, a port for an additional microSD card, and has an ARM
Cortex A8 550 mHz processor. It originally shipped with
Android 2.0 (DROID).
The Droid has a special flash boot mode (shown in Fig. 2)
that can be entered by holding the camera button while
powering on the device. This special boot mode allows
flashing of the device’s recovery partition. Motorola RSD Lite
software (Windows only) can facilitate flashing of the recovery
partition, but does not accept a bootimg file in it’s native form:
an RSD Lite compatible.sbf file containing the bootimg must
be created. An .sdf file is comprised5 of a header with file
magic and a count of the parts in the file, each part also
contains a header specifying the destination address, flash
size, checksum and, of course, the image to flash, in this case
the bootimg. Once an .sbf file containing the bootimg has been
created the Droid, booted in flash mode, can be attached to
a computer running RSD Lite and the device can be flashed
with the .sbf file (and thus the contained bootimg). While not
strictly required, re-booting the Droid into flash mode while
connected the RSD Lite will allow RSD Lite to register a success
message.
6.2. Example: HTC G1
The HTC G1 (shown in Fig. 3) has a Qualcomm MSM7210A
528 mHz processor (ARMv6 instruction set), a full QWERTY
keyboard, and an external microSD card port. In addition to
the keyboard the G1 has a track ball, and physical volume up/
down, camera, send, home, menu, back, and end/power
buttons. One hardware feature germane to collection is that
the G1’s has a special HTC USBþ Audio port (ExtUSB) in lieu ofthe more common microUSB port (HTC products). If the
special ExtUSB cable that shipped with the device is not
available, a standard miniUSB cable can be used for both
recovery and fastboot modes.
The G1 employs a boot method called fastboot (shown in
Fig. 4). Fastboot requires a fastboot compatible boot loader and
a fastboot program on a personal computer. The fastboot
program can be compiled from Android Source or pre-
compiled versions for Windows, Linux and OSX can conve-
niently be obtained via HTC’s developer website (HTC
developer center). After booting into fastboot mode and con-
necting the device to the computer via the special HTC cable
or miniUSB,6 the fastboot program can be used to enumerate
devices attached (./fastboot devices) and to flash an image to
the device (./fastboot flash recovery bootimg_filename). Fast-
boot also allows directly booting to a kernel and ram disk
located on a connected computer (./fastboot boot kernel_file-
name ramdisk_filename), which may be slightly preferred over
flashing the recovery image as the existing recovery image on
the device remains intact. For consistency in the collection
process we suggest maintaining a set of recovery images and
flashing the recovery image for every collection.
Fig. 2 e Flash Mode Motorola Droid booted to flash mode.
5 The exact structure of an .sbf is not extremely important here,software that can create a well formed .sbf file when provideda bootimg can be found at http://www.ece.cmu.edu/tvidas/.
6 The author observed that the G1 in fastboot mode was notrecognized correctly when connected to a USB3 port. Readersmay wish to specifically use USB2 ports.
d i g i t a l i n v e s t i g a t i o n 8 ( 2 0 1 1 ) S 1 4eS 2 4 S19
-
6.3. Example: Samsung Captivate
The Captivate is part of Samsung’s Galaxy S line of mobile
phones, sold by AT&T (shown in Fig. 5). The Captivate has
a larger touch screen than the G1 and Droid, but it also has no
QWERTY keyboard, in fact it only has 3 edge buttons: power,
volume up and volume down. A standard microUSB port can
be found at the top of the device behind a plastic sliding cover.
In addition to typical internal hardware: 1 GHz ARMCortex A8,
512 MB of RAM, and 16 GB internal SD card, the Captivate also
has a hardware graphics core.
Unlike the Droid and the G1 the Captivate employs Sam-
sung’s proprietary RFS (Robust FAT File System) and Sam-
sung’s OneNAND memory (L. Flash and samsung, 2008). This
requires Android to load kernel modules to support RFS and
makes later analysis more difficult as there is no available
software for parsing RFS related data. Instead of using MTD
devices, the kernel modules create several STL (Sector
Translation Layer) and BML7 (Block Management Layer) block
devices (/dev/block/). A partition table showing typical use of
BML devices is shown in Table 3. This type of device compli-
cates collection slightly as it is not possible to read from some
of the higher-layer STL devices, and collecting all of BML
devices, while recommended, is not particularly useful as
there is no way to analyze the resulting image.
Much like the Droid, the Captivate has a special flashmode
(also called download mode, shown in Fig. 6), that can be
entered by holding both volume buttons and then connecting
the device to a computer for flashing. In this mode the phone
can be flashed using an open source tool called Heimdall
(Dobell) or the closed source software Odin (Windows only).
When using Odin prior to flashing, source files must be placed
in a .tar8 archive.Heimdall doesnot requirefiles to bepackaged
as a .tar and is compatible withWindows, Linux andMac OSX.
In addition to the more complex partitioning and file
system structure, Samsung devices do not employ the typical
bootimg structure in the recovery partition. The recovery
image is an initramfs image. Initramfs, available in 2.6.x Linux
kernels, is a root file system that is actually embedded into the
kernel. The details of creating an image suitable for flashing to
a device are slightly more complex than described above, but
the same theory applies.
Fig. 3 e Recovery Mode HTC G1 booted to a typical recovery
image.
Fig. 4 e Fastboot Mode HTC G1 booted to fastboot mode.
7 In addition to the BML, the associated STL should also becollected because this is an RFS partition and at this time the STLdata is easier to analyze than the BML. 8 The .tar file must be POSIX 1003.1e1988 (ustar) format.
d i g i t a l i n v e s t i g a t i o n 8 ( 2 0 1 1 ) S 1 4eS 2 4S20
-
The presence of an older Secondary Boot Loader (SBL)9 will
likely not utilize the BML8 recovery partition. Instead the
normal boot mode and recovery mode share the same kernel.
Unfortunately it is very difficult to tell the version of the SBL
without interacting with the device. However the SBL can be
flashed. So if after flashing the collection recovery image, if
recovery mode is not working as expected, one may assume
that an older SBL that does not boot to BML8 is on the device.
The SBL can be flashed using Odin or Heimdall (but it requires
having an exist SBL that is known to work with the device).
Corrupting the SBL will make data collection very difficult
because the device will no longer be able to reach download
mode, as such flashing the SBL should be a last resort.
Note that some devices, such as the Samsung Galaxy Tab
(Tablet), require a special cable. Unlike the HTC ExtUSB where
a miniUSB cable can serve as a substitute, the 30 pin Galaxy
Tab cable that ships with the tablet is the only method of
connecting to the device.
7. Discussion
We have demonstrated a general method for digital forensics
collection on Android devices. Through special boot methods
enabling the use of custom recovery bootimg, data on Android
devices can be collected with very little probability of cor-
rupting user data. Use of the recovery bootimg provides
a consistent, repeatable method of collecting numerous
Android devices without “rooting” the device in normal
operating mode. We feel that this recovery bootimgmethod is
both safer and had less impact to data likely to be useful for
analysis.
Collection recovery images have been created for testing for
thedevicesdetailed inSection6. The collectionprocess involves
calculating integrity hashes at the source and destination
helping ensure the correctness of the collection.Data contained
in the collected images was verified using standard a Linux
distribution with MTD and yaffs2 support (see Appendix I).
Where possible, we employed the use of a NAND dumping tool
which collects more data than typical filesystem copy would
collect. While current analysis techniques do not take advan-
tage of this extra information, future techniques may.
Most devices transfer data at approximately 4.3 MB/s
allowing for full collection to occur in a nominal amount of
Fig. 5 e Recovery Mode Samsung Captivate booted to
a typical recovery image.
Fig. 6 e Download Mode Samsung Captivate booted to
a typical download mode for flashing.
Table 3 e Partition information typical of a Samsungdevice.
Device Name Mount point Description
bml1 boot e Primary boot loader
bml2 pit e Partition map data
bml3 efs /efs Unknown.
bml4 SBL e Secondary boot loader
bml5 download e Download Mode
bml67 param /mnt/.lfs Unknown (lfs)
bml7 kernel N/A kernel þ initramfsbml8 recovery N/A kernel þ initramfsbml97 system /system Typical/system data (RFS)
bml107 dbdata /dbdata dbcache (RFS)
bml117 cache /cache cache (RFS)
bml12 modem e Modem software
9 Prior to approx. Oct 2010.
d i g i t a l i n v e s t i g a t i o n 8 ( 2 0 1 1 ) S 1 4eS 2 4 S21
-
time. Even though the collection is not atomic, execution is
restricted to the recovery partition and other partitions are not
alteredduring collection resulting in an “exact copy” of original
contents for all partitions other then the recovery partition.
Though no user studies have been performed, we feel that
the solution is very approachable and could be adopted by
practitioners. When thought of simply as a collection tool, the
tool can easily be inspected for correctness.
8. Future work
The software installed in collection bootimgs could easily be
extended to further aide the practitioner. Themenu presented
on the screen when a bootimg is executed could have related
menu options such as “transfer data” eliminating the need to
run./adb shell on the collection computer. Similarly integrity
hashes for collected partitions could be displayed on the
device screen. Bymoving this functionality to the device there
is less risk of user error, especially if a single computer is used
to perform collection of several Android devices.
A comprehensive list of boot modes for Android devices,
and associated flashing tools, should be created in order to
have a reference in place prior to the need for collection on
a particular device. Similarly, a comprehensive set of boot-
imgs supporting all Android devices should be created,
maintained and tested.
Acknowledgments
This work is supported in part by CyLab at Carnegie Mellon
under grant DAAD19-02-1-0389 from the Army Research
Office, and by the National Science Foundation under ITR
award CCF-0424422 (Team for Research in Ubiquitous Secure
Technology) and IGERT award DGE-0903659 (Usable Privacy
and Security), as well as a hardware donation by Google Inc.
Appendix I. Analysis of collected images
This appendix is not intended to be a comprehensive method
of analysis, but only to serve as a brief means of verifying the
collected images. Perhaps most important is the preparation
of an analysis system, but even given a complete analysis
system, locations and methods of data interpretation are left
for future research. The methods detailed below are pertinent
to a yaffs2 based device.
Analysis system preparation
System preparation steps must only be performed once per
system. Mostly this consists of installing several packages if
they are not already installed. In Fedora these packages are
titled mtd-utils, sqlite and kernel-devel10 (only if you want to
build the yaffs2 kernel module yourself).
The most difficult part is the compilation of the yaffs2
kernel module, which is needed to mount yaffs2 partitions for
earlier Android devices. To compile your own kernel module
copy the kernel sources to a working directory, apply the
patch supplied with the yaffs2 source,11 ensure MTD devices
are built as modules, and building the modules. A shell log
would look something like:
cp/usr/lib/
../yaffs2/patch-ker.sh
cd
menuconfig
make modules
make M¼fs/yaffs2 modulesdepmod
modprobe yaffs
In menuconfig (or by editing the .config file) you need to
ensure that MTD,MTD_CHAR,MTD_BLOCK,MTD_NAND, and
MTD_NAND_NANDSIM are all set to “m.”
Installing the module may require slightly different syntax
depending on your kernel configuration, such as make M¼fs/yaffs2 install_modules and it may be needed to copy the
compiledmodule to the kernelmodules directory cp fs/yaffs2/
yaffs.ko/lib/modules/kernel/fs/yaffs/yaffs.ko (check permis-
sions to match other directories). Note that you don’t need to
recompile, and use the entire kernel.
Perhaps the easiest method may be to download
a prepared VM from: http://www.ece.cmu.edu/tvidas/.
Analysis Initialization
Analysis initialization must occur each time a device
analyzed. Modules must be loaded:
modprobe mtd
modprobe mtdchar
modprobe mtdblock
modprobe nandsim first_id_byte¼0x20\second_id_byte¼0xac third_id_byte¼0x00\fourth_id_byte¼0x15modprobe yaffs
The nandsim (NAND simulator) options determine the
characteristics of the simulator such as 512 MB device with
2048 byte pages, if you need to specify other sizes see Table 4
for more options.12 Nandsim creates a pair of related devices/
dev/mtd0 and/dev/mdtblock0.
For each collected image, write the data to the virtual MTD
device:
nandwrite-a-o/dev/mtd0 [dumpfile]
Finally mount the mtd device:
10 Make sure you obtain the source for the kernel you intend torun, in Fedora you can look in the/boot directory for config* filesfor all installed kernels.
11 Available at http://www.aleph1.co.uk/yaffs2.12 NAND size and page size vary widely among manufactures,see http://www.linux-mtd.infradead.org/nand-data/nanddata.html for more information.
d i g i t a l i n v e s t i g a t i o n 8 ( 2 0 1 1 ) S 1 4eS 2 4S22
-
mount-t yaffs2-o ro/dev/mtdblock0\
/some/mount/point
Now the partitions can be browsed and easily inspected
from/some/mount/point.
Common mobile phone data locations
It is important to note that Android “smart phones” are more
complex than, and typically contain much more information
than, a standard mobile phone, and the data locations
described here are by not means comprehensive or universal.
Recall that you will need to determine the partition associated
with the original mount point for/data similar to Table 1.
Phone contacts, and call log data can be found at:
/data/data/com.android.providers.\
contacts/databases/contacts2.db
Calendar information:
/data/data/com.android.providers.\
calendar/databases/calendar.db
SMS and MMS messages:
/data/data/com.android.providers.\
telephony/databases/mmssms.db
Gmail and gtalk data:
/data/data/com.google.android.providers.\
gmail/databases/mailstore.cmu.android.\
.db
Each of these databases are SQLite3 databases, the easiest
way to ‘export’ data is via the sqlite3:
sqlite3/data/data/com.android.providers.\
calendar/databases/calendar.db .dump\
>calender_raw.txt
But in many cases it may be more useful to actually
leverage SQL:
sqlite3 com.android.providers.telephony\
/databases/mmssms.db ‘select address,\
person, body, protocol, subject, body\
from sms’
A wealth of information is available in SQLite3 databases
(Android’s method of structured storage) both for packaged
and user installed apps, but analysts should be aware
that apps may elect to store information using proprietary
methods.
r e f e r e n c e s
Al-Zarouni M. Introduction to mobile phone flasher devices andconsiderations for their use in mobile phone forensics. InProceedings of the 5th Australian digital forensics conference,Dec. 2007.
Android developers. http://developer.android.com/.[Q] Android rooting [Archive] e xda-developers. http://forum.xda-
developers.com/archive/index.php/t-866622.html.Android most popular operating system in U.S. among recent
smartphone buyers nielsen wire, http://blog.nielsen.com/nielsenwire/online_mobile/android-most-popular-operating-system-in-u-s-among-recent-smartphone-buyers/; Oct. 2010.
Ayers R, Jansen W, Moenner L, Delaitre A. Cell phone forensictools: an overview and analysis update, NIST interagencyreport (IR) 7387; March 2007.
Backdoor and default passwords. http://www.freelabs.com/wwhitis/-security/backdoor.html.
Bader M, Baggili I. iPhone 3GS forensics: logical analysis usingapple iTunes backup utility. Small Scale Digital DeviceForensics Journal Sept. 2010;4(1).
Burdach M. Finding digital evidence in physical memory. LasVegas, NV: Black Hat; 2006.
Casadei F, Savoldi A, Gubian P. Forensics and SIM cards: anoverview. International Journal of Digital Evidence 2006;5(1).
Computer forensicse forumse general discussionemobile phoneforensics e Android/root access. http://www.forensicfocus.com/index.php?name¼Forums&-file¼viewtopic&t¼5779.
Distefano A, Me G. An overall assessment of mobile internalacquisition tool. Digital Investigation 2008;5:S121e7.
Dobell A. Heimdall website. https://github.com/Benjamin-Dobell/Heimdall.
DROID by motorola e Android phone e tech specs e motorolamobility, inc. USA. http://www.motorola.com/Consumers/US-EN/Consumer-Product-and-Services/Mobile-Phones/ci.Motorola-DROID-US-EN.alt.
Factsheet: The U.S. media universe nielsen wire. http://blog.nielsen.com/nielsenwire/online_mobile/factsheet-the-u-s-media-universe/.
Fairbanks K, Atreya K, Owen H. BlackBerry IPD parsing for opensource forensics. IEEE; 2009. pp. 195e199.
Farmer D, Venema W. Forensic discovery. Addison-Wesley; 2005.Hoog A. Android forensics. Mobile Forensics World 2009;29.Htc e developer center. http://developer.htc.com/adp.html.Htc e products e T-Mobile g1-specification. http://www.htc.com/
www/product/g1/specification.html.Jansen W, Ayers R. Guidelines on cell phone forensics. NIST
Special Publication 2007;800:101.Kim K, Hong D, Chung K, Ryou J. Data acquisition from cell phone
using logical approach. Proceedings of the World Academy ofScience, Engineering and Technology 2007;26.
L. Flash Software Group, Samsung Electronics Co. Linustoreiisamsung onenand flash linux device driver e gpl compliance,July 2008.
Table 4 e Nandsim options.
Desired Size Byte Specifier
NAND(MB)
Page (b) 1 2 3 4
16 512 0 � 20 0 � 33 e e32 512 0 � 20 0 � 35 e e64 512 0 � 20 0 � 36 e e128 512 0 � 20 0 � 78 e e256 512 0 � 20 0 � 71 e e64 2048 0 � 20 0 � a2 0 � 00 0 � 15128 2048 0 � ec 0 � a1 0 � 00 0 � 15256 2048 0 � 20 0 � aa 0 � 00 0 � 15512 2048 0 � 20 0 � ac 0 � 00 0 � 151024 2048 0 � ec 0 � d3 0 � 51 0 � 95
d i g i t a l i n v e s t i g a t i o n 8 ( 2 0 1 1 ) S 1 4eS 2 4 S23
-
Meeker M, Devitt S, Wu L. Ten questions Internet execs shouldask & answer; Nov. 2010. San Fransisco, CA.
Mokhonoana P, Olivier M. Acquisition of a symbian smartphoneâ�s content with an on-phone forensic tool; 2007.
Memory technology device (mtd) subsystem for linux. http://www.linux-mtd.infradead.org/faq/nand.html.
Murphy C. The fraternal clone method for CDMA cell phones.Small Scale Digital Device Forensics Journal June 2009;3(1).
Paul R. Ext4 filesystem hits Android, no need to fear data loss,http://arstechnica.com/open-source/news/2010/12/ext4-filesystem-hits-android-no-need-to-fear-data-loss.ars; 2010.
Reisinger D. Another iPhone worm, but this one is serious thedigital home e CNET news. CNET News; Nov. 2009.
Sarma D, Vaddagiri S. Debugging multiple threads or processes;Mar. 2002.
Shin W, Kiyomoto S, Fukushima K, Tanaka T. Towards formalanalysis of the Permission-Based security model for Android.IEEE; 2009. pages 87e92.
T.C. Group. Trusted Computing Group (TCG) timeline; Apr. 2009.Thing V, Ng K, Chang E. Live memory forensics of mobile phones.
Digital Investigation 2010;7:S74e82.Vidas T. The acquisition and analysis of random access memory.
Journal of Digital Forensic Practice 2006;1(4):315e23.Welcome to Android open source. http://source.android.com/.Willassen S. Forensics and the GSM mobile telephone system.
International Journal of Digital Evidence 2003;2(1):1e17.Williamson B, Apeldoorn P, Cheam B, McDonald M. Forensic
analysis of the contents of nokia mobile phones; 2006. page 36.Yates I. Practical investigations of digital forensics tools for
mobile devices. ACM; 2010. pages 156e162.
d i g i t a l i n v e s t i g a t i o n 8 ( 2 0 1 1 ) S 1 4eS 2 4S24
Toward a general collection methodology for Android devices1 Introduction2 Related work3 Background4 Collection objectives5 Collection process5.1 Android partitioning5.2 To root or not to root5.3 Recovery partition
6 Device specifics6.1 Example: Motorola Droid6.2 Example: HTC G16.3 Example: Samsung Captivate
7 Discussion8 Future work Acknowledgments Appendix I Analysis of collected imagesAppendix I Analysis system preparationAppendix I Analysis InitializationAppendix I Common mobile phone data locations
References