
Cloud computing with Amazon Web Services, Part 2: Amazon Simple Storage Service (S3)

Reliable, flexible, and inexpensive storage and retrieval of your data

Skill Level: Introductory

Prabhakar Chaganti ([email protected]), CTO, Ylastic, LLC.

19 Aug 2008

In this series, learn about cloud computing using Amazon Web Services. Explore how the services provide a compelling alternative for architecting and building scalable, reliable applications. This article delves into the highly scalable and responsive services provided by Amazon Simple Storage Service (S3). Learn about tools for interacting with S3, and use code samples to experiment with a simple shell.

Amazon Simple Storage Service

Part 1 of this series introduced the building blocks of Amazon Web Services and explained how you can use this virtual infrastructure to build Web-scale systems.

In this article, learn more about Amazon Simple Storage Service (S3). S3 is a highly scalable and fast Internet data-storage system that makes it simple to store and retrieve any amount of data, at any time, from anywhere in the world. You pay for the storage and bandwidth based on your actual usage of the service. There is no setup cost, minimum cost, or recurring overhead cost.

Amazon provides the administration and maintenance of the storage infrastructure, leaving you free to focus on the core functions of your systems and applications. S3 is an industrial-strength platform that is readily available for your data storage needs. It's great for:


• Storing the data for your applications.

• Personal or enterprise backups.

• Quickly and cheaply distributing media and other bandwidth-guzzling content to your customers.

Valuable features of S3 include:

Reliability
It is designed to tolerate failures and repair the system very quickly with minimal or no downtime. Amazon provides a service level agreement (SLA) to maintain 99.99 percent availability.

Simplicity
S3 is built on simple concepts and provides great flexibility for developing your applications. You can build more complex storage schemes, if needed, by layering additional functions on top of S3 components.

Scalability
The design provides a high level of scalability and allows an easy ramp-up in service when a spike in demand hits your Web-scale applications.

Inexpensive
S3 rates are very competitive with other enterprise and personal data-storage solutions on the market.

The three basic concepts underpinning the S3 framework are buckets, objects, and keys.

Buckets

Buckets are the fundamental building blocks. Each object that is stored in Amazon S3 is contained within a bucket. Think of a bucket as analogous to a folder, or a directory, on the file system. One of the key distinctions between a file folder and a bucket is that each bucket and its contents are addressable using a URL. For example, if you have a bucket named "prabhakar," then it can be addressed using the URL http://prabhakar.s3.amazonaws.com.

Each S3 account can contain a maximum of 100 buckets. Buckets cannot be nested within each other, so you can't create a bucket within a bucket. You can affect the geographical location of your buckets by specifying a location constraint when you create them. This will automatically ensure that any objects that you store within that bucket will be stored in that geographical location. At this time, you can locate your buckets in either the United States or the European Union. If you do not specify a location when creating the bucket, the bucket and its contents will be stored in the location closest to the billing address for your account.

Bucket names need to conform to the following S3 requirements:

• The name must start with a number or a letter.

• The name must be between 3 and 255 characters.

• A valid name can contain only lowercase letters, numbers, periods, underscores, and dashes.

• Though names can have numbers and periods, they cannot be in the IP address format. You cannot name a bucket 192.168.1.254.

• The bucket namespace is shared among all buckets from all of the accounts in S3. Your bucket name must be unique across all of S3.

Buckets that will contain objects to be served with addressable URLs must conform to the following additional S3 requirements:

• The name of the bucket must not contain any underscores.

• The name must be between 3 and 63 characters.

• The name cannot end with a dash. For example, myfavoritebucket.com- is invalid.

• There cannot be dashes next to periods in the name. my-.bucket.com isinvalid.

You can use a domain naming convention for your buckets, such as media.yourdomain.com, and thus map your existing Web domains or subdomains to Amazon S3. The actual mapping is done by adding a DNS CNAME entry that points back to S3. The big advantage of this scheme is that you can use your own domain name in the URLs you hand out for downloading files. The CNAME mapping is responsible for translating between your domain name and the S3 address for your bucket. For example, http://media.yourdomain.com.s3.amazonaws.com becomes the more friendly URL http://media.yourdomain.com.
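As an illustration, a CNAME record of the form media.yourdomain.com IN CNAME media.yourdomain.com.s3.amazonaws.com completes such a mapping (the domain is hypothetical, and the exact zone-file syntax depends on your DNS provider).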

Objects

Objects contain the data that is stored within the buckets in S3. Think of an object as the file that you want to store. Each object that is stored is composed of two entities: data and metadata. The data is the actual thing being stored, such as a PDF file, a Word document, a video file, and so on. The stored data also has associated metadata describing the object. Some examples of metadata are the content type of the object being stored, the date the object was last modified, and any other metadata specific to you or your application. The metadata for an object is specified by the developer as key-value pairs when the object is sent to S3 for storage.


Unlike the limitation on the number of buckets, there are no restrictions on the number of objects. You can store an unlimited number of objects in your buckets, and each object can contain up to 5GB of data.

The data in your publicly accessible S3 objects can be retrieved by HTTP, HTTPS, or BitTorrent. Distribution of large media files from your S3 account becomes very simple when using BitTorrent; Amazon will not only create the torrent for your object, it will also seed it!
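For a publicly readable object, requesting the object's URL with ?torrent appended (for example, http://prabhakar.s3.amazonaws.com/my_favorite_video.mov?torrent) returns a torrent file that BitTorrent clients can use to download the data.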

Keys

Each object stored within an S3 bucket is identified using a unique key. This is similar in concept to the name of a file in a folder on your file system. The file name within a folder on your hard drive must be unique. Each object inside a bucket has exactly one key. The name of the bucket and the key are together used to provide the unique identification for each object that is stored in S3.

Every object within S3 is addressable using a URL that combines the S3 service URL, the bucket name, and the unique key. If you store an object with the key my_favorite_video.mov inside the bucket named prabhakar, that object can be addressed using the URL http://prabhakar.s3.amazonaws.com/my_favorite_video.mov.

Though the concepts are simple, as shown in Figure 1, buckets, objects, and keys together provide a lot of flexibility for building your data storage solutions. You can leverage these building blocks to simply store data on S3, or use their flexibility to layer and build more complex storage and applications on top of S3 that provide additional functions.

Figure 1. Conceptual view of S3


Access logging

Each S3 bucket can have access log records that contain details on each request for a contained object. The log records are turned off by default; you have to explicitly enable the logging for each Amazon S3 bucket that you want to track. An access log record contains a lot of detail about the request, including the request type, the resource requested, and the time and date that the request was processed.

The logs are provided in the S3 server access log format but can be easily converted into Apache combined log format. They can then be parsed by any of the open source or commercial log analysis tools, such as Webalizer, to give you a human-readable report and pretty graphs on request. The reports can be very useful to gain insight into the customer base that's accessing your files. See Resources for tools you can use for easier visualization of the S3 log records.

Security

Each bucket and object created in S3 is private to the user account that created it. You have to explicitly grant permissions to other users and customers for them to be able to see the list of objects in your S3 buckets or to download the data contained within them. Amazon S3 provides the following security features to protect your buckets and the objects in them.

Authentication
Ensures that the request is being made by the user that owns the bucket or object. Each S3 request must include the Amazon Web Services access key that uniquely identifies the user.

Authorization
Ensures that the user trying to access the resource has the permissions or rights to the resource. Each S3 object has an access control list (ACL) associated with it that explicitly identifies the grants and permissions for that resource. You can grant access to all Amazon Web Services users or to a specific user identified by e-mail address, or you can grant anonymous access to any user.

Integrity
Each S3 request must be digitally signed by the requesting user with an Amazon Web Services secret key. On receipt of the request, S3 will check the signature to ensure that the request has not been tampered with in transit.

Encryption
You can access S3 through the HTTPS protocol to ensure that the data is transmitted through an encrypted connection.

Nonrepudiation
Each S3 request is time-stamped and serves as proof of the transaction.

Every REST request made to S3 must go through the following standard steps, which are essential to ensuring security (a minimal sketch of this signing flow follows the list):

• The request and all needed parameters must be assembled into a string.

• Your Amazon Web Services secret access key must be used to create a keyed-HMAC (hash-based message authentication code) signature of the request string.

• This calculated signature is itself added as a parameter on the request.

• The request is then forwarded to Amazon S3.

• Amazon S3 will check to see if the provided signature is a valid keyed-HMAC hash of the request.

• If the signature is valid, then (and only then) Amazon S3 will process the request.
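The JetS3t library used later in this article performs this signing for you, but the core of the flow is small enough to sketch. The following is a minimal illustration, not the library's implementation: the stringToSign value is a simplified stand-in for the canonical request string that S3 actually expects (HTTP verb, content headers, date, and resource), and the key and resource names are hypothetical.

import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;
import java.util.Base64;

public class S3RequestSigner {

    // compute the keyed-HMAC (SHA-1) signature that is attached to an S3 REST request
    public static String sign(String awsSecretKey, String stringToSign) throws Exception {
        Mac hmac = Mac.getInstance("HmacSHA1");
        hmac.init(new SecretKeySpec(awsSecretKey.getBytes("UTF-8"), "HmacSHA1"));
        byte[] signature = hmac.doFinal(stringToSign.getBytes("UTF-8"));
        // S3 expects the signature to be Base64-encoded before it is added to the request
        return Base64.getEncoder().encodeToString(signature);
    }

    public static void main(String[] args) throws Exception {
        // hypothetical request string: verb, blank MD5 and content type, date, resource
        String stringToSign = "GET\n\n\nTue, 19 Aug 2008 12:00:00 GMT\n/prabhakar/my_favorite_video.mov";
        System.out.println("Signature: " + sign("my_aws_secret_key", stringToSign));
    }
}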

Pricing

The charges for S3 are calculated based on three criteria, which differ based on the geographic location of your buckets.

• The total amount of storage space used, which includes the actual size of your data content and the associated metadata. The unit used by S3 for determining the storage consumed is the GB-Month. The number of bytes of storage used by your account is computed every hour, and at the end of the month it's converted into the storage used for the month. The table below shows pricing for storage.

Location         Cost
United States    $0.15 per GB-Month of storage used
Europe           $0.18 per GB-Month of storage used

• The amount of data or bandwidth transferred to and from S3. This includes all data that is uploaded to and downloaded from S3. There is no charge for data transferred between EC2 and S3 buckets that are located in the United States. Data transferred between EC2 and European S3 buckets is charged at the standard data transfer rate, as shown below.

Location         Cost
United States    $0.100 per GB - all data transfer in
                 $0.170 per GB - first 10TB/month data transfer out
                 $0.130 per GB - next 40TB/month data transfer out
                 $0.110 per GB - next 100TB/month data transfer out
                 $0.100 per GB - data transfer out/month over 150TB
Europe           $0.100 per GB - all data transfer in
                 $0.170 per GB - first 10TB/month data transfer out
                 $0.130 per GB - next 40TB/month data transfer out
                 $0.110 per GB - next 100TB/month data transfer out
                 $0.100 per GB - data transfer out/month over 150TB

• The number of application programming interface (API) requests performed. S3 charges a fee for each request made using the interface, such as creating objects, listing buckets, listing objects, and so on. There is no fee for deleting objects and buckets. The fees are once again slightly different based on the geographic location of the bucket. The following table shows pricing for API requests.

Location         Cost
United States    $0.01 per 1,000 PUT, POST, or LIST requests
                 $0.01 per 10,000 GET and all other requests
                 No charge for delete requests
Europe           $0.012 per 1,000 PUT, POST, or LIST requests
                 $0.012 per 10,000 GET and all other requests
                 No charge for delete requests

Check Amazon S3 for the latest price information. You can also use the AWS Simple Monthly Calculator for calculating your monthly usage costs for S3 and the other Amazon Web Services.
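As a quick worked example using the 2008 US rates above (a hypothetical workload, for illustration only): an account that stores a steady 50GB for a month, transfers 100GB out, and makes 20,000 PUT requests would be billed roughly 50 x $0.15 + 100 x $0.17 + 20 x $0.01 = $7.50 + $17.00 + $0.20 = $24.70, plus any GET request fees.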

Getting started with Amazon Web Services and S3

To start exploring S3, you will first need to sign up for an Amazon Web Services account. You will be assigned an Amazon Web Services account number and will get the security access keys, along with the X.509 security certificate, that will be required when you start using the various libraries and tools for communicating with S3.

All communication with any of the Amazon Web Services is through either the SOAP interface or the query/REST interface. The request messages that are sent through either of these interfaces must be digitally signed by the sending user to ensure that the messages have not been tampered with in transit, and that they are really originating from the sending user. This is the most basic part of using the Amazon Web Services APIs. Each request must be digitally signed and the signature attached to the request.

Each Amazon Web Services user account is associated with the following security credentials:

• An access key ID that identifies you as the person making requests through the query/REST interface.

• A secret access key that is used to calculate the digital signature when you make requests through the query interface.

• Public and private X.509 certificates for signing requests and authentication when using the SOAP interface.

You can manage your keys and certificate, regenerate them, view account activity and usage reports, and modify your profile information from the Web Services Account information page.


After you successfully sign up for the Amazon Web Services account, you need to enable the Amazon S3 service for your account using the following steps:

1. Log in to your Amazon Web Services account.

2. Navigate to the S3 home page.

3. Click on Sign Up For This Web Service on the right side of the page.

4. Provide the requested information and complete the sign-up process.

Examples in this article use the query/REST interface to communicate with S3, so you are going to need your access keys. You can access them from your Web Services Account information page by selecting View Access Key Identifiers. You are now set up to use Amazon Web Services, and have enabled the S3 service for your account.

Interacting with S3

To learn about interacting with S3, you can use existing libraries available from Amazon or from third parties and independent developers. This article does not delve into the details of communication with S3, such as how to sign requests, how to build up the XML documents used for encapsulating the data, or the parameters sent to and received from S3. We'll let the libraries handle all of that for us, and use the higher-level interface they provide. You can review the S3 developer guide for more details.

You'll use an open-source Java™ library named JetS3t to explore S3, and learn about its API by viewing small snippets of code. By the end of the article you'll collect and organize these snippets into something useful: a simple and handy S3 shell that you can use at any time to experiment and interact with S3.

JetS3t

JetS3t is an open source Java toolkit for interacting with S3. It is more than just a library. The distribution includes several very useful S3-related tools that can be used by typical S3 users as well as service providers who build applications on top of S3. JetS3t includes:

Cockpit
A GUI for managing the contents of an Amazon S3 account.

Synchronize
A command-line application for synchronizing directories on your computer with an Amazon S3 account.


Gatekeeper
A servlet that you can use to mediate access to Amazon S3 accounts.

CockpitLite
A lighter version of Cockpit that routes all its operations through a mediating gatekeeper service.

Uploader
A GUI that routes all its operations through a mediating gatekeeper service and can be used by service providers to provide access to their S3 accounts for customers.

Download the latest release of JetS3t.

You can, of course, use one of these GUI applications for interacting with S3, but that won't be very helpful if you need to develop applications that interface with S3. You can download the complete source code for this article as a zipped archive, including a ready-to-go NetBeans project that you can import into your workspace.

Connecting to S3

JetS3t provides an abstract class named org.jets3t.service.S3Service that must be extended by classes that implement a specific interface, such as REST or SOAP. JetS3t provides two implementations you can use for connecting and interacting with S3:

• org.jets3t.service.impl.rest.httpclient.RestS3Service communicates with S3 through the REST interface.

• org.jets3t.service.impl.soap.axis.SoapS3Service communicates with S3 through the SOAP interface using Apache Axis 1.4.

JetS3t uses a file named jets3t.properties to configure various parameters that are used while communicating with S3. The example in this article uses the default jets3t.properties that is shipped with the distribution. The JetS3t configuration guide has a detailed explanation of the parameters.

In this article you'll use the RestS3Service to connect to S3. A new RestS3Service object can be created by providing your Amazon Web Services access keys in the form of an AWSCredentials object. Keep in mind that the code snippets in this article are for demonstrating the API. To run each snippet, you have to ensure that all the required class imports are present. Refer to the source in the download package for the right imports. Or, even simpler, you can import the provided NetBeans project into your workspace for easy access to all of the source code.

Listing 1. Create a new RestS3Service


String awsAccessKey = "Your AWS access key";
String awsSecretKey = "Your AWS Secret key";

// use your AWS keys to create a credentials object
AWSCredentials awsCredentials = new AWSCredentials(awsAccessKey, awsSecretKey);

// create the service object with our AWS credentials
S3Service s3Service = new RestS3Service(awsCredentials);

Managing your buckets

The concept of a bucket is encapsulated by the org.jets3t.service.model.S3Bucket class, which extends the org.jets3t.service.model.BaseS3Object class. This class is the parent class for both buckets and objects in the JetS3t model. Each S3Bucket object provides a toString(), in addition to various accessor methods, that can be used to print the salient information for a bucket: the name and geographical location of the bucket, the date the bucket was created, the owner's name, and any metadata associated with the bucket.

Listing 2. List buckets

// list all buckets in the AWS account and print info for each bucket.
S3Bucket[] buckets = s3Service.listAllBuckets();
for (S3Bucket b : buckets) {
    System.out.println(b);
}

You can create a new bucket by providing a unique name for it. The namespace for buckets is shared by all the user accounts, so sometimes finding a unique name can be challenging. You can also specify where you want the bucket, and the objects that it will contain, to be physically located.

Listing 3. Create buckets

// create a US bucket and print its info
S3Bucket bucket = s3Service.createBucket(bucketName);
System.out.println("Created bucket - " + bucketName + " - " + bucket);

// create an EU bucket and print its info
bucket = s3Service.createBucket(bucketName, S3Bucket.LOCATION_EUROPE);
System.out.println("Created bucket - " + bucketName + " - " + bucket);

You have to delete all the objects contained in the bucket prior to deleting the bucket, or an exception will be raised. The RestS3Service class you have been using is fine for dealing with single objects. When you start dealing with multiple objects, it makes more sense to use a multithreaded approach to speed things up. JetS3t provides the org.jets3t.service.multithread.S3ServiceSimpleMulti class just for this purpose. You can wrap the existing s3Service object using this class and take full advantage of those multiprocessors. It comes in handy when you need to clear a bucket by deleting all the objects it contains.

Listing 4. Delete a bucket

// get the bucket
S3Bucket bucket = getBucketFromName(s3Service, "my bucket");

// delete a bucket - it must be empty first
s3Service.deleteBucket(bucket);

// create a multithreaded version of the RestService
S3ServiceSimpleMulti s3ServiceMulti = new S3ServiceSimpleMulti(s3Service);

// get all the objects from the bucket
S3Object[] objects = s3Service.listObjects(bucket);

// clear the bucket by deleting all its objects
s3ServiceMulti.deleteObjects(bucket, objects);

Each bucket is associated with an ACL that determines the permissions, or grants, for the bucket and the level of access provided to other users. You can retrieve the ACL and print the grants that are provided by it.

Listing 5. Retrieve ACL for bucket

// get the bucket
S3Bucket bucket = getBucketFromName(s3Service, "my bucket");

// get the ACL and print it
AccessControlList acl = s3Service.getBucketAcl(bucket);
System.out.println(acl);

The default permissions on newly created buckets and objects make them private to the owner. You can modify this by changing the ACL for a bucket and granting a group of users permission to read, write, or have full control over the bucket.

Listing 6. Make a bucket and its content public

// get the bucket
S3Bucket bucket = getBucketFromName(s3Service, "my bucket");

// get the ACL
AccessControlList acl = s3Service.getBucketAcl(bucket);

// give everyone read access
acl.grantPermission(GroupGrantee.ALL_USERS, Permission.PERMISSION_READ);

// save changes back to S3
bucket.setAcl(acl);
s3Service.putBucketAcl(bucket);

You can easily enable logging for a bucket and retrieve the current logging status. After logging is enabled, detailed access logs for each file in that bucket are stored in S3. Your S3 account will be charged for the storage space that is consumed by the logs.

Listing 7. Logging for S3 buckets

// get the bucket
S3Bucket bucket = getBucketFromName(s3Service, "my bucket");

// is logging enabled?
S3BucketLoggingStatus loggingStatus = s3Service.getBucketLoggingStatus(bucketName);
System.out.println(loggingStatus);

// enable logging
S3BucketLoggingStatus newLoggingStatus = new S3BucketLoggingStatus();

// set a prefix for your log files
newLoggingStatus.setLogfilePrefix(logFilePrefix);

// set the target bucket name
newLoggingStatus.setTargetBucketName(bucketName);

// give the log_delivery group permissions to read and write from the bucket
AccessControlList acl = s3Service.getBucketAcl(bucket);
acl.grantPermission(GroupGrantee.LOG_DELIVERY, Permission.PERMISSION_WRITE);
acl.grantPermission(GroupGrantee.LOG_DELIVERY, Permission.PERMISSION_READ_ACP);
bucket.setAcl(acl);

// save the changed ACL for the bucket to S3
s3Service.putBucketAcl(bucket);

// save the changes to the bucket logging
s3Service.setBucketLoggingStatus(bucketName, newLoggingStatus, true);
System.out.println("The bucket logging status is now enabled.");

Managing your objects

Each object contained in a bucket is represented by the org.jets3t.service.model.S3Object class. Each S3Object instance provides a toString() that can be used to print the important details for an object:

• Name of the key

• Name of the containing bucket

• Date the object was last modified

• Any metadata associated with the object

It also provides methods for accessing the various properties of an object along with its metadata.

Listing 8. List objects

// list objects in a bucket.
S3Object[] objects = s3Service.listObjects(bucket);

// print out the object details
if (objects.length == 0) {
    System.out.println("No objects found");
} else {
    for (S3Object o : objects) {
        System.out.println(o);
    }
}

You can filter the list of objects that are retrieved by providing a prefix to match.

Listing 9. Filter the list of objects

// list objects matching a prefix.
S3Object[] filteredObjects = s3Service.listObjects(bucket, "myprefix", null);

// print out the object details
if (filteredObjects.length == 0) {
    System.out.println("No objects found");
} else {
    for (S3Object o : filteredObjects) {
        System.out.println(o);
    }
}

Each object can have associated metadata, such as the content type, date modified, and so on. You can also associate your own application-specific custom metadata with an object.

Listing 10. Retrieve object metadata

// get the bucket
S3Bucket bucket = getBucketFromName(s3Service, bucketName);

// get objects matching a prefix
S3Object[] filteredObjects = s3Service.listObjects(bucket, "myprefix", null);

if (filteredObjects.length == 0) {
    System.out.println("No matching objects found");
} else {
    // get the metadata for multiple objects.
    S3Object[] objectsWithHeadDetails = s3ServiceMulti.getObjectsHeads(bucket, filteredObjects);

    // print out the metadata
    for (S3Object o : objectsWithHeadDetails) {
        System.out.println(o);
    }
}
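Listing 10 reads metadata back from S3. Your own application-specific metadata is attached when you create the object, before it is uploaded. The following is a minimal sketch along the lines of the JetS3t code samples, using its addMetadata() and setContentType() setters; the key and metadata names here are hypothetical.

// create an object and attach custom metadata before uploading it (Listing 12 shows the upload)
S3Object reportObject = new S3Object("monthly-report.pdf");
reportObject.setContentType("application/pdf");
reportObject.addMetadata("department", "engineering");
reportObject.addMetadata("reviewed-by", "prabhakar");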

Each newly created object is private by default. You can use JetS3t to generate a signed URL that anyone can use for downloading the object data. This URL can be created to be valid only for a certain duration, at the end of which it automatically expires. The object is still private, but you can give the URL to anyone to let them download it for a brief time.

Listing 11. Generate a signed URL for object downloads


// get the bucket
S3Bucket bucket = getBucketFromName(s3Service, bucketName);

// how long should this URL be valid?
int duration = Integer.parseInt(tokens.nextToken());
Calendar cal = Calendar.getInstance();
cal.add(Calendar.MINUTE, duration);
Date expiryDate = cal.getTime();

// create the signed url
String url = S3Service.createSignedGetUrl(bucketName, objectKey, awsCredentials, expiryDate);
System.out.println("You can use this public URL to access this file for the next "
    + duration + " min - " + url);

S3 allows a maximum of 5GB per object in a bucket. If you have objects that are larger than this, you'll need to split them up into multiple files, each no larger than 5GB, and then upload all of the parts to S3. A sketch of this approach follows Listing 12.

Listing 12. Upload to S3

// get the bucket
S3Bucket bucket = getBucketFromName(s3Service, bucketName);

// create an object with the file data
File fileData = new File("/my_file_to_upload");
S3Object fileObject = new S3Object(bucket, fileData);

// put the data on S3
s3Service.putObject(bucket, fileObject);
System.out.println("Successfully uploaded object - " + fileObject);
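For files beyond the 5GB object limit, one straightforward approach is to write the data out as a series of part files and upload each part as its own object. The following sketch is our own illustration, not part of the article's download: the .partN key suffix is an arbitrary convention, and the s3Service, bucket, and JetS3t imports are the same ones used in the earlier listings (plus java.io).

// split a large local file into parts that each stay safely under the 5GB object
// limit, and upload every part as a separate S3 object
public static void uploadInParts(S3Service s3Service, S3Bucket bucket, File bigFile)
        throws Exception {
    final long partSize = 4L * 1024 * 1024 * 1024;  // 4GB per part, well under the limit
    byte[] buffer = new byte[4 * 1024 * 1024];      // 4MB copy buffer

    InputStream in = new BufferedInputStream(new FileInputStream(bigFile));
    int partNumber = 0;
    int read = in.read(buffer);
    while (read != -1) {
        // copy up to partSize bytes into a temporary part file
        File partFile = File.createTempFile(bigFile.getName(), ".part" + partNumber);
        OutputStream out = new BufferedOutputStream(new FileOutputStream(partFile));
        long written = 0;
        while (read != -1 && written < partSize) {
            out.write(buffer, 0, read);
            written += read;
            read = in.read(buffer);
        }
        out.close();

        // upload this part; the key records which part of the original file it holds
        S3Object partObject = new S3Object(bucket, partFile);
        partObject.setKey(bigFile.getName() + ".part" + partNumber);
        s3Service.putObject(bucket, partObject);
        partFile.delete();
        partNumber++;
    }
    in.close();
}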

JetS3t provides a DownloadPackage class that makes it simple to associate the data from an S3 object with a local file and automatically save the data to it. You can use this feature to easily download objects from S3.

Listing 13. Download from S3

// get the bucket
S3Bucket bucket = getBucketFromName(s3Service, bucketName);

// get the object
S3Object fileObject = s3Service.getObject(bucket, fileName);

// associate a file with the object data
DownloadPackage[] downloadPackages = new DownloadPackage[1];
downloadPackages[0] = new DownloadPackage(fileObject, new File(fileObject.getKey()));

// download objects to the associated files
s3ServiceMulti.downloadObjects(bucket, downloadPackages);
System.out.println("Successfully retrieved object to current directory");

This section covered some of the basic functions provided by the JetS3t toolkit, and how to use them to interact with S3. See Resources for more about the S3 service and an in-depth discussion of the JetS3t toolkit.


S3 Shell

The interaction thus far with S3, through small code snippets, can be put into a more useful and longer-lasting form by creating a simple S3 Shell program that you can run from the command line. You'll create a simple Java program that accepts the Amazon Web Services access key and secret key as parameters and presents a console prompt. You can then type a letter or a few letters, such as b for listing buckets or om for listing objects that match a certain prefix. Use this program for experimentation.

The shell program contains a main() that is filled out with an implementation using the snippets of code shown in this article. In the interest of space, the code listing for the S3 Shell is not included here. The complete S3 Shell source code, along with its dependencies, is in the download. You can run the shell by simply executing the devworks-s3.jar file.
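To give a flavor of what that main() does, here is a minimal sketch of a read-dispatch loop; it is our own simplification rather than the actual downloadable source, and it assumes the s3Service object and the getBucketFromName() helper from the earlier listings (plus java.io and java.util imports).

// read commands from the console and dispatch them to the JetS3t calls shown earlier
BufferedReader console = new BufferedReader(new InputStreamReader(System.in));
while (true) {
    System.out.print("s3> ");
    String line = console.readLine();
    if (line == null || line.trim().equals("q")) {
        break;                                    // quit the shell
    }
    StringTokenizer tokens = new StringTokenizer(line.trim());
    String command = tokens.hasMoreTokens() ? tokens.nextToken() : "";
    if (command.equals("b")) {
        // list all buckets (see Listing 2)
        for (S3Bucket b : s3Service.listAllBuckets()) {
            System.out.println(b);
        }
    } else if (command.equals("om")) {
        // list objects in a bucket that match a prefix (see Listing 9)
        S3Bucket bucket = getBucketFromName(s3Service, tokens.nextToken());
        for (S3Object o : s3Service.listObjects(bucket, tokens.nextToken(), null)) {
            System.out.println(o);
        }
    } else {
        System.out.println("Unknown command; type h for help");
    }
}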

Listing 14. Running the S3 Shell

java -jar devworks-s3.jar my_aws_access_key my_aws_secret_key

You can type h at any time in the S3 Shell to get a list of supported commands.

Figure 2. Help in the S3 Shell

Some of the more useful methods have been added to the S3 Shell. You can extend it to add any other functions you want, to make the shell even more useful for your specific case.

Summary

In this article you learned some of the basic concepts behind Amazon's S3 service. The JetS3t toolkit is an open source library you can use to interact with S3. You also learned how to create a simple S3 Shell using sample snippets of code, so you can continue to experiment easily and simply with S3 from the command line.

Stay tuned for the next article in this series, which will explain how to use Amazon Elastic Compute Cloud (EC2) to run virtual servers in the cloud.


Downloads

Description                     Name              Size     Download method
Sample code for this article    devworks-s3.zip   2.93MB   HTTP

Information about download methods


Resources

Learn

• Learn about specific Amazon Web Services:

• Amazon Simple Storage Service (S3)

• Amazon Elastic Compute Cloud (EC2)

• Amazon Simple Queue Service (SQS)

• Amazon SimpleDB (SDB)

• The Service Health Dashboard is updated by the Amazon team regarding any issues with the services.

• Sign up for an Amazon Web Services account.

• The Amazon Web Services Developer Connection is the gateway to all the developer resources.

• Read the blog to find out the latest happenings in the world of Amazon Web Services.

• From the Web Services Account information page you can manage your keys and certificate, regenerate them, view account activity and usage reports, and modify your profile information.

• S3 Technical Resources has Amazon Web Services technical documentation, user guides, and other articles of interest.

• Amazon S3 has the latest pricing information. Use the AWS Simple Monthly Calculator tool for calculating your monthly usage costs for S3 and the other Amazon Web Services.

• Review the S3 Developer Guide for more details.

• Amazon Service Level Agreement (SLA) for S3.

• The S3stats resource page has several links on processing and viewing S3 log records. Logs are in the S3 server access log format, but can be easily converted into Apache combined log format, then parsed by any of the open source or commercial log analysis tools such as Webalizer.

• Learn about JetS3t, an open source Java toolkit for Amazon S3, developed by James Murty. See the toolkit documentation, and get detailed explanations of parameters in the configuration guide.

• In the Architecture area on developerWorks, get the resources you need to advance your skills in the architecture arena.


• Browse the technology bookstore for books on these and other technical topics.

Get products and technologies

• Download JetS3t and other tools.

• Download IBM product evaluation versions and get your hands on application development tools and middleware products from IBM® DB2®, Lotus®, Rational®, Tivoli®, and WebSphere®.

Discuss

• Check out developerWorks blogs and get involved in the developerWorks community.

About the author

Prabhakar Chaganti
Prabhakar Chaganti is the CTO of Ylastic, a start-up that is building a single unified interface to architect, manage, and monitor a user's entire AWS cloud computing environment: EC2, S3, SQS, and SimpleDB. He is the author of two recent books, Xen Virtualization and GWT Java AJAX Programming. He is also the winner of the community choice award for the most innovative virtual appliance in the VMware Global Virtual Appliance Challenge.

Trademarks

IBM, the IBM logo, ibm.com, DB2, developerWorks, Lotus, Rational, Tivoli, and WebSphere are trademarks or registered trademarks of International Business Machines Corporation in the United States, other countries, or both. These and other IBM trademarked terms are marked on their first occurrence in this information with the appropriate symbol (® or ™), indicating US registered or common law trademarks owned by IBM at the time this information was published. Such trademarks may also be registered or common law trademarks in other countries. A current list of IBM trademarks is available on the Web at http://www.ibm.com/legal/copytrade.shtml. Java and all Java-based trademarks and logos are trademarks of Sun Microsystems, Inc. in the United States, other countries, or both.
