Windows Azure Storage Tables, What Are They Good For?

Stephen Moir AZR412

What is it good for?

What is it not good at?

How I design tables

Backups and emulators

ATS

What is it good for?

Built to handle web scale data

http://www.flickr.com/photos/matthewfield

Built to handle web scale transactions

Account: 20,000 / second

Partition A: 2,000 / second
Partition B: 2,000 / second
Partition C: 2,000 / second
Partition D: 2,000 / second
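The two figures on this slide combine into a simple ceiling, sketched below. It assumes load is spread evenly over the active partitions and that the slide's numbers are the only limits; the function name is made up for illustration.

```python
ACCOUNT_LIMIT_PER_SECOND = 20_000    # account-wide figure from the slide
PARTITION_LIMIT_PER_SECOND = 2_000   # per-partition figure from the slide


def throughput_ceiling(active_partitions: int) -> int:
    """Upper bound on operations per second when load is spread evenly
    over `active_partitions` partitions (an idealised assumption)."""
    if active_partitions < 0:
        raise ValueError("active_partitions must be non-negative")
    return min(ACCOUNT_LIMIT_PER_SECOND,
               active_partitions * PARTITION_LIMIT_PER_SECOND)
```

Under these assumptions a single hot partition caps out at 2,000 / second, and it takes ten evenly loaded partitions to reach the account figure.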

Small operational overhead

Absolutely cheap, relatively cheap

Azure Table Storage: 9.5c/GB/month

Windows Azure SQL Database: $9.99/GB

1 GB = 37 transactions/s

Can be used on most development stacks

SDKs are open source

github.com/WindowsAzure

No fixed schema

PK: 39D016D5, RK: “User”, Name: “Fred Smith”, Email: “kdj;lsaf89322daf==”, Location: “Sky City”
PK: 39D016D5, RK: “Order-1”, OrderDate: “2013-08-22 17:01”, Value: 134.45, Status: “Complete”
PK: 39D016D5, RK: “Order-1-1”, Product: “Widget”, Quantity: 6, Each: 3.05, Extended: 18.15
PK: 39D016D5, RK: “Order-1-2”, Product: “Top Quality Foo”, Quantity: 1, Each: 50.00, Extended: 50.00
PK: 39D016D5, RK: “Order-1-3”, Product: “Iron Bar”, Quantity: 3, Each: 22.10, Extended: 66.30
PK: 6A3B9635, RK: “User”, Name: “Jane Brown”, Email: “poiuj7686HDESde==”, Location: “Sky City”
PK: 6A3B9635, RK: “Order-2”, OrderDate: “2013-08-27 09:11”, Value: 244.20, Status: “Processing”, Tax: 2.00
PK: 6A3B9635, RK: “Order-2-1”, Product: “Regular Foo”, Quantity: 3, Each: 30.00, Extended: 90.00
PK: 6A3B9635, RK: “Order-2-2”, Product: “Steel Bar”, Quantity: 4, Each: 37.30, Extended: 149.20
PK: 6A3B9635, RK: “Order-2-3”, Product: “Delivery”, Quantity: 1, Each: 5.00, Extended: 5.00, Tax: 2.00

New stuff is cool

NEW!

What is it not good at?

ATS is not a relational database

RDBMS

Only one index per table

PartitionKey = X AND RowKey = A

PartitionKey = X AND RowKey >= A AND RowKey < C

PartitionKey = X

PartitionKey >= X AND PartitionKey < Z

PartitionKey >= X AND PartitionKey < Z AND RowKey >= A AND RowKey < C
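The query shapes above are all the single index can serve quickly. A toy in-memory model may make that concrete: entities are kept sorted by (PartitionKey, RowKey), so point and range lookups become binary searches, while a filter on any other property has to visit every entity. The Table class and its method names are illustrative assumptions, not the storage SDK.

```python
from bisect import bisect_left


class Table:
    """Toy model of a table: entities sorted by (PartitionKey, RowKey)."""

    def __init__(self, entities):
        self.rows = sorted(entities, key=lambda e: (e["PK"], e["RK"]))
        self.keys = [(e["PK"], e["RK"]) for e in self.rows]

    def point(self, pk, rk):
        """PartitionKey = pk AND RowKey = rk."""
        i = bisect_left(self.keys, (pk, rk))
        if i < len(self.keys) and self.keys[i] == (pk, rk):
            return self.rows[i]
        return None

    def row_range(self, pk, rk_from, rk_to):
        """PartitionKey = pk AND RowKey >= rk_from AND RowKey < rk_to."""
        lo = bisect_left(self.keys, (pk, rk_from))
        hi = bisect_left(self.keys, (pk, rk_to))
        return self.rows[lo:hi]

    def partition_range(self, pk_from, pk_to):
        """PartitionKey >= pk_from AND PartitionKey < pk_to."""
        lo = bisect_left(self.keys, (pk_from, ""))
        hi = bisect_left(self.keys, (pk_to, ""))
        return self.rows[lo:hi]

    def scan(self, predicate):
        """A filter on a non-key property must look at every entity."""
        return [e for e in self.rows if predicate(e)]
```

The first three methods touch only the slice they return; scan is the ad hoc query case and grows with the size of the table.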

Slow ad hoc queries

FROM MyTable
WHERE WasntImportantYesterday = “is important today”

Fetching non-sequential keys is slow

FROM MyTable
WHERE (PartitionKey = ‘PK1’ AND RowKey = ‘RK1’)
   OR (PartitionKey = ‘PK2’ AND RowKey = ‘RK2’)
   OR (PartitionKey = ‘PK3’ AND RowKey = ‘RK3’)
   OR (PartitionKey = ‘PK4’ AND RowKey = ‘RK4’)

Query latency is relatively high

Limited transaction support

Must be in the same table
Must all have the same Partition Key
No more than 100 entities
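The three rules can be checked on the client before a batch is submitted. The sketch below assumes a batch is represented as a list of (table name, entity) pairs, with each entity a dict carrying a "PartitionKey"; that representation and the function name are illustrative, not part of any SDK.

```python
MAX_BATCH_SIZE = 100  # "No more than 100 entities"


def validate_batch(operations):
    """Return a list of rule violations; an empty list means the batch
    satisfies the three rules listed on the slide.

    `operations` is a list of (table_name, entity) pairs (assumed shape).
    """
    problems = []
    if len(operations) > MAX_BATCH_SIZE:
        problems.append(f"{len(operations)} entities, limit is {MAX_BATCH_SIZE}")
    tables = {table for table, _ in operations}
    if len(tables) > 1:
        problems.append(f"more than one table: {sorted(tables)}")
    partitions = {entity["PartitionKey"] for _, entity in operations}
    if len(partitions) > 1:
        problems.append(f"more than one Partition Key: {sorted(partitions)}")
    return problems
```

This is one reason the earlier order example stores an order and its lines under the same Partition Key: they can then be saved together.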

Limited data type support

String
Integer (32/64 bit)
Boolean
Date/Time
64 bit Float
GUID
Byte Array

JSON/XML serialised objects

Binary serialised objects
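A value that does not fit the supported types can be stored as a serialised string (or byte array) property. A minimal sketch of the JSON variant, assuming a nested address object and a made-up property name "AddressJson":

```python
import json


def to_entity(partition_key, row_key, address):
    """Flatten a nested object into a single String property."""
    return {
        "PartitionKey": partition_key,
        "RowKey": row_key,
        # The nested dict becomes one JSON string property.
        "AddressJson": json.dumps(address, sort_keys=True),
    }


def address_from_entity(entity):
    """Rebuild the nested object after reading the entity back."""
    return json.loads(entity["AddressJson"])
```

The trade-off is that the service sees only an opaque string, so nothing inside the serialised value can be used in a query filter.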

Aggregate Queries

SELECT SUM(Price) FROM MyTable

foreach (var item in myQuery)
{
    price += item.Price;
}

How I design tables

StevesMovieSite.com©®™*

Get the public to tell me what movies they like

Add some social network sugar coating

Allow businesses to email people who are close to them based on their favorite movie genre

*This idea is all yours for 5% of revenue

Tell me the movies you like and receive exclusive offers from theatres near you!

Site Design

Create an account

Login

Edit account details

Add a movie I like and give it a rating

See a list of all of my movies

Find friends

Comment on friend’s movie choices

Start with all data you expect to use

User
  Email address
  Password
  Name
  Where they live
  What films they like
  Comments on the films

Film
  Title
  Year of release
  Genre(s)

Friend List

Design your queries first

Start simple, how will I get a single entity?

User
  By email address

For any of these methods do I only need some of this information?

Create account: Email Address

User Edit: Everything

User Login: Email Address, Password

All other pages: User Name

Do I need a list of these entities and in what order?

User
  People who like a genre within X km of a point, no special order
  Search by name, alphabetical order

Do I need a list of these entities and in what order?

Films
  Case insensitive search by title begins with phrase, alphabetical order then by most recent release
  Films a user likes

Are there any queries that I don’t need but would be cool or useful?

Films
  All comments against a film

Is there any data that has to be saved as a batch?

What do you do with the answers to these questions?

Any combination of data that is unique is a good candidate for the Partition Key and Row Key

What do you do with the answers to these questions?

If you have candidates for both the Partition Key and the Row Key and you need to list by one of them, that one is your Partition Key

What do you do with the answers to these questions?

If you need to retrieve the same entity in more than one way, create a “Secondary Index” table

User entity

Partition Key: User ID
Row Key: Empty
Name
Location
Encrypted Email Address
Hashed Password

Example query

public User GetUser(Guid userId)
{
    result =
        FROM User
        WHERE PartitionKey = userId
          AND RowKey = string.Empty

    return result
}

GUIDs make good unique IDs

User-by-email-address entity

Partition Key: Hashed Email Address
Row Key: Hashed Password
User ID

Example query

public bool IsEmailInUse(string emailAddress)
{
    hashedEmail = emailAddress.Hash()

    results =
        FROM EmailAddressUser
        WHERE PartitionKey = hashedEmail

    return results.Any()
}

Example query

public User LogUserId(string emailAddress, string password)
{
    hashedEmail = emailAddress.Hash()
    hashedPassword = password.Hash()

    results =
        FROM EmailAddressUser
        WHERE PartitionKey = hashedEmail
          AND RowKey = hashedPassword

    result = results.FirstOrDefault()

    if result == null return null

    return User.GetUser(result.UserId)
}
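The two tables above work together as a primary table plus a “Secondary Index” table. The sketch below models both as in-memory dicts keyed by (PartitionKey, RowKey) to show the lookup path. UserStore, hash_value and the use of unsalted SHA-256 are assumptions made for illustration; the slides only say "Hashed", and real password storage calls for a salted, deliberately slow hash.

```python
import hashlib
import uuid


def hash_value(text: str) -> str:
    # Placeholder hash for the sketch only (assumption, see lead-in).
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


class UserStore:
    def __init__(self):
        self.users = {}        # (user id, "") -> user properties
        self.email_index = {}  # (hashed email, hashed password) -> user id

    def is_email_in_use(self, email):
        hashed = hash_value(email)
        return any(pk == hashed for pk, _ in self.email_index)

    def create_user(self, email, password, name):
        if self.is_email_in_use(email):
            raise ValueError("email address already registered")
        user_id = str(uuid.uuid4())
        # Both writes are needed: the entity itself and its index entry.
        self.users[(user_id, "")] = {"UserId": user_id, "Name": name}
        self.email_index[(hash_value(email), hash_value(password))] = user_id
        return user_id

    def log_user_in(self, email, password):
        key = (hash_value(email), hash_value(password))
        user_id = self.email_index.get(key)
        if user_id is None:
            return None
        return self.users[(user_id, "")]
```

Because the index entry lives in a separate table, the two writes in create_user are not covered by a single transaction, so the application has to cope with one succeeding and the other failing.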

Film entity

Partition Key: Normalised Film Title
Row Key: 10000 - Year of release
Title
Genre1
Genre2
Genre3
Genre4
Genre5
Genre6

Example query

FindFilmStartsWith(“?The Italian Job”)

public List<Film> FindFilmStartsWith(string name)
{
    normalisedName = name.Upper().Clean()   -- “THE ITALIAN JOB”
    nextName = normalisedName.Next()        -- “THE ITALIAN JOC”

    results =
        FROM Film
        WHERE PartitionKey >= normalisedName
          AND PartitionKey < nextName

    return results
}
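The "starts with" trick and the "10000 - Year of release" Row Key can be tried out on a sorted list. In this sketch normalise stands in for Upper().Clean() and next_key for Next(); exactly which characters Clean() strips is an assumption (here everything except letters, digits and spaces).

```python
import re
from bisect import bisect_left


def normalise(title):
    """Upper-case and drop everything except A-Z, 0-9 and spaces (assumed rule)."""
    return re.sub(r"[^A-Z0-9 ]", "", title.upper()).strip()


def next_key(key):
    """Exclusive upper bound for all strings that start with `key`:
    the same string with its last character bumped by one."""
    if not key:
        raise ValueError("key must not be empty")
    return key[:-1] + chr(ord(key[-1]) + 1)


def film_keys(title, year):
    """(PartitionKey, RowKey); 10000 - year makes newer films sort first."""
    return (normalise(title), str(10000 - year))


def find_film_starts_with(films, name):
    """`films` is a list of (partition_key, row_key, title) tuples, sorted."""
    prefix = normalise(name)
    lo = bisect_left(films, (prefix,))
    hi = bisect_left(films, (next_key(prefix),))
    return films[lo:hi]
```

For example, normalise("?The Italian Job") gives "THE ITALIAN JOB" and next_key of that gives "THE ITALIAN JOC", matching the comments on the slide. Within one title, a 2003 release gets Row Key "7997" and a 1969 release "8031", so the more recent film is listed first.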

If you’re using a natural key, be careful

Characters to avoid in keys:

/ \ # ?

%

U+0000 to U+001F

U+007F to U+009F
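A natural key can be checked against the characters listed on this slide before it is used. A small sketch, with is_safe_key and clean_key as made-up helper names:

```python
FORBIDDEN_CHARS = set("/\\#?%")  # the punctuation listed on the slide


def _is_bad(ch):
    code = ord(ch)
    # Control characters: U+0000 to U+001F and U+007F to U+009F.
    return ch in FORBIDDEN_CHARS or code <= 0x1F or 0x7F <= code <= 0x9F


def is_safe_key(key):
    """True if the key contains none of the characters listed above."""
    return not any(_is_bad(ch) for ch in key)


def clean_key(key, replacement=""):
    """Replace every problem character (by default, drop it)."""
    return "".join(replacement if _is_bad(ch) else ch for ch in key)
```

Dropping or replacing characters can make two different natural keys collide, so a cleaned key is only safe when uniqueness is guaranteed some other way, for example by the Row Key.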

User film review entity

Partition Key: User ID
Row Key: Normalised Film Title + “-” + Year of release
Date of review
Film Title
Rating
Number of comments

Example query

public List<UserFilm> FilmsForUser(Guid userId)
{
    results =
        FROM UserFilm
        WHERE PartitionKey = userId

    return results
}

You can’t make a partition too small, but you can make it too big

Design for success

User film review entity take 2

Partition Key: User ID + “-” + Normalised Film Title
Row Key: Year of release
Date of review
Film Title
Rating

Example query

Films = FilmsForUser(“48928593-…6325B”)

public List<UserFilm> FilmsForUser(Guid userId)
{
    nextUser = userId.Next()   -- 48928593-…6325C

    results =
        FROM UserFilm
        WHERE PartitionKey >= userId
          AND PartitionKey < nextUser

    return results
}

User-by-name entity

Partition Key: Normalised First Name + “-” + User ID
Row Key: Normalised Last Name
First Name
Last Name

Example query

Users = UserNameStartsWith(“Fre”, “Smi”)

public List<UserName> UserNameStartsWith(string firstName, string lastName)
{
    firstName = firstName.Upper().Clean()   -- FRE
    nextFirstName = firstName.Next()        -- FRF

    lastName = lastName.Upper().Clean()     -- SMI
    nextLastName = lastName.Next()          -- SMJ

    results =
        FROM UserName
        WHERE PartitionKey >= firstName
          AND PartitionKey < nextFirstName
          AND RowKey >= lastName
          AND RowKey < nextLastName

    return results
}

My friends

Partition Key: User ID + “-” + Friend’s Normalised First Name
Row Key: Friend’s Normalised Last Name + “-” + User ID
Friend’s First Name
Friend’s Last Name

Example query

Friends = FriendsForUser(“48928593-…6325B”)

public List<Friend> FriendsForUser(Guid userId)
{
    nextUser = userId.Next()   -- 48928593-…6325C

    results =
        FROM Friend
        WHERE PartitionKey >= userId
          AND PartitionKey < nextUser

    return results
}

Comments on my films

Partition Key: Normalised Film Title + “-” + Year of release + “-” + User ID + “-” + YYYYMMDDHHmm
Row Key: SSfff + “-” + Commenting User ID
Comment
User Name
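The comment keys split one timestamp across the two keys: everything down to the minute goes into the Partition Key, and the seconds plus milliseconds lead the Row Key. A sketch of building both, where the function and parameter names are illustrative:

```python
from datetime import datetime


def comment_keys(normalised_title, year, owner_id, commenter_id, when):
    """Build (PartitionKey, RowKey) for one comment.

    PartitionKey: title-year-owner-YYYYMMDDHHmm
    RowKey:       SSfff-commenter (seconds, then three-digit milliseconds)
    """
    partition_key = f"{normalised_title}-{year}-{owner_id}-{when:%Y%m%d%H%M}"
    millis = when.microsecond // 1000
    row_key = f"{when:%S}{millis:03d}-{commenter_id}"
    return partition_key, row_key
```

With this layout each minute of comments on one user's film is its own small partition, and the fixed-width time fields keep comments in chronological order under plain string comparison.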

Users who like a genre in an area

PartitionKey: Genre Code + “-” + (Latitude + 90) * 100000
RowKey: (Longitude + 180) * 100000 + “-” + User ID
Encrypted Email Address

Users who like a genre in an area

public List<User> GetUsersByLocation(string location, int distance, string genreCode)
{
    var boundingBox = GetBoundingBox(location, distance);

    results =
        FROM UserGenre
        WHERE PartitionKey >= genreCode + "-" + Math.Round((boundingBox.South + 90) * 100000, 0)
          && PartitionKey < genreCode + "-" + Math.Round((boundingBox.North + 90) * 100000, 0)
          && RowKey >= Math.Round((boundingBox.West + 180) * 100000, 0)
          && RowKey <= Math.Round((boundingBox.East + 180) * 100000, 0)

    return results
}
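Keys are compared as strings, so the scaled latitude and longitude only sort numerically if every value has the same number of digits. The sketch below zero-pads both to eight digits; the padding, the "~" suffix on the upper Row Key bound and all function names are assumptions added for this example, and the bounding box is passed in directly rather than computed from a location.

```python
SCALE = 100_000  # five decimal places of a degree, as on the slide


def lat_part(latitude):
    # (lat + 90) * 100000 is at most 18,000,000, so 8 digits always fit.
    return f"{round((latitude + 90) * SCALE):08d}"


def lon_part(longitude):
    # (lon + 180) * 100000 is at most 36,000,000, so 8 digits always fit.
    return f"{round((longitude + 180) * SCALE):08d}"


def user_genre_keys(genre_code, latitude, longitude, user_id):
    """(PartitionKey, RowKey) for one user who likes one genre."""
    return (f"{genre_code}-{lat_part(latitude)}",
            f"{lon_part(longitude)}-{user_id}")


def bounding_box_bounds(genre_code, south, north, west, east):
    """Inclusive key bounds for a genre and bounding box."""
    return (f"{genre_code}-{lat_part(south)}",
            f"{genre_code}-{lat_part(north)}",
            lon_part(west),
            # "~" sorts after "-", so Row Keys on the east edge still match.
            lon_part(east) + "~")


def in_box(partition_key, row_key, bounds):
    pk_from, pk_to, rk_from, rk_to = bounds
    return pk_from <= partition_key <= pk_to and rk_from <= row_key <= rk_to
```

Without the padding, a scaled value of 9 would sort after 10 as a string and range filters near the poles or the antimeridian would silently miss entities.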

Bing Maps

Don’t do now what you can put off until later

Make sure you can update other indexes

Partitioning is not just about Partition Keys

Account

Table

Partition

Backups and emulators

Replicas aren’t backups

North

South

Thanks NASA for the picture of the asteroid. You rock!

The storage emulator is just an emulator

Related content

Azure Storage Team Blog

Find Me Later Near The Vista Coffee Stand

Cloud Cover Episode 43: Scalable Counters


© 2013 Microsoft Corporation. All rights reserved. Microsoft, Windows and other product names are or may be registered trademarks and/or trademarks in the U.S. and/or other countries. MICROSOFT MAKES NO WARRANTIES, EXPRESS, IMPLIED OR STATUTORY, AS TO THE INFORMATION IN THIS PRESENTATION.