
TRANSCRIPT

Page 1: Tales from the Field

Adam Comerford
Senior Solutions Engineer, MongoDB

@comerford #MongoDBLondon

Tales from the Field

Page 2: Tales from the Field

Or:

● Cautionary Tales
● Don’t solve the wrong problems
● Bad schemas hurt ops too
● etc.

Page 3: Tales from the Field

● Are (mostly) true, and (mostly) actually happened

● Names have been changed to protect the (mostly) innocent

● No animals were harmed during the making of this presentation
  ○ Perhaps a few DBAs and engineers had light emotional scarring
● Some of the people that inspired the stories may well be here today at MongoDB London

The Stories

Page 4: Tales from the Field

Story #1: Bill the Bulk Updater

● Bill built a system that tracked status information for entities in his business domain

● State changes for this system happened in batches:
  o Sometimes 10% of entities get updated
  o Sometimes 100% get updated

● Essentially, lots of random updates
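
As an illustration (not taken from the talk), one of Bill’s batch status changes might look something like the multi-update below; the collection and field names are assumptions:

// Hypothetical sketch of a batch update: a single multi-update can touch
// anywhere from 10% to 100% of the entities, producing lots of random
// writes across the data files.
db.entities.update(
  { batchId: 42, status: "pending" },                        // illustrative filter
  { $set: { status: "processed", updatedAt: new Date() } },
  { multi: true }                                            // update every matching entity
);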

Page 5: Tales from the Field

Bill’s Initial Architecture

Application / mongos → mongod

Page 6: Tales from the Field

What about production?

● Bill’s system was a success!

● The product grew, and the number of entities increased by a factor of 5

● Not a problem - add more shards!

Page 7: Tales from the Field

Bill’s Eventual Architecture

Application / mongos → mongod shards (…16 more shards…)

Page 8: Tales from the Field

Linear Scaling

● Bill’s cluster scaled linearly, as intended

● But, Bill’s TCO scaled linearly too

● More growth was forecast

Page 9: Tales from the Field

Large Cluster, Large Expense

● Entity growth predicted at 10x

● Rough calculations called for ~200 shards

● Linear scaling of cost

Page 10: Tales from the Field

What problem did Bill overlook?

● Horizontal Scaling = Linear Scaling

● Not necessarily the most efficient option

Page 11: Tales from the Field

The “Golden Hammer” Tendency

Page 12: Tales from the Field

What did we recommend?

● Scale the random I/O vertically, not horizontally

● Sometimes a combination of vertical & horizontal scaling is the best approach

Page 13: Tales from the Field

Bill’s Final Architecture

Application / mongos → mongod (on SSD)

Page 14: Tales from the Field

Story #2: Gary the Game Developer

● Gary was launching a AAA game title

● MongoDB would provide the backend for the player’s online experience

● Launched worldwide, same day, midnight launches

Page 15: Tales from the Field

Complex Cloud Deployment

● Deploying in the cloud, but on very beefy instances

● 32 vCPU, 244GiB RAM, 8 x SSD

● A single mongod was unable to fully stress these instances

● Hence “Micro-Sharding” was required to get the most out of the instances

Page 16: Tales from the Field

Micro-What?

Micro-Sharding is the practice of deploying multiple relatively small (hence “micro”) shards on large hosts to better take advantage of available resources which are difficult to utilise with a single mongod instance.

For example, 9 shards evenly distributed across 3 hosts:

HOST1: Primary1, Primary2, Primary3, Secondary4, Secondary5, Secondary6, Secondary7, Secondary8, Secondary9
HOST2: Secondary1, Secondary2, Secondary3, Primary4, Primary5, Primary6, Secondary7, Secondary8, Secondary9
HOST3: Secondary1, Secondary2, Secondary3, Secondary4, Secondary5, Secondary6, Primary7, Primary8, Primary9
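
A rough sketch of how such a layout might be registered from the mongos shell; the replica set names, hostnames and ports are assumptions for illustration:

// Each "micro" shard is a replica set with one member on each of the three
// hosts. Which member ends up primary is determined by the replica set
// configuration (e.g. priorities), not by addShard itself.
sh.addShard("shard1/host1:27018,host2:27018,host3:27018");
sh.addShard("shard2/host1:27019,host2:27019,host3:27019");
sh.addShard("shard3/host1:27020,host2:27020,host3:27020");
// ... and so on, up to shard9 ...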

Page 17: Tales from the Field

● Load tested

● Failover and Backups tested

● Procedures, architecture reviewed

● Basically, lots of testing/reviewing was done (all passed)

Extensive Pre-Production Testing

Page 18: Tales from the Field

However…

The production layout of mongod processes was actually 8 shards across 3 hosts:

HOST1: Primary1, Primary2, Primary3, Secondary4, Secondary5, Secondary6, Secondary7, Secondary8
HOST2: Secondary1, Secondary2, Secondary3, Primary4, Primary5, Primary6, Secondary7, Secondary8
HOST3: Secondary1, Secondary2, Secondary3, Secondary4, Secondary5, Secondary6, Primary7, Primary8

This layout caused a problem in production. But it was tested and had no issues, right?

Almost: the backup process was tested, and load was tested, but not together…

Page 19: Tales from the Field

The Backup Process

Backups took place on a single host (Host 2 in the layout above).

The databases were locked, an LVM snapshot was taken, and then the lock was released.

This was almost instantaneous in pre-production testing (no load), but not so in production.
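
In mongo shell terms, the backup flow was roughly the following; this is a minimal sketch of the process described above, assuming the standard fsyncLock approach, not the actual scripts:

// Run against each mongod on the backup host, before the snapshot:
db.fsyncLock();      // flush pending writes to disk and block further writes
// ... take the LVM snapshot of the data volume here (outside the shell) ...
db.fsyncUnlock();    // release the lock once the snapshot has been created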

Page 20: Tales from the Field

Backup Under Load

(The same 8-shard layout as above, with the backup running on Host 2.)

Once load was introduced to the equation, the snapshots were no longer instantaneous. On the host taking the backup, this essentially caused the primaries to become unresponsive without failing over.

That eventually caused a cascading failure, bringing the whole cluster down.

Page 21: Tales from the Field
Page 22: Tales from the Field

What did we recommend?

A new process layout was proposed, with backups still taken on Host 2, which now holds only secondaries:

HOST1: Primary1, Primary2, Primary3, Primary4, Secondary5, Secondary6, Secondary7, Secondary8
HOST2: Secondary1, Secondary2, Secondary3, Secondary4, Secondary5, Secondary6, Secondary7, Secondary8
HOST3: Secondary1, Secondary2, Secondary3, Secondary4, Primary5, Primary6, Primary7, Primary8

The database lock was removed: it was not necessary, because an LVM snapshot already provides a point-in-time copy.

Limits were also put on the maximum number of connections, just in case.
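
Since the backup host now holds only secondaries, a sanity check of this kind (hypothetical, not from the talk) could be run against each local mongod before snapshotting:

// Refuse to snapshot if the member on this host is currently a primary.
var hello = db.isMaster();
if (!hello.secondary) {
    throw "This member is not a secondary - refusing to take the snapshot";
}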

Page 23: Tales from the Field

No single cause:
● Small issue with the deployment layout
● Small error in the backup process
● Lack of integration in the testing plan
● Relatively new system
● Some bad luck

Led to:
● A large outage, and a slow, cautious recovery

Summary

Page 24: Tales from the Field

Story #3: Rita the Retailer

Rita the Retailer had an ecommerce site, selling diverse goods in 20+ countries.

Page 25: Tales from the Field

{

_id: 375,

en_US : { name : ..., description : ..., <etc...> },

en_GB : { name : ..., description : ..., <etc...> },

fr_FR : { name : ..., description : ..., <etc...> },

de_DE : ...,

de_CH : ...,

<... and so on for other locales... >

}

Product Catalog: Original Schema

Page 26: Tales from the Field

What’s good about this schema?

● Each document contains all the data about a given product, across all languages/locales

● Very efficient way to retrieve the English, French, German, etc. translations of a single product’s information in one query
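
For example, a single query by _id returns the product’s data for every locale at once:

db.catalog.find( { _id : 375 } );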

Page 27: Tales from the Field

However……

That is not how the product data is actually used

(except perhaps by translation staff)

Page 28: Tales from the Field

db.catalog.find( { _id : 375 } , { en_US : true } );

db.catalog.find( { _id : 375 } , { fr_FR : true } );

db.catalog.find( { _id : 375 } , { de_DE : true } );

... and so forth for other locales ...

Dominant Query Pattern

Page 29: Tales from the Field

Which means……

The Product Catalog’s data model did not fit the way the data was accessed.

Page 30: Tales from the Field

Consequences

● Each document contained ~20x more data than any common use case needed

● MongoDB lets you request just a subset of a document’s contents (using a projection), but…

o Typically the whole document will get loaded into RAM to serve the request

● There are other overheads for reading from disk into memory (like readahead)

Page 31: Tales from the Field

Therefore…..

Less than 5% of the data loaded into RAM from disk (one locale out of 20+) was actually required at the time - highly inefficient

Page 32: Tales from the Field

{ _id: 42, en_US : { name : ..., description : ..., <etc...> }, en_GB : { name : ..., description : ..., <etc...> }, fr_FR : { name : ..., description : ..., <etc...> }, de_DE : ..., de_CH : ..., <... and so on for other locales... > }

<READAHEAD OVERHEAD>

{ _id: 709, en_US : { name : ..., description : ..., <etc...> }, en_GB : { name : ..., description : ..., <etc...> }, fr_FR : { name : ..., description : ..., <etc...> }, de_DE : ..., de_CH : ..., <... and so on for other locales... > }

<READAHEAD OVERHEAD>

{ _id: 3600, en_US : { name : ..., description : ..., <etc...> }, en_GB : { name : ..., description : ..., <etc...> }, fr_FR : { name : ..., description : ..., <etc...> }, de_DE : ..., de_CH : ..., <... and so on for other locales... > }

Visualising the problem

- Data in RED are loaded into RAM and used.

- Data in BLUE take up memory but are not required.

- Readahead padding in GREEN makes things even more inefficient

Page 33: Tales from the Field

More RAM? It’s not that simple

Page 34: Tales from the Field

What did we recommend?

● Design for your use case, your dominant query pattern

o In this case: 99.99% of queries want the product data for exactly one locale at a time

o Hence, alter schema appropriately

● Eliminate inefficiencies on the system
  o Make reading from disk less wasteful, maximise I/O capabilities: reduce readahead settings

Page 35: Tales from the Field

Schema: Before & After

Schema Before (embedded):

{ _id: 375,
  en_US : { name : ..., description : ..., <etc...> },
  en_GB : { name : ..., description : ..., <etc...> },
  fr_FR : { name : ..., description : ..., <etc...> },
  <... and so on for other locales... >
}

Query Before:

db.catalog.find( { _id : 375 } , { en_US : true } );
db.catalog.find( { _id : 375 } , { fr_FR : true } );
db.catalog.find( { _id : 375 } , { de_DE : true } );

Schema After (document per locale):

{ _id: "375-en_US", name : ..., description : ..., <etc...> }
{ _id: "375-en_GB", name : ..., description : ..., <etc...> }
{ _id: "375-fr_FR", name : ..., description : ..., <etc...> }
... and so on for other locales ...

Query After:

db.catalog.find( { _id : "375-en_US" } );
db.catalog.find( { _id : "375-fr_FR" } );
db.catalog.find( { _id : "375-de_DE" } );
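
A hypothetical migration sketch (not part of the talk) showing how the embedded documents could be split into per-locale documents; the target collection name is an assumption:

// For each embedded product document, emit one document per locale into a
// new collection, keyed as "<productId>-<locale>".
db.catalog.find().forEach(function (product) {
    for (var locale in product) {
        if (locale === "_id") continue;              // skip the product id itself
        var doc = product[locale];                   // e.g. { name: ..., description: ... }
        doc._id = product._id + "-" + locale;        // e.g. "375-en_US"
        db.catalog_per_locale.insert(doc);
    }
});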

Page 36: Tales from the Field

Consequences of Changes

● Queries induced minimal overhead
● More than 20x as many distinct products fit in memory at once
● Disk I/O utilisation reduced
● UI latency decreased
● Happier customers
● Profit (well, we hope)

Page 37: Tales from the Field

Conclusions

● MongoDB can be used for a wide range of (sometimes pretty cool) use cases

● A small problem can seem much bigger when it happens in production

● We are here to help - if you hit a problem, it’s likely you are not the first to hit it

● We can provide a fresh perspective, and advice based on experience, to prevent and solve issues

Page 38: Tales from the Field

Adam Comerford
Senior Solutions Engineer, MongoDB

@comerford #MongoDBLondon

Questions?

Page 39: Tales from the Field

Further Reading for Retail/Catalogs

● Antoine Girbal (my team mate) has produced a full reference architecture for this type of application

o Blog part 1: http://tmblr.co/ZiOADx1RRsAWe

o Blog part 2: http://tmblr.co/ZiOADx1LfVmfm

● Detailed presentations and talks from MongoDB World:

o http://www.mongodb.com/presentations/retail-reference-architecture-part-1-flexible-searchable-low-latency-product-catalog

o http://www.mongodb.com/presentations/retail-reference-architecture-part-2-real-time-geo-distributed-inventory

o http://www.mongodb.com/presentations/retail-reference-architecture-part-3-scalable-insight-component-providing-user-history