Understanding and tuning WiredTiger, the new high performance database engine in MongoDB
Henrik Ingo, Solutions Architect, MongoDB

Upload: ontico

Post on 06-Jan-2017

Page 1: Understanding and tuning WiredTiger, the new high performance database engine in MongoDB / Henrik Ingo (MongoDB)

Understanding and tuning WiredTiger, the new high performance database engine in MongoDB

Henrik Ingo, Solutions Architect, MongoDB

Page 2:

Agenda:

- MongoDB and NoSQL
- Storage Engine API
- WiredTiger configuration + performance

Page 3:

Most popular NoSQL database

Page 4:

5 NoSQL categories

• Key Value: Redis, Riak
• Wide Column: Cassandra
• Document: MongoDB
• Graph: Neo4j
• Map Reduce: Hadoop

Page 5:

MongoDB is a Document Database

Rich Queries
• Find Paul's cars
• Find everybody in London with a car built between 1970 and 1980

Geospatial
• Find all of the car owners within 5km of Trafalgar Sq.

Text Search
• Find all the cars described as having leather seats

Aggregation
• Calculate the average value of Paul's car collection

Map Reduce
• What is the ownership pattern of colors by geography over time? (is purple trending up in China?)

MongoDB

{
    first_name: 'Paul',
    surname: 'Miller',
    city: 'London',
    location: [45.123, 47.232],
    cars: [
        { model: 'Bentley', year: 1973, value: 100000, … },
        { model: 'Rolls Royce', year: 1965, value: 330000, … }
    ]
}
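
A few of the queries on this slide can be sketched as plain query documents. This is only an illustration: the collection name `owners` and all variable names are assumptions, not something the slides specify, and the geospatial and text search examples are left out because they depend on indexes the deck does not describe.

```javascript
// Hypothetical collection name "owners"; documents shaped like the sample on this slide.

// Rich query: find Paul's cars (return only the cars array).
const paulsCarsFilter = { first_name: "Paul" };
const paulsCarsProjection = { cars: 1, _id: 0 };

// Rich query: everybody in London with a car built between 1970 and 1980.
// $elemMatch makes both bounds apply to the same element of the cars array.
const londonSeventiesFilter = {
  city: "London",
  cars: { $elemMatch: { year: { $gte: 1970, $lte: 1980 } } },
};

// Aggregation: average value of Paul's car collection.
const paulsAverageValuePipeline = [
  { $match: { first_name: "Paul" } },
  { $unwind: "$cars" },
  { $group: { _id: null, averageValue: { $avg: "$cars.value" } } },
];

// In the mongo shell these would be used roughly as:
//   db.owners.find(paulsCarsFilter, paulsCarsProjection)
//   db.owners.find(londonSeventiesFilter)
//   db.owners.aggregate(paulsAverageValuePipeline)
```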

Page 6:

Operational Database Landscape

Page 7:

MongoDB 3.0 & storage engines

Page 8:

MongoDB until 3.0

Read-heavy apps
• Great performance
– B-tree
– Low overhead
• Good scale-out perf
– Secondary reads
– Sharding

Write-heavy apps
• Good scale-out perf
– Sharding
• Per-node efficiency wish-list:
– Doc level locking
– Write-optimized data structures (LSM)
– Compression

Other
• Multi statement transactions
• In-memory engine
• SSD optimized engine
• etc...

Page 9:

Current state in MongoDB 2.6

Read-heavy apps
• Great performance
– B-tree
– Low overhead
• Good scale-out perf
– Secondary reads
– Sharding

Write-heavy apps
• Good scale-out perf
– Sharding
• Per-node efficiency wish-list:
– Doc level locking
– Write-optimized data structures (LSM)
– Compression

Other
• Complex transactions
• In-memory engine
• SSD optimized engine
• etc...

How to get all of the above?

Page 10:

MongoDB 3.0 Storage Engine API

• MMAP: Read-heavy app
• WiredTiger: Write-heavy app
• 3rd party: Special app

Page 11:

MongoDB 3.0 Storage Engine API

• MMAP: Read-heavy app
• WiredTiger: Write-heavy app
• 3rd party: Special app

• One at a time:
– Many engines built into mongod
– Choose 1 at startup
– All data stored by the same engine
– Incompatible on-disk data formats (obviously)
– Compatible client API

• Compatible Oplog & Replication
– Same replica set can mix different engines
– No-downtime migration possible

Page 12:

Some existing engines

• MMAPv1
– Improved MMAP (collection-level locking)

• WiredTiger
– Discussed next

• RocksDB
– LSM style engine developed by Facebook
– Based on LevelDB

• TokuMXse
– Fractal Tree indexing engine from Percona

Page 13:

Some rumored engines

• Heap
– In-memory engine

• Devnull
– Write all data to /dev/null
– Based on idea from famous flash animation...

• SSD optimized engine (e.g. Fusion-IO)
• KV simple key-value engine

https://github.com/mongodb/mongo/tree/master/src/mongo/db/storage

Page 14:

WiredTiger

Page 15:

What is WiredTiger

• Modern NoSQL database engine
– flexible schema

• Advanced database engine
– Secondary indexes, MVCC, non-locking algorithms
– Multi-statement transactions (not in MongoDB)

• Very modular, tunable
– Btree, LSM and columnar indexes
– Snappy, Zlib, 3rd-party compression
– Index prefix compression, etc...
– Encryption at rest

• Built by creators of BerkeleyDB
• Acquired by MongoDB in 2014
• source.wiredtiger.com, @WiredTigerInc

Page 16:

Choosing WiredTiger at server startup

mongod --storageEngine wiredTiger

http://docs.mongodb.org/master/reference/program/mongod/#cmdoption--storageEngine

Default engine:
• MongoDB 3.0 = MMAP
• MongoDB 3.2 = WiredTiger

Page 17:

Main tunables exposed as MongoDB options

mongod --storageEngine wiredTiger \
    --wiredTigerCacheSizeGB 8 \
    --wiredTigerDirectoryForIndexes \
    --wiredTigerCollectionBlockCompressor zlib \
    --dbpath /data/datafiles

--wiredTigerDirectoryForIndexes is a switch: indexes are stored in an index/ subdirectory under --dbpath.

http://docs.mongodb.org/master/reference/program/mongod/#cmdoption--storageEngine

Page 18:

All WiredTiger options via configString (hidden)

mongod --storageEngine wiredTiger \
    --wiredTigerEngineConfigString "cache_size=8GB,eviction=(threads_min=4,threads_max=8),checkpoint=(wait=30)" \
    --wiredTigerCollectionConfigString "block_compressor=zlib" \
    --wiredTigerIndexConfigString "type=lsm,block_compressor=zlib" \
    --wiredTigerDirectoryForIndexes

See docs for wiredtiger_open() & WT_SESSION::create():
http://source.wiredtiger.com/2.5.0/group__wt.html#ga9e6adae3fc6964ef837a62795c7840ed
http://source.wiredtiger.com/2.5.0/struct_w_t___s_e_s_s_i_o_n.html#a358ca4141d59c345f401c58501276bbb
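
The config strings on this slide follow a simple shape: comma-separated key=value pairs, with nested groups wrapped in parentheses. As a minimal sketch of that syntax, the helper below serializes a nested object into such a string. The function name `toConfigString` is hypothetical and not part of MongoDB or WiredTiger; it does no validation of option names.

```javascript
// Serialize a nested options object into WiredTiger's configuration string
// syntax: key=value pairs separated by commas, nested groups in parentheses.
// toConfigString is a hypothetical helper, not a MongoDB or WiredTiger API.
function toConfigString(options) {
  return Object.entries(options)
    .map(([key, value]) =>
      value !== null && typeof value === "object"
        ? `${key}=(${toConfigString(value)})`
        : `${key}=${value}`
    )
    .join(",");
}

const engineConfig = toConfigString({
  cache_size: "8GB",
  eviction: { threads_min: 4, threads_max: 8 },
  checkpoint: { wait: 30 },
});

console.log(engineConfig);
// cache_size=8GB,eviction=(threads_min=4,threads_max=8),checkpoint=(wait=30)
```

The resulting string is what would be passed, quoted, to --wiredTigerEngineConfigString.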

Page 19:

Also via createCollection(), createIndex()

db.createCollection(
    "users",
    { storageEngine: { wiredTiger: { configString: "block_compressor=none" } } }
)

http://docs.mongodb.org/master/reference/method/db.createCollection/#db.createCollection
http://docs.mongodb.org/master/reference/method/db.collection.createIndex/#db.collection.createIndex
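
The slide shows only the createCollection() form. As a sketch, the same storageEngine options document can be built once and passed to createIndex() as well. The helper name `wiredTigerOptions`, the `surname` index key and the choice of `block_compressor=zlib` are illustrative assumptions.

```javascript
// Wrap a WiredTiger config string in the options shape that
// db.createCollection() and db.collection.createIndex() accept.
// wiredTigerOptions is a hypothetical helper name.
function wiredTigerOptions(configString) {
  return { storageEngine: { wiredTiger: { configString } } };
}

const indexOptions = wiredTigerOptions("block_compressor=zlib");
console.log(JSON.stringify(indexOptions));
// {"storageEngine":{"wiredTiger":{"configString":"block_compressor=zlib"}}}

// mongo shell usage (hypothetical "surname" field on the "users" collection):
//   db.users.createIndex({ surname: 1 }, indexOptions)
```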

Page 20:

More...

• db.serverStatus()
• db.collection.stats()
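
As one example of what these commands expose, db.serverStatus() includes a wiredTiger.cache section when the server runs on WiredTiger. The sketch below reads two of its counters to estimate how full the cache is. The function name `cacheFillRatio` is hypothetical, and the sample document uses made-up numbers so the code runs without a server.

```javascript
// Compute how full the WiredTiger cache is from a serverStatus()-style document.
// cacheFillRatio is a hypothetical helper name.
function cacheFillRatio(serverStatus) {
  const cache = serverStatus.wiredTiger.cache;
  return cache["bytes currently in the cache"] / cache["maximum bytes configured"];
}

// Stand-in document with made-up numbers, so the sketch runs without a server.
const sampleStatus = {
  wiredTiger: {
    cache: {
      "bytes currently in the cache": 6 * 1024 ** 3,
      "maximum bytes configured": 8 * 1024 ** 3,
    },
  },
};

console.log(cacheFillRatio(sampleStatus)); // 0.75
```

In the mongo shell the call would be roughly cacheFillRatio(db.serverStatus()), assuming the counters come back as plain numbers.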

Page 21:

Understanding and Optimizing WiredTiger

Page 22:

Understanding WiredTiger architecture

[Diagram, layers in slide order; vertical label: WiredTiger SE]
• Btree | LSM | Columnar
• Cache (default: 50%)
• None | Snappy | Zlib
• OS Disk Cache (Default: 50%)
• Physical disk

Page 23:

Covering 90% of your optimization needs

[Architecture diagram as on page 22, annotated:]
• Decompression time
• Disk seek time

Page 24:

Strategy 1: fit working set in Cache

[Architecture diagram as on page 22, annotated:]
• cache_size = 80%

Page 25:

Strategy 2: fit working set in OS Disk Cache

[Architecture diagram as on page 22, annotated:]
• cache_size = 10%
• OS Disk Cache (Remaining: 90%)
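
Strategies 1 and 2 both come down to choosing what fraction of RAM to give the WiredTiger cache. A minimal sketch of turning such a fraction into a value for --wiredTigerCacheSizeGB follows. The function name `cacheSizeGB`, the 64 GB server and the round-down-to-whole-GB rule are assumptions made for illustration.

```javascript
// Turn a "fraction of RAM" target into a whole-GB value for --wiredTigerCacheSizeGB.
// cacheSizeGB is a hypothetical helper; rounding to whole GB is an assumption.
function cacheSizeGB(totalRamGB, fraction) {
  // Round down, but never go below 1 GB.
  return Math.max(1, Math.floor(totalRamGB * fraction));
}

const totalRamGB = 64; // hypothetical server

console.log(cacheSizeGB(totalRamGB, 0.8)); // Strategy 1: prints 51
console.log(cacheSizeGB(totalRamGB, 0.1)); // Strategy 2: prints 6
```

With Strategy 2, the memory not given to WiredTiger is left to the OS disk cache, which holds the data in its on-disk (possibly compressed) form.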

Page 26:

Strategy 3: SSD disk + compression to save €

[Architecture diagram as on page 22, annotated:]
• Physical disk: SSD

Page 27:

Strategy 4: SSD disk (no compression)

[Architecture diagram as on page 22, annotated:]
• Physical disk: SSD

Page 28:

Compression benchmarks

Page 29:

What problem is solved by LSM indexes?

[Chart; vertical axis: Performance]
• Fast reads. Easy: Add indexes
• Fast writes. Easy: No indexes
• Both. Hard: Smart schema design (hire a consultant), LSM index structures (or columnar)

Page 30:

2B inserts (with 3 secondary indexes)

http://smalldatum.blogspot.fi/2014/12/read-modify-write-optimized.html

Page 31: