tag based sharding presentation

23
Tag-based sharding Distribute your data as you need MongoDB User Group Madrid, October 13th 2015 Juan Antonio Roy Couto

Upload: juan-antonio-roy-couto

Post on 08-Apr-2017

1.409 views

Category:

Data & Analytics


3 download

TRANSCRIPT

Page 1: Tag based sharding presentation

Tag-based shardingDistribute your data as you need

MongoDB User GroupMadrid, October 13th 2015

Juan Antonio Roy Couto

Page 2: Tag based sharding presentation

Who am I?

Juan Antonio Roy Couto

Financial Software Developer

Twitter: @juanroycouto

Linkedin: https://www.linkedin.com/in/juanroycouto

Personal blog: http://www.juanroy.es

Contributor at: http://www.mongodbspain.com

Charrosfera member: http://www.charrosfera.com

Email: [email protected]

MongoDB User GroupTag-based sharding

Page 3: Tag based sharding presentation

❏ Cluster overview❏ Definitions❏ Steps for balancing❏ Steps to split a chunk❏ Migration steps❏ Normal MongoDB operation❏ Pre-splitting❏ Commands to split a chunk❏ Tag-based sharding overview❏ Tag your shards❏ Tag your chunk ranges

Table of ContentsMongoDB User GroupTag-based sharding

Page 4: Tag based sharding presentation

❏ Replica set❏ Shards❏ config servers❏ config database❏ mongos

Cluster overviewMongoDB User GroupTag-based sharding

Page 5: Tag based sharding presentation

Cluster overviewReplica Set

● High availability

● Data safety

● Disaster recovery

MongoDB User GroupTag-based sharding

Replica Set

Secondary

Secondary

Primary

Page 6: Tag based sharding presentation

Scale out

Even data distribution across all of the

shards based on a shardkey

A shardkey range belongs to only one

shard

More efficient queries

Cluster overviewShards

MongoDB User GroupTag-based sharding

Cluster

Shard 0 Shard 2Shard 1

A-I J-Q R-Z

Page 7: Tag based sharding presentation

Cluster overviewMongoDB User GroupTag-based sharding

Page 8: Tag based sharding presentation

Cluster overviewConfig servers

MongoDB User GroupTag-based sharding

● config database

● Identical information (consistency check).

● Metadata:

○ Cluster shards list

○ Data per shard (chunk ranges)

○ ...

● Don’t sync from each other.

● Default Config server (All mongos read it)

Page 9: Tag based sharding presentation

Cluster overviewconfig database

Collections:

● changelog: splits and migration information● chunks *● collections * (only sharded)● databases *● lockpings● locks● mongos● settings● shards *● system.indexes● tags● version *

* Checked for consistency

MongoDB User GroupTag-based sharding

Page 10: Tag based sharding presentation

● Receives client requests and returns results.

● Reads the metadata and sends the query to the necessary

shard/shards

● Does not store data

● Keeps a cache version of the metadata. We can refresh it by:

○ mongos>db.runCommand( { flushRouterConfig : 1 } )

○ or restarting the server

Cluster overviewmongos

MongoDB User GroupTag-based sharding

Page 11: Tag based sharding presentation

MongoDB User GroupTag-based shardingDefinitions

● Range: Data division based on the values of the shardkey.

● Chunk: They are not physical data. Chunks are just a logical

grouping of data into ranges (64MB by default).

● Split: Chunk division. No data is moved.

● Migration: Chunk movements between shards in order to get an

even distribution. Only one chunk is moved at a time.

● Balanced system: The same number of chunks per shard.

● Balancer: Checks if a migration is needed and starts it.

Page 12: Tag based sharding presentation

MongoDB User GroupTag-based sharding

Split

Migration

Steps for balancing

Page 13: Tag based sharding presentation

Steps to split a chunkMongoDB User GroupTag-based sharding

Shard 0Final

Shard 0

Chunk 1 Chunk 1

Chunk 2

Chunk 2

Chunk 3

Config server

mongos

1. Needs chunk 2 to be splitted?

2. Split pointslist

3. Updatemetadata

4. Refreshcache

Page 14: Tag based sharding presentation

Migration stepsMongoDB User GroupTag-based sharding

Shard 1 (To)

Shard 0 (From)

Chunk 4

Chunk 1

Chunk 2

Chunk 3

Config server

mongos

1. Is balancer running?2. Is the system imbalance?3. Pick chunk 34. Split chunk 3?5. Begin (1)

6. Deletes finished?(2)

7. Read this chunk(3)

8. Transfer(4)

9. Update metadata(5)

10. Remove chunk 3 from shard 0(6)

11. Refresh mongos cache Chunk 3

1

4

78911

Page 15: Tag based sharding presentation

Normal MongoDB operationMongoDB User GroupTag-based sharding

Shard 0 Shard 1 Shard 2 Shard 3

mongosClient

Migrations

Page 16: Tag based sharding presentation

Useful for storing data directly in the shards (massive data loads).

Avoid bottlenecks.

MongoDB does not need to split or migrate chunks.

After the split, the migration must be finished before data loading.

Pre-splittingMongoDB User GroupTag-based sharding

Cluster

Shard 0 Shard 2Shard 1

Chunk 1

Chunk 5

Chunk 3

Chunk 4

Chunk 2

Page 17: Tag based sharding presentation

Splitting a chunk:

mongos>for (var i=0; i<20, i++) {

sh.splitAt(“testdb.presplit”, { x : 1000*i } );

}

Querying existing chunks:

mongos>use config

mongos>db.chunks.find( { ns : “testdb.presplit” } )

Commands to split a chunkMongoDB User GroupTag-based sharding

Page 18: Tag based sharding presentation

Tags are used when you want to pin ranges to a specific shard.

Tag-based sharding overviewMongoDB User GroupTag-based sharding

shard0

EMEA

shard1

APAC

shard2

LATAM

shard3

NORAM

Page 19: Tag based sharding presentation

mongos>sh.addShardTag(“shard0”, “EMEA”)

mongos>sh.addShardTag(“shard1”, “APAC”)

mongos>sh.addShardTag(“shard2”, “LATAM”)

mongos>sh.addShardTag(“shard3”, “NORAM”)

Tag your shardsMongoDB User GroupTag-based sharding

Page 20: Tag based sharding presentation

mongos>sh.addTagRange( namespace, minimum, maximum, tag )

mongos>sh.addTagRange( “testdb.tagrange”,

{ “x” : 0 },

{ “x” : 1000 },

“EMEA” )

minimum: the minimum value (inclusive) of the shard key range to include in the tag.

maximum: the maximum value (exclusive) of the shard key range to include in the tag.

Tag your chunk rangesMongoDB User GroupTag-based sharding

Page 21: Tag based sharding presentation

Questions?

Questions?MongoDB User GroupTag-based sharding

Page 22: Tag based sharding presentation

We are looking for writers

mongodbspain.comMongoDB User GroupTag-based sharding

Page 23: Tag based sharding presentation

Thank you for your attention!

MongoDB User GroupTag-based shardingMadrid, October 13th 2015

Juan Antonio Roy Couto