mongouk 2011 schema design

52
Alvin Richards [email protected] Schema Design

Upload: alvin-john-richards

Post on 03-Mar-2015

144 views

Category:

Documents


0 download

DESCRIPTION

Schema design... how to I represent 1:1, 1:M and trees in JSON? In this talk we aim to explain schema design in MongoDB.

TRANSCRIPT

Page 1: MongoUK 2011 Schema Design

Alvin  Richards  -­‐  [email protected]

Schema Design

Page 2: MongoUK 2011 Schema Design

Topics

Introduction• Basic Data Modeling• Evolving a schema

Common patterns• Single table inheritance• One-to-Many & Many-to-Many• Trees• Queues

Page 3: MongoUK 2011 Schema Design

So why model data?

http://www.flickr.com/photos/42304632@N00/493639870/

Page 4: MongoUK 2011 Schema Design

A brief history of normalization• 1970 E.F.Codd introduces 1st Normal Form (1NF)• 1971 E.F.Codd introduces 2nd and 3rd Normal Form (2NF, 3NF)• 1974 Codd & Boyce define Boyce/Codd Normal Form (BCNF)• 2002 Date, Darween, Lorentzos define 6th Normal Form (6NF)

Goals:• Avoid anomalies when inserting, updating or deleting• Minimize redesign when extending the schema• Make the model informative to users• Avoid bias towards a particular style of query

* source : wikipedia

Page 5: MongoUK 2011 Schema Design

The real benefit of relational

• Before relational• Data and Logic combined

• After relational• Separation of concerns• Data modeled independent of logic• Logic freed from concerns of data design

• MongoDB continues this separation

Page 6: MongoUK 2011 Schema Design

Relational made normalized data look like this

Page 7: MongoUK 2011 Schema Design

Document databases make normalized data look like this

Page 8: MongoUK 2011 Schema Design

Terminology

RDBMS MongoDB

Table Collection

Row(s) JSON  Document

Index Index

Join Embedding  &  Linking

Partition Shard

Partition  Key Shard  Key

Page 9: MongoUK 2011 Schema Design

So today’s example will use...

Page 10: MongoUK 2011 Schema Design

Design Session

Design documents that simply map to your application>  post  =  {author:  "Hergé",                    date:  ISODate("2011-­‐09-­‐18T09:56:06.298Z"),                    text:  "Destination  Moon",                    tags:  ["comic",  "adventure"]}

>  db.posts.save(post)

Page 11: MongoUK 2011 Schema Design

>  db.posts.find()

   {  _id:  ObjectId("4c4ba5c0672c685e5e8aabf3"),        author:  "Hergé",          date:  ISODate("2011-­‐09-­‐18T09:56:06.298Z"),          text:  "Destination  Moon",          tags:  [  "comic",  "adventure"  ]    }     Notes:• ID must be unique, but can be anything you’d like• MongoDB will generate a default ID if one is not supplied

Find the document

Page 12: MongoUK 2011 Schema Design

Secondary index for “author”

 //      1  means  ascending,  -­‐1  means  descending

 >  db.posts.ensureIndex({author:  1})

 >  db.posts.find({author:  'Hergé'})          {  _id:  ObjectId("4c4ba5c0672c685e5e8aabf3"),          date:  ISODate("2011-­‐09-­‐18T09:56:06.298Z"),          author:  "Hergé",            ...  }

Add and index, find via Index

Page 13: MongoUK 2011 Schema Design

Examine the query plan>  db.blogs.find({author:  'Hergé'}).explain(){   "cursor"  :  "BtreeCursor  author_1",   "nscanned"  :  1,   "nscannedObjects"  :  1,   "n"  :  1,   "millis"  :  5,   "indexBounds"  :  {     "author"  :  [       [         "Hergé",         "Hergé"       ]     ]   }}

Page 14: MongoUK 2011 Schema Design

Query operatorsConditional operators: $ne, $in, $nin, $mod, $all, $size, $exists, $type, .. $lt, $lte, $gt, $gte, $ne,

//  find  posts  with  any  tags>  db.posts.find({tags:  {$exists:  true}})

Page 15: MongoUK 2011 Schema Design

Query operatorsConditional operators: $ne, $in, $nin, $mod, $all, $size, $exists, $type, .. $lt, $lte, $gt, $gte, $ne,

//  find  posts  with  any  tags>  db.posts.find({tags:  {$exists:  true}})

Regular expressions://  posts  where  author  starts  with  h>  db.posts.find({author:  /^h/i  })  

Page 16: MongoUK 2011 Schema Design

Query operatorsConditional operators: $ne, $in, $nin, $mod, $all, $size, $exists, $type, .. $lt, $lte, $gt, $gte, $ne,

//  find  posts  with  any  tags>  db.posts.find({tags:  {$exists:  true}})

Regular expressions://  posts  where  author  starts  with  h>  db.posts.find({author:  /^h/i  })  

Counting: //  number  of  posts  written  by  Hergé>  db.posts.find({author:  “Hergé”}).count()

Page 17: MongoUK 2011 Schema Design

Extending the Schema        new_comment  =  {author:  “Kyle”,                                  date:  new  Date(),                                text:  “great  book”}

 >  db.posts.update(                      {text:  “Destination  Moon”  },                        {  ‘$push’:  {comments:  new_comment},                          ‘$inc’:    {comments_count:  1}})

Page 18: MongoUK 2011 Schema Design

     {  _id  :  ObjectId("4c4ba5c0672c685e5e8aabf3"),          author  :  "Hergé",        date  :  ISODate("2011-­‐09-­‐18T09:56:06.298Z"),          text  :  "Destination  Moon",        tags  :  [  "comic",  "adventure"  ],                comments  :  [   {     author  :  "Kyle",     date  :  ISODate("2011-­‐09-­‐19T09:56:06.298Z"),     text  :  "great  book"   }        ],        comments_count:  1    }    

Extending the Schema

Page 19: MongoUK 2011 Schema Design

//  create  index  on  nested  documents:>  db.posts.ensureIndex({"comments.author":  1})

>  db.posts.find({comments.author:”Kyle”})

Extending the Schema

Page 20: MongoUK 2011 Schema Design

//  create  index  on  nested  documents:>  db.posts.ensureIndex({"comments.author":  1})

>  db.posts.find({comments.author:”Kyle”})

//  find  last  5  posts:>  db.posts.find().sort({date:-­‐1}).limit(5)

Extending the Schema

Page 21: MongoUK 2011 Schema Design

//  create  index  on  nested  documents:>  db.posts.ensureIndex({"comments.author":  1})

>  db.posts.find({comments.author:”Kyle”})

//  find  last  5  posts:>  db.posts.find().sort({date:-­‐1}).limit(5)

//  most  commented  post:>  db.posts.find().sort({comments_count:-­‐1}).limit(1)

When sorting, check if you need an index

Extending the Schema

Page 22: MongoUK 2011 Schema Design

Watch for full table scans

>  db.blogs.find({text:  'Destination  Moon'}).explain()    {   "cursor"  :  "BasicCursor",   "nscanned"  :  1,   "nscannedObjects"  :  1,   "n"  :  1,   "millis"  :  0,   "indexBounds"  :  {       }}

Page 23: MongoUK 2011 Schema Design

Group

• Equivalent to a Group By in SQL

• Specific the attributes to group the data

• Process the results in a Reduce function

Page 24: MongoUK 2011 Schema Design

Group - Count post by Authorcmd  =  {  key:  {  "author":true  },                initial:  {count:  0},                reduce:  function(obj,  prev)  {                                prev.count++;                            },            };result  =  db.posts.group(cmd);

[   {     "author"  :  "Hergé",     "count"  :  1   },   {     "author"  :  "Kyle",     "count"  :  3   }]

Page 25: MongoUK 2011 Schema Design

Common Patterns

Page 26: MongoUK 2011 Schema Design

Inheritance

Page 27: MongoUK 2011 Schema Design

Single Table Inheritance - RDBMS

shapes tableid type area radius d length width

1 circle 3.14 1

2 square 4 2

3 rect 10 5 2

Page 28: MongoUK 2011 Schema Design

Single Table Inheritance - MongoDB>  db.shapes.find()  {  _id:  "1",  type:  "circle",area:  3.14,  radius:  1}  {  _id:  "2",  type:  "square",area:  4,  d:  2}  {  _id:  "3",  type:  "rect",    area:  10,  length:  5,  width:  2}

Page 29: MongoUK 2011 Schema Design

Single Table Inheritance - MongoDB>  db.shapes.find()  {  _id:  "1",  type:  "circle",area:  3.14,  radius:  1}  {  _id:  "2",  type:  "square",area:  4,  d:  2}  {  _id:  "3",  type:  "rect",    area:  10,  length:  5,  width:  2}

//  find  shapes  where  radius  >  0  >  db.shapes.find({radius:  {$gt:  0}})

Page 30: MongoUK 2011 Schema Design

Single Table Inheritance - MongoDB>  db.shapes.find()  {  _id:  "1",  type:  "circle",area:  3.14,  radius:  1}  {  _id:  "2",  type:  "square",area:  4,  d:  2}  {  _id:  "3",  type:  "rect",    area:  10,  length:  5,  width:  2}

//  find  shapes  where  radius  >  0  >  db.shapes.find({radius:  {$gt:  0}})

//  create  index>  db.shapes.ensureIndex({radius:  1})

Page 31: MongoUK 2011 Schema Design

One to ManyOne to Many relationships can specify• degree of association between objects• containment• life-cycle

Page 32: MongoUK 2011 Schema Design

One to Many- Embedded Array / Array Keys - slice operator to return subset of array - some queries harder e.g find latest comments across all documents

blogs:  {                author  :  "Hergé",        date  :  ISODate("2011-­‐09-­‐18T09:56:06.298Z"),          comments  :  [      {     author  :  "Kyle",     date  :  ISODate("2011-­‐09-­‐19T09:56:06.298Z"),     text  :  "great  book"      }        ]}

Page 33: MongoUK 2011 Schema Design

One to Many- Embedded tree - Single document - Natural - Hard to query

blogs:  {                author  :  "Hergé",        date  :  ISODate("2011-­‐09-­‐18T09:56:06.298Z"),          comments  :  [      {     author  :  "Kyle",     date  :  ISODate("2011-­‐09-­‐19T09:56:06.298Z"),     text  :  "great  book",                replies:  [  {  author  :  “James”,  ...}  ]      }        ]}

Page 34: MongoUK 2011 Schema Design

One to Many- Normalized (2 collections) - most flexible - more queries

blogs:  {                author  :  "Hergé",        date  :  ISODate("2011-­‐09-­‐18T09:56:06.298Z"),          comments  :  [        {comment  :  ObjectId(“1”)}        ]}

comments  :  {  _id  :  “1”,                          author  :  "James",              date  :  ISODate("2011-­‐09-­‐19T09:56:06.298Z")}

Page 35: MongoUK 2011 Schema Design

One to Many - patterns

- Embedded Array / Array Keys

- Embedded Array / Array Keys- Embedded tree- Normalized

Page 36: MongoUK 2011 Schema Design

Many - ManyExample: - Product can be in many categories- Category can have many products

Page 37: MongoUK 2011 Schema Design

products:      {  _id:  ObjectId("10"),          name:  "Destination  Moon",          category_ids:  [  ObjectId("20"),                                          ObjectId("30”]}    

Many - Many

Page 38: MongoUK 2011 Schema Design

products:      {  _id:  ObjectId("10"),          name:  "Destination  Moon",          category_ids:  [  ObjectId("20"),                                          ObjectId("30”]}    categories:      {  _id:  ObjectId("20"),            name:  "adventure",            product_ids:  [  ObjectId("10"),                                        ObjectId("11"),                                        ObjectId("12"]}

Many - Many

Page 39: MongoUK 2011 Schema Design

products:      {  _id:  ObjectId("10"),          name:  "Destination  Moon",          category_ids:  [  ObjectId("20"),                                          ObjectId("30”]}    categories:      {  _id:  ObjectId("20"),            name:  "adventure",            product_ids:  [  ObjectId("10"),                                        ObjectId("11"),                                        ObjectId("12"]}

//All  categories  for  a  given  product>  db.categories.find({product_ids:  ObjectId("10")})

Many - Many

Page 40: MongoUK 2011 Schema Design

products:      {  _id:  ObjectId("10"),          name:  "Destination  Moon",          category_ids:  [  ObjectId("20"),                                          ObjectId("30”]}    categories:      {  _id:  ObjectId("20"),            name:  "adventure"}

Alternative

Page 41: MongoUK 2011 Schema Design

products:      {  _id:  ObjectId("10"),          name:  "Destination  Moon",          category_ids:  [  ObjectId("20"),                                          ObjectId("30”]}    categories:      {  _id:  ObjectId("20"),            name:  "adventure"}

//  All  products  for  a  given  category>  db.products.find({category_ids:  ObjectId("20")})  

Alternative

Page 42: MongoUK 2011 Schema Design

products:      {  _id:  ObjectId("10"),          name:  "Destination  Moon",          category_ids:  [  ObjectId("20"),                                          ObjectId("30”]}    categories:      {  _id:  ObjectId("20"),            name:  "adventure"}

//  All  products  for  a  given  category>  db.products.find({category_ids:  ObjectId("20")})  

//  All  categories  for  a  given  productproduct    =  db.products.find(_id  :  some_id)>  db.categories.find({_id  :  {$in  :  product.category_ids}})  

Alternative

Page 43: MongoUK 2011 Schema Design

TreesFull Tree in Document

{  comments:  [          {  author:  “Kyle”,  text:  “...”,                replies:  [                                            {author:  “James”,  text:  “...”,                                              replies:  []}                ]}    ]}

Pros: Single Document, Performance, Intuitive

Cons: Hard to search, Partial Results, 16MB limit

   

Page 44: MongoUK 2011 Schema Design

TreesParent Links- Each node is stored as a document- Contains the id of the parent

Child Links- Each node contains the id’s of the children- Can support graphs (multiple parents / child)

Page 45: MongoUK 2011 Schema Design

Array of Ancestors- Store all Ancestors of a node    {  _id:  "a"  }    {  _id:  "b",  ancestors:  [  "a"  ],  parent:  "a"  }    {  _id:  "c",  ancestors:  [  "a",  "b"  ],  parent:  "b"  }    {  _id:  "d",  ancestors:  [  "a",  "b"  ],  parent:  "b"  }    {  _id:  "e",  ancestors:  [  "a"  ],  parent:  "a"  }    {  _id:  "f",  ancestors:  [  "a",  "e"  ],  parent:  "e"  }

Page 46: MongoUK 2011 Schema Design

Array of Ancestors- Store all Ancestors of a node    {  _id:  "a"  }    {  _id:  "b",  ancestors:  [  "a"  ],  parent:  "a"  }    {  _id:  "c",  ancestors:  [  "a",  "b"  ],  parent:  "b"  }    {  _id:  "d",  ancestors:  [  "a",  "b"  ],  parent:  "b"  }    {  _id:  "e",  ancestors:  [  "a"  ],  parent:  "a"  }    {  _id:  "f",  ancestors:  [  "a",  "e"  ],  parent:  "e"  }

//find  all  descendants  of  b:

>  db.tree2.find({ancestors:  ‘b’})

//find  all  direct  descendants  of  b:

>  db.tree2.find({parent:  ‘b’})

Page 47: MongoUK 2011 Schema Design

Array of Ancestors- Store all Ancestors of a node    {  _id:  "a"  }    {  _id:  "b",  ancestors:  [  "a"  ],  parent:  "a"  }    {  _id:  "c",  ancestors:  [  "a",  "b"  ],  parent:  "b"  }    {  _id:  "d",  ancestors:  [  "a",  "b"  ],  parent:  "b"  }    {  _id:  "e",  ancestors:  [  "a"  ],  parent:  "a"  }    {  _id:  "f",  ancestors:  [  "a",  "e"  ],  parent:  "e"  }

//find  all  descendants  of  b:

>  db.tree2.find({ancestors:  ‘b’})

//find  all  direct  descendants  of  b:

>  db.tree2.find({parent:  ‘b’})

//find  all  ancestors  of  f:>  ancestors  =  db.tree2.findOne({_id:’f’}).ancestors>  db.tree2.find({_id:  {  $in  :  ancestors})

Page 48: MongoUK 2011 Schema Design

Trees as PathsStore hierarchy as a path expression- Separate each node by a delimiter, e.g. “/”- Use text search for find parts of a tree

{  comments:  [          {  author:  “Kyle”,  text:  “initial  post”,                path:  “/”  },          {  author:  “Jim”,    text:  “jim’s  comment”,              path:  “/jim”  },          {  author:  “Kyle”,  text:  “Kyle’s  reply  to  Jim”,              path  :  “/jim/kyle”}  ]  }

//  Find  the  conversations  Jim  was  part  of  >  db.posts.find({path:  /^jim/i})

Page 49: MongoUK 2011 Schema Design

Queue• Need to maintain order and state• Ensure that updates to the queue are atomic

     {  inprogress:  false,          priority:  1,        ...      }

Page 50: MongoUK 2011 Schema Design

Queue• Need to maintain order and state• Ensure that updates to the queue are atomic

     {  inprogress:  false,          priority:  1,        ...      }

//  find  highest  priority  job  and  mark  as  in-­‐progressjob  =  db.jobs.findAndModify({                              query:    {inprogress:  false},                              sort:      {priority:  -­‐1},                                update:  {$set:  {inprogress:  true,                                                                started:  new  Date()}},                              new:  true})    

Page 51: MongoUK 2011 Schema Design

Summary

Schema design is different in MongoDB

Basic data design principals stay the same

Focus on how the apps manipulates data

Rapidly evolve schema to meet your requirements

Enjoy your new freedom, use it wisely :-)

Page 52: MongoUK 2011 Schema Design

@mongodb

conferences,  appearances,  and  meetupshttp://www.10gen.com/events

http://bit.ly/mongo1  Facebook                    |                  Twitter                  |                  LinkedIn

http://linkd.in/joinmongo

download at mongodb.org

[email protected]