schema design

42
Schema Design Software Engineer, MongoDB Craig Wilson #MongoDBDays @craiggwilson

Upload: mongodb

Post on 14-Dec-2014

420 views

Category:

Technology


3 download

DESCRIPTION

Schema Design with Craig Wilson

TRANSCRIPT

Page 1: Schema Design

Schema Design

Software Engineer, MongoDB

Craig Wilson

#MongoDBDays

@craiggwilson

Page 2: Schema Design

All application development isSchema Design

Page 3: Schema Design

Success comes from aProper Data Structure

Page 4: Schema Design

Terminology

RDBMS MongoDB

Database ➜ Database

Table ➜ Collection

Row ➜ Document

Index ➜ Index

Join ➜ Embedding & Linking

Page 5: Schema Design

Working with Documents

Page 6: Schema Design

{ _id: “123”, title: "MongoDB: The Definitive Guide", authors: [ { _id: "kchodorow", name: "Kristina Chodorow“ }, { _id: "mdirold", name: “Mike Dirolf“ } ], published_date: ISODate("2010-09-24"), pages: 216, language: "English", publisher: { name: "O’Reilly Media", founded: "1980", location: "CA" }}

What is a Document?

Page 7: Schema Design

Traditional Schema DesignFocus on Data Storage

Page 8: Schema Design

Document Schema Design

Focus on Data Usage

Page 9: Schema Design

Traditional Schema DesignWhat answers do I have?

Page 10: Schema Design

Document Schema Design

What questions do I have?

Page 11: Schema Design

Schema Design By Example

Page 12: Schema Design

Library Management Application

• Patrons/Users

• Books

• Authors

• Publishers

Page 13: Schema Design

Question:What is a Patron’s Address?

Page 14: Schema Design

> patron = db.patrons.find({ _id : “joe” }){ _id: "joe“, name: "Joe Bookreader”}

> address = db.addresses.find({ _id : “joe” }){ _id: "joe“, street: "123 Fake St. ", city: "Faketon", state: "MA", zip: 12345}

A Patron and their Address

Page 15: Schema Design

> patron = db.patrons.find({ _id : “joe” }){ _id: "joe", name: "Joe Bookreader", address: { street: "123 Fake St. ", city: "Faketon", state: "MA", zip: 12345 }}

A Patron and their Address

Page 16: Schema Design

One-to-One Relationships

• “Belongs to” relationships are often embedded.

• Holistic representation of entities with their embedded attributes and relationships.

• Optimized for read performance

Page 17: Schema Design

Question:What are a Patron’s Addresses?

Page 18: Schema Design

> patron = db.patrons.find({ _id : “bob” }){ _id: “bob", name: “Bob Knowitall", addresses: [ {street: "1 Vernon St.", city: "Newton", …}, {street: "52 Main St.", city: "Boston", …}, ]}

A Patron and their Addresses

Page 19: Schema Design

> patron = db.patrons.find({ _id : “bob” }){ _id: “bob", name: “Bob Knowitall", addresses: [ {street: "1 Vernon St.", city: "Newton", …}, {street: "52 Main St.", city: "Boston", …}, ]}

> patron = db.patrons.find({ _id : “joe” }){ _id: "joe", name: "Joe Bookreader", address: { street: "123 Fake St. ", city: "Faketon", …}}

A Patron and their Addresses

Page 20: Schema Design

Migration Possibilities

• Migrate all documents when the schema changes.

• Migrate On-Demand– As we pull up a patron’s document, we make the

change.– Any patrons that never come into the library never

get updated.

• Leave it alone– As long as the application knows about both

types…

Page 21: Schema Design

Question:Who is the publisher of this book?

Page 22: Schema Design

Book

MongoDB: The Definitive Guide,

By Kristina Chodorow and Mike Dirolf

Published: 9/24/2010

Pages: 216

Language: English

Publisher: O’Reilly Media, CA

Page 23: Schema Design

> book = db.books.find({ _id : “123” }){ _id: “123”, title: "MongoDB: The Definitive Guide", authors: [ "Kristina Chodorow", "Mike Dirolf" ], published_date: ISODate("2010-09-24"), pages: 216, language: "English", publisher: { name: "O’Reilly Media", founded: "1980", location: "CA" }}

Book with embedded Publisher

Page 24: Schema Design

Book with embedded Publisher

• Optimized for read performance of Books

• Other queries become difficult

Page 25: Schema Design

Question:Who are all the publishers in the system?

Page 26: Schema Design

> publishers = db.publishers.find(){ _id: “oreilly”, name: "O’Reilly Media", founded: "1980", location: "CA"}{ _id: “penguin”, name: “Penguin”, founded: “1983”, location: “CA”}

All Publishers

Page 27: Schema Design

> book = db.books.find({ _id: “123” }){ _id: “123”, publisher_id: “oreilly”, title: "MongoDB: The Definitive Guide", authors: [ "Kristina Chodorow", "Mike Dirolf" ], published_date: ISODate("2010-09-24"), pages: 216, language: "English"}

> db.publishers.find({ _id : book.publisher_id }){ _id: “oreilly”, name: "O’Reilly Media", founded: "1980", location: "CA"}

Book with linked Publisher

Page 28: Schema Design

Question:What are all the books a publisher has published?

Page 29: Schema Design

> publisher = db.publishers.find({ _id : “oreilly” }){ _id: “oreilly”, name: "O’Reilly Media", founded: "1980", location: "CA“, books: [“123”,…]}

> books = db.books.find({ _id: { $in : publisher.books } })

Publisher with linked Books

Page 30: Schema Design

Question:Who are the authors of a given book?

Page 31: Schema Design

> book = db.books.find({ _id : “123” }){ _id: “123”, title: "MongoDB: The Definitive Guide", published_date: ISODate("2010-09-24"), pages: 216, language: "English“, authors: [“kchodorow”, “mdirolf”]}

> authors = db.authors.find({ _id : { $in : book.authors } })

{ _id: "kchodorow", name: "Kristina Chodorow”, hometown: … }{ _id: “mdirolf", name: “Mike Dirolf“, hometown: … }

Books with linked Authors

Page 32: Schema Design

> book = db.books.find({ _id : “123” }){ _id: “123”, title: "MongoDB: The Definitive Guide", published_date: ISODate("2010-09-24"), pages: 216, language: "English“, authors: [ { id: "kchodorow", name: "Kristina Chodorow” }, { id: "mdirolf", name: "Mike Dirolf” } ]}

Books with linked Authors

Page 33: Schema Design

Question:What are all the books an author has written?

Page 34: Schema Design

> authors = db.authors.find({ _id : “kchodorow” }){ _id: "kchodorow", name: "Kristina Chodorow", hometown: "Cincinnati", books: [ {id: “123”, title : "MongoDB: The Definitive Guide“ } ]}

Authors with linked Books

Page 35: Schema Design

> authors = db.authors.find({ _id : “kchodorow” }){ _id: "kchodorow", name: "Kristina Chodorow", hometown: "Cincinnati", books: [ {id: “123”, title : "MongoDB: The Definitive Guide“ } ]}

> book = db.books.find({ _id : “123” }){ _id: “123”, title: "MongoDB: The Definitive Guide", authors: [ { id: "kchodorow", name: "Kristina Chodorow” }, { id: "mdirolf", name: "Mike Dirolf” } ]}

Links on both Authors and Books

Page 36: Schema Design

Linking vs. Embedding

• Embedding– Great for read performance– Writes can be slow– Data integrity needs to be managed

• Linking– Flexible– Data integrity is built-in– Work is done during reads

Page 37: Schema Design

Question:What are all the books about databases?

Page 38: Schema Design

> book = db.books.find({ _id : “123” }){ _id: “123”, title: "MongoDB: The Definitive Guide", category: “MongoDB”}

> categories = db.categories.find({ _id: “MongoDB” }){ _id: “MongoDB”, parent: “Databases”}

Categories as Documents

Page 39: Schema Design

> book = db.books.find({ _id : “123” }){ _id: “123”, title: "MongoDB: The Definitive Guide", categories: [“MongoDB”, “Databases”, “Programming”]}

> db.books.find({ categories: “Databases” })

Categories as an Array

Page 40: Schema Design

> book = db.books.find({ _id : “123” }){ _id: “123”, title: "MongoDB: The Definitive Guide", category: “Programming/Databases/MongoDB”}

> db.books.find({ category: ^Programming/Databases/* })

Categories as a Path

Page 41: Schema Design

Conclusion

• Schema design is different in MongoDB

• Basic data design principals stay the same

• Focus on how an application accesses/manipulates data

• Evolve the schema to meet requirements as they change

Page 42: Schema Design

Schema Design

Software Engineer, 10gen

Craig Wilson

#MongoDBDays

@craiggwilson