NoSQL and Data Modeling for Data Modelers

Big Data, NoSQL & Data Modeling 10 Tips for Data Modeling Success on Modern Data Projects Karen Lopez, InfoAdvisors www.datamodel.com


Post on 12-Jul-2015


TRANSCRIPT

Page 1: NoSQL and Data Modeling for Data Modelers

Big Data, NoSQL & Data Modeling

10 Tips for Data Modeling Success on Modern Data Projects

Karen Lopez, InfoAdvisors, www.datamodel.com

Page 2: NoSQL and Data Modeling for Data Modelers

Data Models – Traditional Process

Conceptual (Data) Model

Logical Data Model

Physical Data Model(s)

[Diagram: multiple OLTP and MART databases]

Aug 2014 © InfoAdvisors - infoadvisors.com

Page 3: NoSQL and Data Modeling for Data Modelers

Relational


Data models started with relational modeling, so they look like relational database structures.

Page 4: NoSQL and Data Modeling for Data Modelers

But….

That doesn’t mean they can’t be used to model data that goes into a non-relational format.

All that formatting happens at build OR consumption time, not requirements time.
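As a rough illustration of that point, the sketch below keeps one small logical definition and renders it two ways at build time: as relational DDL and as a JSON document. The `customer` entity, its attributes, the type mapping, and the helper names are all hypothetical and are not taken from the talk.

```python
import json

# Hypothetical logical model: attribute name -> logical type.
CUSTOMER = {"customer_id": "integer", "name": "text", "postal_code": "text"}

# Assumed build-time mapping from logical types to one relational dialect.
SQL_TYPES = {"integer": "INTEGER", "text": "VARCHAR(100)"}


def to_create_table(entity_name, attributes):
    """Render the logical entity as a relational CREATE TABLE statement."""
    columns = ",\n".join(
        f"  {name} {SQL_TYPES[logical_type]}"
        for name, logical_type in attributes.items()
    )
    return f"CREATE TABLE {entity_name} (\n{columns}\n);"


def to_document(attributes, values):
    """Render one instance of the same entity as a JSON document."""
    missing = set(attributes) - set(values)
    if missing:
        raise ValueError(f"missing attributes: {sorted(missing)}")
    return json.dumps({name: values[name] for name in attributes})


# Same requirements, two physical formats chosen at build time.
ddl = to_create_table("customer", CUSTOMER)
doc = to_document(CUSTOMER, {"customer_id": 1, "name": "Ada", "postal_code": "02101"})
```

The logical definition does not change when the target format does; only the rendering function differs.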


Page 5: NoSQL and Data Modeling for Data Modelers

The Big Data Story

Lots of data

Coming at us fast

Lots of variety in format & quality

We want all the data

Highly available

“It’s web scale”

Page 6: NoSQL and Data Modeling for Data Modelers

What do we really mean by scale?

Bringing computing to the data

Massively parallel processing

Cheap, commodity hardware, but lots of it

Optimized for Query/Reads/Questions/Telling stories


Page 7: NoSQL and Data Modeling for Data Modelers

We’ve been down this road before…

[Spectrum: highly normalized to highly denormalized]

Traditional transactional applications

Reporting-optimized tables/structures

Data Warehouse / Dimensional Modeling
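To make the two ends of that range concrete, here is a minimal sketch using in-memory Python structures. The customers, regions, and orders are made-up sample data, and the function names are hypothetical.

```python
# Hypothetical normalized (transactional) structures: each fact is stored once.
customers = {1: {"name": "Ada", "region_id": 10}}
regions = {10: {"region_name": "East"}}
orders = [
    {"order_id": 100, "customer_id": 1, "amount": 25.0},
    {"order_id": 101, "customer_id": 1, "amount": 40.0},
]


def denormalize(orders, customers, regions):
    """Join the normalized structures into flat, reporting-friendly rows."""
    rows = []
    for order in orders:
        customer = customers[order["customer_id"]]
        region = regions[customer["region_id"]]
        rows.append({
            "order_id": order["order_id"],
            "amount": order["amount"],
            "customer_name": customer["name"],      # repeated on every row
            "region_name": region["region_name"],   # repeated on every row
        })
    return rows


def total_by_region(rows):
    """Sum order amounts per region without any further lookups."""
    totals = {}
    for row in rows:
        key = row["region_name"]
        totals[key] = totals.get(key, 0.0) + row["amount"]
    return totals


flat_rows = denormalize(orders, customers, regions)
report = total_by_region(flat_rows)
```

The flattened rows repeat the customer and region names, which costs storage and update effort but lets the report run without joins.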

Page 8: NoSQL and Data Modeling for Data Modelers

ETL

EDW

Data Mart

Data Mart

Page 9: NoSQL and Data Modeling for Data Modelers

Hadoop

ETL

EDW

Analytics Mart

Data Mart

Page 10: NoSQL and Data Modeling for Data Modelers

NoSQL, Not Only SQL

Relational

Graph

Columnar/Column Family

Key Value

Document Databases

Others


Page 11: NoSQL and Data Modeling for Data Modelers

Sample Hive Statement

CREATE EXTERNAL TABLE TaxRebateUsage (
  state string,
  zipcode string,
  agi_class int,
  n1 int,
  mars2 int,
  prep int,
  n2 int
)
-- Fields in each text line are separated by commas
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
STORED AS TEXTFILE;
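The Hive statement describes comma-delimited text and gives each field a name and type. A rough Python analogue of that idea, under stated assumptions, is sketched below: the column names and types are copied from the statement, while the sample lines, `SCHEMA`, and `read_rows` are made up for illustration. Error handling here is a choice of this sketch and may differ from what a real engine does with malformed values.

```python
import csv
import io

# Column names and types copied from the Hive statement.
SCHEMA = [
    ("state", str), ("zipcode", str), ("agi_class", int),
    ("n1", int), ("mars2", int), ("prep", int), ("n2", int),
]


def read_rows(text_file):
    """Apply SCHEMA to each comma-delimited line as it is read."""
    for fields in csv.reader(text_file):
        if len(fields) != len(SCHEMA):
            raise ValueError(f"expected {len(SCHEMA)} fields, got {len(fields)}")
        yield {name: cast(value) for (name, cast), value in zip(SCHEMA, fields)}


# Made-up sample data standing in for the delimited text file.
SAMPLE = io.StringIO("MA,02101,1,120,30,60,200\nTX,73301,2,90,45,50,180\n")
rows = list(read_rows(SAMPLE))
```

Keeping `zipcode` as a string, as the statement does, preserves the leading zero in `02101`, which an integer type would drop.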


Page 12: NoSQL and Data Modeling for Data Modelers

Sample JSON/MongoDB Notation


Page 13: NoSQL and Data Modeling for Data Modelers

Sample FoundationDB Statement


Page 14: NoSQL and Data Modeling for Data Modelers

Sample Cassandra Statement


Page 15: NoSQL and Data Modeling for Data Modelers

Sample Vertica Statement


Page 16: NoSQL and Data Modeling for Data Modelers

Sample Neo4j Statement


Page 17: NoSQL and Data Modeling for Data Modelers

Those weren’t SCHEMALESS….

They had data facts, which had meanings. And sometimes expected formats, precisions, and types.

In the NoSQL world, we don’t necessarily apply those at write time, but at read time.

SCHEMALESS really is MULTIPLE SCHEMAs (Polyschematic) or VARYING SCHEMAs.
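A small sketch of what "multiple schemas" can look like in practice: records are stored in whatever shape they arrived in, and the reader applies the names, types, and formats it expects. The records, field names, and `read_person` function are hypothetical examples, not from the talk.

```python
import json

# Hypothetical raw records written without an enforced schema; shapes vary.
RAW = [
    '{"id": 1, "name": "Ada Lovelace", "zip": "02101"}',
    '{"id": "2", "first": "Grace", "last": "Hopper"}',
    '{"id": 3, "name": "Edgar Codd", "postal_code": 95120}',
]


def read_person(line):
    """Apply one reader's schema to a stored record, whatever its shape."""
    doc = json.loads(line)
    if "name" in doc:
        name = doc["name"]
    else:
        name = f'{doc["first"]} {doc["last"]}'
    postal = doc.get("zip", doc.get("postal_code"))
    return {
        "id": int(doc["id"]),                       # type applied at read time
        "name": name,                               # two stored shapes, one meaning
        "postal_code": None if postal is None else str(postal),
    }


people = [read_person(line) for line in RAW]
```

Each stored record still has a schema; the three records simply have three different ones, and the meaning of the data has to be known by whoever writes `read_person`.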


Page 18: NoSQL and Data Modeling for Data Modelers

The Big Data Big Lies

Schemaless

• Schema on Read, not Schema on Write

• Polyschematic

Big

• New data stories

• New technologies

• Not just volume


Page 19: NoSQL and Data Modeling for Data Modelers

10 Tips for Modeling in a Hybrid World

1. Models require a modeler

2. Data modeling tools are essential

3. There are many types of data models: know which ones you need

4. Modeling does not have to happen at the same time in every project. It should happen at the right time

5. Modeling is not just schema design. Think outside the boxes and lines


Page 20: NoSQL and Data Modeling for Data Modelers

10 Tips for Modeling in a Hybrid World

6. A data model is much more than a diagram

7. You will need training.

8. Team members may not understand modeling. They will need training

9. NoSQL is not one thing. Learn many patterns

10. Modern data architectures are likely hybrid solutions. You can’t just support one part.


Page 21: NoSQL and Data Modeling for Data Modelers

What does this mean for data modelers?

There will be jobs for traditional, ERD, relational modelers….

….just like there are still jobs for RPG and COBOL programmers

All data has a data story. Many data stories.

A good modeler is an architect at heart, finding the right solution for the data story.


Page 22: NoSQL and Data Modeling for Data Modelers

Business Intelligence Journal

Look for the September 2014 issue article on Modern Data Architectures


Page 23: NoSQL and Data Modeling for Data Modelers

Thank You!

www.infoadvisors.com

www.datamodel.com

www.dataversity.net

community.embarcadero.com

#TEAMDATA
