NoSQL – Data Center Centric Application Enablement

Data Centers and NoSQL Database Robert Greene Product Management

Upload: dataversity

Post on 20-Aug-2015


Category:

Technology



TRANSCRIPT

Data Centers and NoSQL Database

Robert Greene – Product Management

Data Center Trends and Business Drivers

Key Tech Requirements

The Data Center Aware Application

3 NoSQL Data Center Architectures

Data Locality and Reliability Isolation

Data Consistency Considerations

NoSQL Data Center Implementation

Summary - NoSQL & Data Centers

NoSQL - Enabling Data Center Applications

Data Centers - Trends and Drivers

Trends – More deployments

IDC survey (revenue > $1B or employees > 5,000)

• 26% have 6 or more data centers

Cloud deployments

• SMBs going global

• Consumer facing

• Terabytes of Data

• Proximity and Revenue

• Requirements

• Security

• Regulatory

• Availability

Data Centers - Trends and Drivers

Business Drivers

• Cost Optimization

• SaaS and Desktop Virtualization

• Globalized Sensor Data and M2M

• Big Data – Physics of IT

• Revenue

• Latency and Availability

Data Centers - Trends and Drivers

Latency reduction

• Amazon study

• 100 ms of added latency = 1% of revenue

• 150M dollars

• SMBs seeing more data

• Physics

• More data = more time

• Longer distance = more time
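The two "physics" bullets can be put into rough numbers. The sketch below is an illustration only: it assumes light in optical fiber travels at about two thirds of its vacuum speed (roughly 200,000 km/s), and the distances, payload size, and link speed are hypothetical values, not figures from the talk.

```python
# Rough lower bounds on the time cost of distance and of data volume.
# Assumption: signal speed in fiber is about 200,000 km/s. Real routes are
# longer than the straight-line distance and add switching and queuing delay.

SPEED_IN_FIBER_KM_PER_S = 200_000.0


def min_round_trip_ms(distance_km: float) -> float:
    """Minimum round-trip time in milliseconds over fiber of this length."""
    one_way_s = distance_km / SPEED_IN_FIBER_KM_PER_S
    return 2 * one_way_s * 1000.0


def transfer_time_s(payload_mb: float, bandwidth_mbps: float) -> float:
    """Time in seconds to push a payload (megabytes) through a link (megabits/s)."""
    return payload_mb * 8 / bandwidth_mbps


if __name__ == "__main__":
    for km in (100, 1_000, 6_000):  # hypothetical distances between sites
        print(f"{km:>5} km: at least {min_round_trip_ms(km):.1f} ms round trip")
    print(f"100 MB over 1 Gbit/s: {transfer_time_s(100, 1000):.1f} s")
```

Under these assumptions no amount of tuning gets a request and its reply across 6,000 km in less than tens of milliseconds, which is why placing data near its users matters.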

Key Technical Requirements

Availability

• 2003 New York Blackout

• Dec 2012 AWS Outage

• Regulatory Mandates

Key Technical Requirements

Process & Data Redundancy

• Specialized Needs

• Provisioning

• Placement

• Communications

• Encryption

• Security

• Data Availability

• Upgrade processing

• Failover processes

The Data Center Aware Application

3 Prevalent Architectures

• Container

• Component

• Tag

NoSQL Data Center Architectures

[Diagram: Data, Replica, and Replica spread across Data Center 1, Data Center 2, and Data Center 3]

NoSQL Data Center Deployment Architectures

Container Based

[Diagram: Node; Data, Replica, Replica (shown twice); Durability & Consistency (shown twice); Replication; More Sites. Side labels: Hardware, Unit of Replication, Process]

Strengths

• Allows Local Writes

• Simple Replication Admin

• Good Global Availability

Weaknesses

• Many Data Copies

• High Bandwidth Replication

• No Consistent Reads

NoSQL Data Center Deployment Architectures

Component Based

[Diagram: Data, Replica, Replica; Durability & Consistency; Sync Channel. Side labels: Hardware, Unit of Replication, Process]

Strengths

• Minimal Data Copies

• Low Bandwidth Replication

• Good Regional Availability

• Simple Admin

Weaknesses

• Network Latency Sensitive

• Data placement

NoSQL Data Center Deployment Architectures

Tag Based

[Diagram: Node; Data, Replica, Replica (shown three times); TAG_A (shown three times); Durability & Consistency (shown twice); Sync Channel. Side labels: Hardware, Unit of Replication, Process]

Strengths

• Complete Control

• Targeted Data

Weaknesses

• Coding Specific

• Brittle to system change

• Complex management

• Non-optimal data copies

Hybrid Architectures

• Component (local reliability)

• Container (multi channel)

Cloud Architectures

• Model Physical Infrastructure

• Geographic Regions

• Zoned local isolation points

• Power & Network only

NoSQL Data Center Architectures

Physical - Points of Failure Isolation

• Disk, Server, Rack, Power, Network Switches

Latency Placement Considerations

Data Locality and Reliability Isolation

[Diagram: a Region / Data Center split into Zone 1, Zone 2, and Zone 3, with D and R markers placed across the zones. Labels: Net Switches, Racks, Servers, Disks, Power; Low Latency; High/Low Latency]
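Failure isolation comes down to never letting two copies of the same data share a point of failure. The sketch below shows one simple way to express that rule, assuming the zone is the isolation boundary. The `Server` type, the `place_replicas` function, and the server names are hypothetical and do not come from any product discussed here.

```python
from typing import Dict, List, NamedTuple


class Server(NamedTuple):
    name: str
    zone: str  # isolation point: its own power and network
    rack: str


def place_replicas(servers: List[Server], copies: int) -> List[Server]:
    """Pick `copies` servers so that no two of them share a zone.

    Raises ValueError when there are fewer zones than requested copies.
    """
    first_in_zone: Dict[str, Server] = {}
    for server in servers:
        # Keep the first server seen in each zone as that zone's candidate.
        first_in_zone.setdefault(server.zone, server)
    if len(first_in_zone) < copies:
        raise ValueError("not enough zones to isolate every copy")
    # Sort zone names so the result is deterministic.
    return [first_in_zone[zone] for zone in sorted(first_in_zone)][:copies]


if __name__ == "__main__":
    fleet = [
        Server("s1", "zone1", "rackA"),
        Server("s2", "zone1", "rackB"),
        Server("s3", "zone2", "rackA"),
        Server("s4", "zone3", "rackA"),
    ]
    for chosen in place_replicas(fleet, copies=3):
        print(chosen.name, chosen.zone)
```

A real placement policy would also weigh latency between zones and spread load across servers. This sketch only enforces the isolation constraint.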

Multiple locations need to be kept in sync

Latency – Read Consistency across Data Centers

• Consistency based on eventually consistent processes

• How to ensure you have the latest data if needed

• How to read locally to keep latency low

Throughput – Write Durability across Data Centers

• Durability based on write copies, eventually everywhere

• How to get copies written without high latency

• How to resolve conflicting updates

Techniques: quorum voting, vector clocks, timestamps
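Of the techniques listed, vector clocks are the one that can tell a newer update apart from a truly conflicting one. The sketch below is a generic, minimal version of the idea, not the implementation of any specific database. The counter names `dc1` and `dc2` are hypothetical.

```python
from typing import Dict

# One counter per writer (here: per data center). Missing entries count as 0.
VectorClock = Dict[str, int]


def compare(a: VectorClock, b: VectorClock) -> str:
    """Return how `a` relates to `b`: 'equal', 'before', 'after', or 'concurrent'."""
    keys = set(a) | set(b)
    a_le_b = all(a.get(k, 0) <= b.get(k, 0) for k in keys)
    b_le_a = all(b.get(k, 0) <= a.get(k, 0) for k in keys)
    if a_le_b and b_le_a:
        return "equal"
    if a_le_b:
        return "before"  # b has seen everything a has, so b supersedes a
    if b_le_a:
        return "after"
    return "concurrent"  # neither saw the other: a real conflict


def merge(a: VectorClock, b: VectorClock) -> VectorClock:
    """Clock to attach to a value once a conflict has been resolved."""
    return {k: max(a.get(k, 0), b.get(k, 0)) for k in set(a) | set(b)}


if __name__ == "__main__":
    print(compare({"dc1": 1}, {"dc1": 2}))
    print(compare({"dc1": 2}, {"dc1": 1, "dc2": 1}))
```

When the result is "concurrent", the store cannot pick a winner on its own. It either hands both versions to the application or falls back to a rule such as comparing timestamps.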

Data Consistency Considerations

Container Architecture

• Reads and writes always local to the container

• No option to ensure consistent data

• Sync involves a key-differencing scheme (e.g. Merkle tree)
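A Merkle tree lets two containers find out which keys differ without shipping every key across the wire. The sketch below is a simplified illustration under stated assumptions: keys are hashed into a fixed number of buckets (which must be a power of two), and all function names are hypothetical rather than taken from any product.

```python
import hashlib
from typing import Dict, List

Tree = List[List[bytes]]  # levels, from leaves (index 0) up to the root (last)


def _h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def bucket_of(key: str, buckets: int) -> int:
    """Stable bucket index derived from the key's own hash."""
    return int.from_bytes(_h(key.encode())[:4], "big") % buckets


def build_tree(store: Dict[str, str], buckets: int = 8) -> Tree:
    """Hash each bucket's contents into a leaf, then hash pairs up to a root.

    Assumption: `buckets` is a power of two so every level pairs up evenly.
    """
    contents: List[List[str]] = [[] for _ in range(buckets)]
    for key in sorted(store):
        contents[bucket_of(key, buckets)].append(f"{key}={store[key]}")
    levels: Tree = [[_h("\n".join(items).encode()) for items in contents]]
    while len(levels[-1]) > 1:
        prev = levels[-1]
        levels.append([_h(prev[i] + prev[i + 1]) for i in range(0, len(prev), 2)])
    return levels


def differing_buckets(a: Tree, b: Tree) -> List[int]:
    """Walk both trees from the root, descending only into unequal subtrees."""
    top = len(a) - 1
    if a[top][0] == b[top][0]:
        return []  # identical roots: nothing to sync
    suspects = [0]
    for level in range(top - 1, -1, -1):
        suspects = [
            child
            for node in suspects
            for child in (2 * node, 2 * node + 1)
            if a[level][child] != b[level][child]
        ]
    return suspects


if __name__ == "__main__":
    site_a = {f"k{i}": "v" for i in range(50)}
    site_b = dict(site_a, k7="changed")
    print(differing_buckets(build_tree(site_a), build_tree(site_b)))
```

Only the keys in the reported buckets need to be exchanged. When the roots match, a single hash comparison confirms the two sites are in sync.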

Component Architecture

• Reads and writes optionally to local component

• Can request consistent operation (possible latency cost)

• Sync involves log-ordered sequencing (no differencing)

Tagged Architecture

• Reads and writes explicit at the data level

• Can request consistent operation (possible latency cost)

• Sync is application code dependent

Data Consistency Considerations

Review of a few well known vendors

• Cassandra

• HBase

• MongoDB

• OnDB (Oracle NoSQL DB)

• Riak

NoSQL Data Center Implementation

Cassandra Data Center Implementation

Data Center Architecture: Container

Replication Unit: Node

Reliability: Cluster

Data Copies: RF x Data Centers

Durability: One, Local_Quorum

Consistency: Timestamp

Placement: Multiple copies per Data Center
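"Consistency: Timestamp" refers to last-write-wins reconciliation: when replicas disagree, the version carrying the highest write timestamp is kept. The sketch below shows the idea in generic form. The `Version` type is hypothetical, and breaking ties by comparing values is a choice made for this sketch so that every replica reaches the same answer.

```python
from typing import Iterable, NamedTuple


class Version(NamedTuple):
    value: str
    timestamp_us: int  # writer-supplied timestamp in microseconds


def last_write_wins(versions: Iterable[Version]) -> Version:
    """Return the version with the highest timestamp.

    Ties are broken by the value itself so the outcome does not depend on
    the order in which replicas report their versions.
    """
    return max(versions, key=lambda v: (v.timestamp_us, v.value))


if __name__ == "__main__":
    seen = [Version("a", 100), Version("b", 250), Version("c", 200)]
    print(last_write_wins(seen).value)
```

The approach is cheap and needs no coordination, but it relies on reasonably synchronized clocks and silently discards the losing write.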

HBase Data Center Implementation

Data Center Architecture: Tag

Replication Unit: Column Family

Reliability: Cluster

Data Copies: RF x Clusters (family subset)

Durability: ACID

Consistency: Absolute

Placement: None

Extensions:

• Multi-Channel Replication (per region server)

MongoDB Data Center Implementation

Data Center Architecture: Tag

Replication Unit: Range of Collection

Reliability: Replica Set

Data Copies: RF x Tagged Replica Set (range subset)

Durability: WriteConcern

Consistency: ReadConcern

Placement: Hard Coded

Extensions:

• Read-Only Shards

Riak Data Center Implementation

Data Center Architecture: Container

Replication Unit: Cluster

Reliability: Local Cluster

Data Copies: RF x Clusters

Durability: One, Quorum, All

Consistency: Vector Clock

Placement: Multiple copies per cluster

Extensions:

• Multi-Channel Sink

• Read-Only Cluster
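The durability levels One, Quorum, and All trade write latency against how many copies must acknowledge. Combined with the matching read levels, they decide whether a read is guaranteed to overlap the latest write. The sketch below illustrates the general quorum rule R + W > N. It is not a description of Riak internals, and the function names are hypothetical.

```python
def required_acks(level: str, n: int) -> int:
    """Number of replicas that must respond for a given level with n copies."""
    levels = {"one": 1, "quorum": n // 2 + 1, "all": n}
    return levels[level]


def read_overlaps_write(n: int, read_level: str, write_level: str) -> bool:
    """True when every read set must share at least one replica with every write set."""
    return required_acks(read_level, n) + required_acks(write_level, n) > n


if __name__ == "__main__":
    n = 3  # hypothetical replication factor
    for read_level in ("one", "quorum", "all"):
        for write_level in ("one", "quorum", "all"):
            ok = read_overlaps_write(n, read_level, write_level)
            print(f"read={read_level:<6} write={write_level:<6} overlap={ok}")
```

With three copies, reading and writing at One is the fastest combination but gives no overlap, while Quorum on both sides overlaps and still tolerates one unavailable replica.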

OnDB – Oracle NoSQL DB Data Center

Data Center Architecture: Component

Replication Unit: Zone

Reliability: Zone(s)

Data Copies: RF

Durability: ACID, One, Quorum, All

Consistency: Absolute, Quorum

Placement: Copy per Zone

Extensions:

• Local Quorum

• Read-Only Zones

[Diagram: Data, Replica, Replica; Zone; Near Data Centers; Client Read; Client Write; Availability Group]

Data Center Deployments

• Increasingly Common: Cost, Revenue & Reliability Advantages

The Data Center Aware Application (latency & consistency)

Summary – NoSQL & Data Centers

Component X

Container X X

Tag X X

Near Data Centers X X X Multi Channel

X Disaster Recovery

Placement

Far Data Centers Placement X X Placement

Consistent Data X X

Inconsistent Data X X X

Data Size - Copies Low High High Low Medium

Oracle NoSQL DB Resources

• NoSQL Database Downloads

http://www.oracle.com/technetwork/products/nosqldb/downloads/index.html

• NoSQL Database Documentation

http://www.oracle.com/technetwork/products/nosqldb/documentation/index.html

• NoSQL Database Contacts

• David Segleau – Director Product Management

• Robert Greene – Senior Principal Product Manager