Post on 12-Aug-2020

©Continuent Ltd 2017

Spread the Database Love with Heterogeneous Replication

MC Brown, VP, Products

Heterogeneous Replication is NOT

• Exporting and Importing Data

• Moving to a different database platform

• One Time Exports

• ETL

Heterogeneous Replication IS

• Live, constant, low-latency movement of data

• For analytics

• For migration

• For upgrades

• For Caching

• Data/format matching

• Effective target reproduction

Know Your Databases

Not all Databases are Created Equal

• Transactional versus non-transactional

• Object Reference

• Rows

• Columns

• Documents

• Free text

• Unstructured

Is that a record, a field, a row, a column?

• Row of data?

• Collection of related tables?

• What does it look like as a document?

• What does a document look like as a row?

• Databases, tables, collections, objects, buckets…

Related Tables or Document?

Mapping DB Compatibility

 | RDBMS | Columnar Store | Document Database | Freetext/unstructured data store
RDBMS | Vendor specific only | Vendor specific only | Field mappings only | Application specific
Columnar Store | Vendor specific only | Vendor specific only | Field mappings only | Application specific
Document Database | Field mappings only | Field mappings only | Vendor specific only | Application specific
Freetext/unstructured data store | Application specific | Application specific | Application specific | Application specific

Vendor specific - i.e. unique data types

Field mappings - how we map the data

App Specific - how the data is used

Know Your Data

Hetero Replication Challenges

• Effective data replication

• nothing lost or removed

• low latency

• Automatic mapping

• Data typing

• Indexing and native use

Challenges: Data Typing

• Data types are not supported everywhere

• For some, the type does not matter

• Even if the type does matter, the format, precision, structure might be different

• Numbers, Dates, Strings, Compound Data Types all cause problems
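
The typing problem above can be pictured as a lookup plus a per-type conversion. The following is a minimal sketch only: the type names, the `TYPE_MAP` table, the `convertValue` helper and the fallback to a text type are all assumptions made for illustration, not the mapping any particular replicator uses.

```javascript
// Hypothetical mapping from MySQL-style source types to a target that
// understands only a few types. Illustrative, not a real product mapping.
const TYPE_MAP = new Map([
  ["INT", "bigint"],
  ["DECIMAL", "decimal"],
  ["VARCHAR", "text"],
  ["DATE", "timestamp"],
  ["DATETIME", "timestamp"],
]);

// Convert one value so that it fits the (assumed) target type.
function convertValue(sourceType, value) {
  // Assumption: anything without a mapping is stored as text.
  const targetType = TYPE_MAP.get(sourceType) || "text";
  if (value === null || value === undefined) {
    return { type: targetType, value: null };
  }
  const text = String(value);
  switch (targetType) {
    case "timestamp":
      // MySQL permits a zero date, which many targets reject.
      if (text.startsWith("0000-00-00")) {
        return { type: targetType, value: null };
      }
      // Same instant, different format: use "T" between date and time.
      return { type: targetType, value: text.replace(" ", "T") };
    case "bigint":
      return { type: targetType, value: Number(value) };
    default:
      // Decimals stay as strings so that no precision is lost.
      return { type: targetType, value: text };
  }
}
```

Keeping decimals as strings is a deliberate choice in this sketch: converting them to a floating-point number is one of the easiest ways to lose precision between two systems.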

Extraction/Apply Rates

• Data extraction rates vary

• Data apply rates vary

• Different solutions handle data loading at different rates

• Rows-based extraction/bulk apply

• Bulk extraction/row apply

• Non-destructive

What's the solution?

Replicator Needs

• Native, neutral format

• Ability to change, reformat, restructure information

• Standalone nature

• Two-way

• Handle impedance problem

Guess What?

• Tungsten Replicator does this

• High Performance

• Flexible storage interchange format

• Built-in filtering

• Operates standalone

• Stop and restart

• Transactionally consistent

• Open Source

Applying Data

Native is Best, Batch an Alternative

• Native:

• Applying to JDBC

• Adapt JDBC Applier to construct statement

• Or apply a record to target using API

• Batch

• Use CSV for data interchange

• Call scripts to import

How Batch Apply Works

[Diagram: the Replicator service (ora2vr) takes transactions from the master and writes them to CSV files. The CSV files are either loaded with COPY into staging tables and then moved to the base tables by a merge script (SELECT to base tables), or loaded with COPY directly into the base tables.]

How Batch Apply Works

• Works on one table at a time

• Five functions in JavaScript

• Prepare - Run when going online

• Begin - Start of transaction

• Apply - During transaction

• Commit - After transaction

• Release - When going offline
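
The five functions listed above can be sketched as plain JavaScript functions. This is only an outline of the call order: the `runtime` object, the fields on `csvinfo` (`file`, `schema`, `table`) and the `runBatch` driver are stand-ins invented here so the sketch runs on its own, and the objects a real replicator passes in may look different.

```javascript
// Stand-in for the object a replicator would supply; it only records calls.
const log = [];
const runtime = {
  exec: function (command) {
    log.push("exec: " + command);
  },
};

// Prepare: run once when the service goes online.
function prepare() {
  log.push("prepare");
}

// Begin: start of a transaction.
function begin() {
  log.push("begin");
}

// Apply: called during the transaction, once per table,
// with details of that table's CSV file (hypothetical fields).
function apply(csvinfo) {
  runtime.exec(
    "load " + csvinfo.file + " into " + csvinfo.schema + "." + csvinfo.table
  );
}

// Commit: after the transaction.
function commit() {
  log.push("commit");
}

// Release: run once when the service goes offline.
function release() {
  log.push("release");
}

// Hypothetical driver showing the order of calls for one batch.
function runBatch(csvFiles) {
  begin();
  for (const csvinfo of csvFiles) {
    apply(csvinfo); // one table at a time
  }
  commit();
}

prepare();
runBatch([
  { file: "/tmp/a.csv", schema: "test", table: "a" },
  { file: "/tmp/b.csv", schema: "test", table: "b" },
]);
release();
```

After this runs, `log` holds the calls in order: prepare, begin, one load per table, commit, release.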

During a transaction

• Copy, import, load the CSV

• Have access to column, key and transaction information

• Merge the data

• Delete and Insert, or

• Delete, Update and Insert

• Done
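
The "Delete and Insert" merge above can be sketched in memory, with a `Map` standing in for the base table and an array for the staged CSV rows. The row fields (`optype`, `seqno`, `uniqno`, `id`, `message`) follow the staging columns used elsewhere in this deck; the `mergeBatch` function and the rule that the last operation per key wins are assumptions of this sketch, not a description of the shipped merge scripts.

```javascript
// Merge staged rows into a base table (a Map keyed by id).
// optype is "D" for delete or "I" for insert; an update arrives as D then I.
function mergeBatch(baseTable, stagedRows) {
  // Step 1: delete every key that has a "D" row in this batch.
  for (const row of stagedRows) {
    if (row.optype === "D") {
      baseTable.delete(row.id);
    }
  }

  // Step 2: find the final operation for each key, in (seqno, uniqno) order.
  const ordered = [...stagedRows].sort(
    (a, b) => a.seqno - b.seqno || a.uniqno - b.uniqno
  );
  const lastOp = new Map();
  for (const row of ordered) {
    lastOp.set(row.id, row);
  }

  // Step 3: insert only the rows whose final operation is an insert.
  for (const row of lastOp.values()) {
    if (row.optype === "I") {
      baseTable.set(row.id, { id: row.id, message: row.message });
    }
  }
  return baseTable;
}

// Example: update row 1, delete row 2, insert row 3.
const base = new Map([
  [1, { id: 1, message: "old" }],
  [2, { id: 2, message: "gone soon" }],
]);
mergeBatch(base, [
  { optype: "D", seqno: 10, uniqno: 0, id: 1, message: null },
  { optype: "I", seqno: 10, uniqno: 1, id: 1, message: "new" },
  { optype: "D", seqno: 11, uniqno: 0, id: 2, message: null },
  { optype: "I", seqno: 12, uniqno: 0, id: 3, message: "hello" },
]);
```

In a real target the same three steps would typically be SQL statements against the staging and base tables; the in-memory version is only meant to make the ordering explicit.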

Case Study: Cassandra/CQL

• Load table data:

• COPY staging_tablename (optype,seqno,uniqno,id,message) from 'FILENAME'

• Delete:

• delete from sample where id in (#{deleteidlist})

• Insert:

• insert into sample ("+collist+") values ("+substlist+")
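
The three statements above are built by string substitution. A minimal sketch of that construction follows; the helper names, the `staging_sample` table name, the file path and the use of `?` placeholders for the insert values are assumptions for illustration. It also assumes numeric ids and trusted identifiers, since nothing here is escaped.

```javascript
// Build the load statement for a staging table from a CSV file.
function buildCopy(stagingTable, columns, filename) {
  return (
    "COPY " + stagingTable + " (" + columns.join(",") + ") FROM '" + filename + "'"
  );
}

// Build the delete for a list of (numeric) ids.
function buildDelete(table, ids) {
  return "DELETE FROM " + table + " WHERE id IN (" + ids.join(",") + ")";
}

// Build a parameterised insert: one "?" placeholder per column.
function buildInsert(table, columns) {
  const collist = columns.join(",");
  const substlist = columns.map(() => "?").join(",");
  return "INSERT INTO " + table + " (" + collist + ") VALUES (" + substlist + ")";
}

// Hypothetical usage for the "sample" table from the slide.
const copyCql = buildCopy(
  "staging_sample",
  ["optype", "seqno", "uniqno", "id", "message"],
  "/tmp/sample.csv"
);
const deleteCql = buildDelete("sample", [1, 2, 3]);
const insertCql = buildInsert("sample", ["id", "message"]);
```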

Filters

Filter Execution

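[Diagram: the MySQL master's binlog is read by a replicator stage (Extract, Filter, Apply) that writes the Transaction History Log. Slave replicators read it over tcp/ip, and a second stage (Extract, Filter, Apply) is fed through an in-memory queue.]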

Filter Operation

• Always get one transaction at a time

• Transaction must be processed inline

• Metadata

• Data blocks

• SQL or ROW Info

• Always returns the transaction

JS Filters

• prepare() - called when going online

• filter() - does the work

• release() - called when going offline

• Access to:

• Connection to DB

• Full Java class environment

• Bunch of utility functions
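
The three entry points above can be sketched as follows. Inside a replicator the event and its lists are Java objects; here a small `javaList` helper and a `kind` field stand in for them (both are assumptions of this sketch, as is the row-counting behaviour), so that the outline can run on its own.

```javascript
// Minimal stand-in for a java.util.List, which exposes size() and get(i).
function javaList(items) {
  return {
    size: function () { return items.length; },
    get: function (i) { return items[i]; },
  };
}

let rowsSeen; // state kept between calls to filter()

// prepare(): called when going online; set up any state.
function prepare() {
  rowsSeen = 0;
}

// filter(): called with one transaction at a time.
function filter(event) {
  const data = event.getData();
  if (data !== null) {
    for (let i = 0; i < data.size(); i++) {
      const d = data.get(i);
      // "kind" is a stand-in for checking the Java class of the data block.
      if (d.kind === "row") {
        rowsSeen += d.getRowChanges().size();
      }
      // Statement data would be inspected or rewritten here.
    }
  }
  return event; // the transaction is always handed back
}

// release(): called when going offline; drop any state.
function release() {
  rowsSeen = undefined;
}

// Hypothetical event: one SQL statement and one block with two row changes.
const mockEvent = {
  getData: function () {
    return javaList([
      { kind: "statement" },
      {
        kind: "row",
        getRowChanges: function () { return javaList([{}, {}]); },
      },
    ]);
  },
};

prepare();
const returned = filter(mockEvent);
```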

Data Structure

ReplDBMSEvent
    DBMSData: StatementData
    DBMSData: StatementData
    DBMSData: RowChangeData
        OneRowChange
        OneRowChange
        ...
    StatementData

ReplDBMSEvent
    DBMSData: RowChangeData
        OneRowChange
        OneRowChange
        ...

Get/Set Values

    for (j = 0; j < rowChanges.size(); j++) {
        oneRowChange = rowChanges.get(j);
        columns = oneRowChange.getColumnSpec();
        columnValues = oneRowChange.getColumnValues();
        for (c = 0; c < columns.size(); c++) {
            columnSpec = columns.get(c);
            type = columnSpec.getType();
            if (type == TypesDATE || type == TypesTIMESTAMP) {
                for (row = 0; row < columnValues.size(); row++) {
                    values = columnValues.get(row);
                    value = values.get(c);
                    if (value.getValue() == 0) {
                        value.setValueNull();
                    }
                }
            }
        }
    }

What you can do in a filter

• Anything

Case Study: Building a Kafka Applier

Kafka?

• Message queue/bus

• Full publish/subscribe model

• Hugely flexible

• Very practical

• High performance

• Not a database

Message Format for Data

• Embedded JSON

• CSV Row

• Encoded binary fields

• Message topic?

• Schema/table/primary key?
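
One possible answer to the questions above is sketched here: embed the row as JSON, name the topic after the schema and table, and use the primary key as the message key. Every name in it (`toMessage`, the `change` fields, the `schema_table` topic convention, the `|` key separator) is an assumption for illustration, and no Kafka client is involved; the sketch only builds the message.

```javascript
// Turn one row change into a { topic, key, value } message.
function toMessage(change) {
  // Message key: primary key values, joined if the key is composite.
  const key = change.keyColumns
    .map(function (column) { return String(change.row[column]); })
    .join("|");

  return {
    // Assumed convention: one topic per schema and table.
    topic: change.schema + "_" + change.table,
    key: key,
    // Embedded JSON: the operation, its sequence number and the row itself.
    value: JSON.stringify({
      op: change.op,
      seqno: change.seqno,
      row: change.row,
    }),
  };
}

// Hypothetical row change for an "orders" table.
const message = toMessage({
  schema: "shop",
  table: "orders",
  keyColumns: ["id"],
  op: "I",
  seqno: 1001,
  row: { id: 42, total: "19.99" },
});
```

Keying by primary key means that, with a typical key-based partitioner, all changes to one row land in the same partition and so keep their order.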

Impedance

• What happens with multi-row transactions?

• What happens when a multi-row transaction is not applied?

• Should we split data into chunks?
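
If the answer to the last question is yes, each chunk needs enough metadata for a consumer to know when the whole transaction has arrived. A minimal sketch follows, under the assumption that every chunk carries the transaction's sequence number, a fragment number and a last-fragment flag; the `splitTransaction` function and the field names are illustrative.

```javascript
// Split the rows of one transaction into chunks of at most chunkSize rows.
function splitTransaction(seqno, rows, chunkSize) {
  if (!Number.isInteger(chunkSize) || chunkSize < 1) {
    throw new RangeError("chunkSize must be a positive integer");
  }
  const chunks = [];
  for (let start = 0; start < rows.length; start += chunkSize) {
    chunks.push({
      seqno: seqno, // same for every chunk of the transaction
      fragno: chunks.length, // position of this chunk, starting at 0
      lastFrag: start + chunkSize >= rows.length, // true only on the final chunk
      rows: rows.slice(start, start + chunkSize),
    });
  }
  return chunks;
}

// Example: a five-row transaction split into chunks of two rows.
const chunks = splitTransaction(7, ["r1", "r2", "r3", "r4", "r5"], 2);
```

A consumer that sees `lastFrag` set to true for a given `seqno` knows it holds the complete transaction; one that never sees it knows the transaction was not fully applied.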

What we do already

Sources: MySQL, Oracle

Targets: MySQL, Oracle, RedShift, Vertica, Hadoop, Text, SQLite, RabbitMQ, S3, MongoDB

What are we adding?

Sources: REST API Input, MongoDB, Couchbase, CouchDB, PostgreSQL

Targets: Cassandra, Amazon Athena, Couchbase, CouchDB, ElasticSearch, Flume, Kafka, Native JDBC to Hadoop, PostgreSQL

Where Next

• github.com/continuent/tungsten-replicator

• mcb.guru
