Copyright © 2018, Oracle and/or its affiliates. All rights reserved. |

Oracle TimesTen In-Memory Database Architecture, Performance Tips, Use Cases

Chris Jenkins ([email protected]), Senior Director, In-Memory Technology

TimesTen Product Management


Safe Harbor Statement

The following is intended to outline our general product direction. It is intended for information purposes only, and may not be incorporated into any contract. It is not a commitment to deliver any material, code, or functionality, and should not be relied upon in making purchasing decisions. The development, release, and timing of any features or functionality described for Oracle’s products remains at the sole discretion of Oracle.


Best In-Memory Databases: For Both OLTP and Analytics

Oracle TimesTen In-Memory Database (In-Memory for OLTP)
• Lightweight, highly-available IMDB
• Primary use case: Extreme OLTP
• Microsecond response time
• Millions of TPS on commodity hardware

Oracle Database In-Memory Option (In-Memory for Analytics)
• Dual-Format In-Memory Database
• Primary use case: Real Time Analytics
• Billions of Rows/Sec scan rate
• Faster mixed-workload enterprise OLTP
– Fewer indexes needed to support analytics

Agenda

1. What is TimesTen?
2. TimesTen Classic Architecture
3. TimesTen Scaleout Architecture
4. Performance Tips & Tricks
5. When to use TimesTen?
6. Summary


Oracle TimesTen In-Memory Database
One product, two deployment modes

TimesTen Scaleout
• Scale-Out In-Memory Relational Database
• Highly Available
• Extremely high throughput reads and writes
• Scales both reads and writes

TimesTen Classic
• Replicated In-Memory Relational Database
• Highly Available
• Extremely low latency reads and writes
• Read scaling across multiple hosts
• Cache Groups (cache for Oracle DB EE)

[Diagram: Scaleout is shown as a single system image in-memory database. Classic is shown as an Active database taking application reads/writes, with a Standby and Subscribers that applications can read from.]


TimesTen Classic

In-memory RDBMS
– Pure in-memory
– Entire database in RAM
– ACID compliant
– Standard SQL and APIs

Persistent and Recoverable
– Database persisted on local storage
– Automatic recovery after failure

Extremely Fast
– Low, consistent response time
– Very high throughput
– Excellent scalability

Highly Available
– Active-standby and multi-master replication
– Highly parallel, high throughput
– Async and Sync
– HA and DR

Architecture: Classic Instance

[Diagram: a Classic instance. Processes: TimesTen daemon, data store subdaemon(s), server daemon and server proxies, replication agent(s), cache agent(s), admin/utility programs. In-memory database(s): data tables, indexes, system tables (metadata, tables, indexes, views, sequences, …); locks, cursors, temp indexes, command cache; log buffer. Storage: checkpoint files and log files. Direct mode applications and tools link the TimesTen Data Manager Library; client-server applications and tools use the TT Client over the network. Cache agent(s) connect to an Oracle RDBMS; replication agent(s) connect to a TimesTen replica DB.]

• Installation
– An unzipped copy of the TimesTen software package
– Immutable

• Instance
– Created using ttInstanceCreate
– A runnable copy of the software
– Linked to an installation
– Includes configuration files
– Identified by TIMESTEN_HOME
– Set of processes
– Supports one or more databases

Application Connectivity
Two modes, functionally equivalent, supported for all APIs (ODBC, OCI, JDBC, ODP.NET, …)

Direct mode
• Apps run on same host as database
• Apps directly map database shared memory (via TT engine)
• No context switches, no IPC for database access
• Ultra low latency (in process direct memory access)

Client/server mode
• TCP/IP connections between apps and TT server processes
• TT server process is a multi-threaded direct mode app
• Each interaction involves 1 or more network round trips
• More overhead, higher latency than direct mode

[Diagram: on Host 1, App 1 … App N attach directly to the database shared memory, alongside TT Server 1 … TT Server X; on Host 2, App 1 … App N reach those TT servers over the network.]

You can mix and match these modes as desired based on your requirements.
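A rough way to see the cost difference between the two modes is to add the network round trips to the per-statement execution time. The sketch below is illustrative only: the function name and every timing figure in it are assumptions, not TimesTen measurements.

```python
def elapsed_us(statements, exec_us, round_trip_us=0, trips_per_statement=1):
    """Estimate total elapsed microseconds for a run of statements.

    Direct mode is modelled with round_trip_us=0 (no network hop);
    client/server mode adds one or more round trips per statement.
    """
    return statements * (exec_us + trips_per_statement * round_trip_us)


# Hypothetical figures: 5 us to execute a statement, 100 us per round trip.
direct = elapsed_us(1000, exec_us=5)
client_server = elapsed_us(1000, exec_us=5, round_trip_us=100)
```

With these made-up numbers the network dominates the client/server total, which is why reducing round trips matters more than tuning the statement itself.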

Memory Layout
Database is a single shared memory segment (separate much smaller segment for PL/SQL)

Database Shared Memory Segment (not to scale)
• DbHdr (~20 MB), persistent
• Perm Region (PermSize), persistent
– Tables: logical tuple pages, physical tuple pages, page directories
– Indexes: hash buckets, B-Tree nodes
– Metadata: system catalog tables, system views, sequences, system data
• Temp Region (TempSize)
– Temp objects: tables, indexes
– Sort space
– Locks
– Connections
– Commit buffers
• Log Buffer (LogBufMB)
– Strand 1, Strand 2, … Strand N

TimesTen is a row store
• At least for now…

Use of huge pages
• Recommended unless database is very small
• Mandatory on Linux if segment size >= 256 GB

If not using huge pages
• Lock segment in memory
• MemoryLock=4

Consider NUMA effects…
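The layout above suggests a back-of-envelope estimate of the segment size and the number of huge pages to reserve. This is a minimal sketch under stated assumptions: the helper names are hypothetical, the 20 MB header is the approximate figure from the slide, the 2 MB huge page size is a common Linux x86-64 default, and real sizing may include overhead not modelled here.

```python
import math


def segment_mb(perm_size_mb, temp_size_mb, log_buf_mb, header_mb=20):
    """Approximate segment size: DbHdr + Perm + Temp + Log Buffer (all in MB)."""
    return header_mb + perm_size_mb + temp_size_mb + log_buf_mb


def huge_pages_needed(seg_mb, huge_page_mb=2):
    """Number of huge pages needed to back the segment (assumed 2 MB pages)."""
    return math.ceil(seg_mb / huge_page_mb)


def huge_pages_mandatory(seg_mb):
    """Slide rule of thumb: mandatory on Linux if the segment is >= 256 GB."""
    return seg_mb >= 256 * 1024


# Hypothetical database: PermSize=4096, TempSize=1024, LogBufMB=1024.
size = segment_mb(4096, 1024, 1024)
pages = huge_pages_needed(size)
```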

Persistence - Checkpointing
Dual checkpoint files, automatic fuzzy checkpoints

Two checkpoint files, dbname.ds0 and dbname.ds1
• Each is a full image of DbHdr + Perm
• Written to alternately by checkpoint operations

Perm region is divided into pages
• Variable size based on contents
• Two bits per page track its ‘dirty’ state with respect to (wrt) each checkpoint file

Last checkpoint was to .ds1 at some earlier time T0

At time T1
• Page 0 is dirty wrt .ds0 and .ds1
• Page 2 is dirty wrt .ds0

Dirty bits at time T1
Page   .ds0   .ds1
Pg0    1      1
Pg1    0      0
Pg2    1      0
PgN    0      0

Checkpoint occurs at time T2
• .ds0 is oldest file (previous checkpoint was to .ds1)
• DbHdr + pages 0 & 2 are written to .ds0 (in place update)
• Pages 0 & 2 dirty bits for .ds0 are cleared

At time T3
• Page 1 is modified (e.g. by a DML operation)
• Both dirty bits are set in the page
• .ds0 dirty: Pg1
• .ds1 dirty: Pg0, Pg1

Checkpoint occurs at time T4
• .ds1 is oldest file
• DbHdr + pages 0 and 1 are written to .ds1
• Dirty bit for .ds1 cleared in both pages
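The T0 to T4 walkthrough can be replayed with a small model of the two dirty bits per page. This is a toy sketch, not TimesTen code: the class and method names are hypothetical, and it tracks only which pages each checkpoint would write.

```python
class DualCheckpoint:
    """Toy model of two dirty bits per page, one per checkpoint file."""

    def __init__(self, num_pages, last_file):
        # dirty[page][f] is True when the page is newer than checkpoint file f
        self.dirty = [[False, False] for _ in range(num_pages)]
        self.last_file = last_file  # file written by the previous checkpoint

    def modify(self, page):
        # A change makes the page dirty with respect to both files.
        self.dirty[page] = [True, True]

    def dirty_for(self, f):
        return [p for p, bits in enumerate(self.dirty) if bits[f]]

    def checkpoint(self):
        # Write to the older file: the one NOT used by the previous checkpoint.
        target = 1 - self.last_file
        written = self.dirty_for(target)
        for p in written:
            self.dirty[p][target] = False
        self.last_file = target
        return target, written


def replay_slide_example():
    db = DualCheckpoint(num_pages=4, last_file=1)  # T0: last checkpoint to .ds1
    db.dirty[0] = [True, True]                     # T1: Pg0 dirty wrt both
    db.dirty[2] = [True, False]                    # T1: Pg2 dirty wrt .ds0
    t2 = db.checkpoint()                           # T2: writes to .ds0
    db.modify(1)                                   # T3: Pg1 modified
    t3 = (db.dirty_for(0), db.dirty_for(1))
    t4 = db.checkpoint()                           # T4: writes to .ds1
    return t2, t3, t4
```

Each checkpoint writes only the pages that are dirty with respect to its target file, so a page changed once is written twice in total: once to each file on successive checkpoints.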

Persistence – Transaction Logging
High throughput and concurrency

Parallel log manager
• Multi-strand buffer (multiple ring buffers)
• Behaves as a single shared logical buffer

Log files
• Configurable maximum size
• Sequence number (64-bit)
• Old files purged by checkpointing

Highly concurrent
• Concurrent record post to each strand AND
• Flush buffer to disk

Asynchronous and synchronous operation
• Asynchronous (DurableCommits=0) is the default
• Synchronous (DurableCommits=1) is an option

Async/sync configurable at
• Database level
• Connection level
• Transaction level (application API)

Async
• Many records written at once
• Outside transaction path
• Decoupled from commit

Sync
• Fewer records written at once
• Part of commit
• Write through to media
• Commit blocks until I/O completion
• Group commit optimisation

Log usage
• Undo and Redo
• Replication, XLA, AWT caching & incremental backups

Log file purge criteria
• No records in file belong to an open transaction
• Changes for all records written to both checkpoint files
• Not required by replication, XLA, AWT or backups

[Diagram: in-memory database → log buffer (Strand 1 … Strand N) → current log file.]
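The three purge criteria combine as a simple conjunction: a log file can go only when all of them hold. The sketch below shows that rule; the `LogFile` record and its field names are hypothetical and do not correspond to any TimesTen structure.

```python
from dataclasses import dataclass


@dataclass
class LogFile:
    seq: int                     # log file sequence number
    has_open_txn_records: bool   # any record belongs to an open transaction
    in_both_checkpoints: bool    # all changes written to both .ds0 and .ds1
    needed_by_consumers: bool    # replication, XLA, AWT or backups still need it


def can_purge(log_file):
    """A file is purgeable only when all three criteria are satisfied."""
    return (not log_file.has_open_txn_records
            and log_file.in_both_checkpoints
            and not log_file.needed_by_consumers)


def purgeable(files):
    """Sequence numbers of the files a checkpoint could remove."""
    return [f.seq for f in files if can_purge(f)]


example_files = [
    LogFile(1, False, True, False),   # satisfies every criterion
    LogFile(2, False, True, True),    # held by a log consumer
    LogFile(3, True, False, False),   # long-running transaction still open
]
```

The second and third rows show the usual reasons logs accumulate: a lagging consumer or a transaction left open.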

Persistence – Recovery
Occurs on every ramLoad operation

Most recent checkpoint loaded into memory
• Physical operation
• Can use multiple reader threads (configurable)

Log replay
• From the corresponding checkpoint log record…
• …to end of log
• Mark any indexes that would be modified

Rollback any open transactions
• No commit marker seen

Drop and re-build marked indexes
• Index modifications are not redo logged
• Done in parallel (configurable)

Static checkpoint to oldest checkpoint file

Allow application connections
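The load, replay and rollback steps can be sketched with a toy key/value store. This is a simplified illustration, not the TimesTen algorithm: the record format is invented, index rebuild and the final static checkpoint are omitted, and it assumes no transaction was open when the checkpoint was taken.

```python
def recover(checkpoint_image, checkpoint_lsn, log):
    """Rebuild state from a checkpoint image plus the log that follows it.

    log is a list of records, indexed by position (the LSN):
      ("write", txn, key, old_value, new_value)  or  ("commit", txn)
    """
    db = dict(checkpoint_image)      # 1. load the most recent checkpoint
    undo = {}                        # txn -> [(key, old_value), ...]

    # 2. replay from the checkpoint's log position to the end of the log
    for lsn, record in enumerate(log):
        if lsn < checkpoint_lsn:
            continue                 # already reflected in the checkpoint
        if record[0] == "write":
            _, txn, key, old, new = record
            db[key] = new
            undo.setdefault(txn, []).append((key, old))
        elif record[0] == "commit":
            undo.pop(record[1], None)    # commit marker seen: nothing to undo

    # 3. roll back transactions with no commit marker, newest change first
    for writes in undo.values():
        for key, old in reversed(writes):
            if old is None:
                db.pop(key, None)    # the row did not exist before
            else:
                db[key] = old
    return db


example_log = [
    ("write", "t1", "a", 1, 2),
    ("commit", "t1"),
    ("write", "t2", "b", None, 5),   # t2 never commits
    ("write", "t2", "a", 2, 9),
]
```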

Concurrency
Single level versioning for READ COMMITTED isolation

READ COMMITTED isolation (default)
• Readers don’t block writers and vice versa
• Readers don’t place any locks

INSERT/UPDATE/DELETE
• Creates a new (uncommitted) version of affected rows
• eXclusively locks the row(s)

Reader
• Reads old (committed) version(s)

Old version(s) deleted on COMMIT or ROLLBACK
• In reclaim phase
• New version(s) becomes the only version

SERIALIZABLE isolation
• Readers place Shared row locks
• Writers place eXclusive row locks
• Optimiser may choose table locks instead
– If many rows would be locked
• Can use a hint to prevent lock escalation

[Diagram: MYTABLE (columns pKey, columnX) holds rows 0 to 6, "This is row 0" … "This is row 6". An UPDATE of row 4 creates the uncommitted version "New row 4" while a concurrent READER still sees the committed "This is row 4".]
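The READ COMMITTED behaviour can be modelled with one committed version per row plus at most one uncommitted version owned by the writer. This is a toy sketch with hypothetical names, not how TimesTen stores row versions.

```python
class VersionedTable:
    """Single-level versioning: committed value + optional pending value."""

    def __init__(self, rows):
        self.committed = dict(rows)
        self.pending = {}            # key -> (owning txn, uncommitted value)

    def read(self, key, txn=None):
        # Readers take no locks; other transactions see the committed version.
        owner, value = self.pending.get(key, (None, None))
        if txn is not None and owner == txn:
            return value             # a writer sees its own change
        return self.committed[key]

    def update(self, txn, key, value):
        owner = self.pending.get(key, (txn, None))[0]
        if owner != txn:
            raise RuntimeError("row is exclusively locked by " + owner)
        self.pending[key] = (txn, value)

    def commit(self, txn):
        for key in [k for k, (t, _) in self.pending.items() if t == txn]:
            # The new version becomes the only version.
            self.committed[key] = self.pending.pop(key)[1]

    def rollback(self, txn):
        for key in [k for k, (t, _) in self.pending.items() if t == txn]:
            del self.pending[key]    # discard the uncommitted version
```

Usage mirrors the slide: after `update("tx1", 4, "New row 4")`, a plain `read(4)` still returns "This is row 4" until `commit("tx1")`.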

Indexing
Hash Indexes, Lockless B-Trees

Hash Indexes
• Best performance for
– Full key equality lookups
– Full key equijoins
• Can’t be used for
– Range lookups
– Key prefix lookups
• Must be sized accurately
– Too small: poor performance and concurrency
– Too large: wastes memory

CREATE [UNIQUE] HASH INDEX MyHashIndex
ON MyTable (somecol) PAGES = CURRENT;

Memory optimized B-Tree Indexes
• Default index type
• Good all-round performance
• Self balancing
• Lockless design for high concurrency
– No locks or latches for reads
– Fine grained latching for writes

CREATE [UNIQUE] INDEX MyRangeIndex
ON MyTable (somecol);
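Why hash index sizing matters can be shown with a generic chained hash table: too few buckets means long chains to search, too many means mostly empty buckets. The sketch below is a textbook model with a hypothetical helper name; it says nothing about TimesTen's internal hash page layout or how PAGES maps to buckets.

```python
from collections import Counter


def chain_stats(keys, buckets):
    """Return (longest chain, buckets in use) for keys hashed into buckets."""
    chains = Counter(hash(k) % buckets for k in keys)
    return max(chains.values()), len(chains)


rows = range(10_000)
too_small = chain_stats(rows, 100)         # every lookup scans a long chain
well_sized = chain_stats(rows, 10_000)     # about one row per bucket
too_large = chain_stats(rows, 1_000_000)   # 99% of buckets never used
```

Integer keys hash to themselves in Python, so the three results are deterministic.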


Single-Image Database with High Availability and Elasticity
Distributed, Shared Nothing, In-Memory Database

• Appears to applications as a single database
– Not as a sharded database
• Online scale-out and scale-in
– Data automatically redistributed
– Workload automatically uses new elements
• Built-in HA via multiple fully-active copies
– Copies automatically kept in sync
– Automatic client failover
• Parallel SQL execution
• Same features as TimesTen Classic
– Mostly…

[Diagram: elements A, B, C, D and their copies A’, B’, C’, D’.]

Scaleout Table Data Distribution Options

• DISTRIBUTE large tables by consistent hash
– Distribute CUSTOMER rows across all elements by hash of Customer ID
• COLOCATE child table rows with parent table row to maximize locality
– Place ORDERS rows in same element along with corresponding CUSTOMER row
• DUPLICATE small read-mostly tables on all elements for maximum locality
– Duplicate the PRODUCT list on all elements

[Diagram: servers hosting Element 1 … Element N. Each element holds its distributed share of CUST, the colocated ORDERS rows, and a duplicate copy of PRODUCTS.]
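The three placement options can be sketched in a few lines. Note the substitution: this toy uses a plain CRC32 modulo the element count, not the consistent hash that Scaleout uses, and every name in it is hypothetical.

```python
import zlib


def element_for(key, num_elements):
    """Pick an element from a key (simple modulo hash, for illustration only)."""
    return zlib.crc32(str(key).encode()) % num_elements


def place_rows(customers, orders, products, num_elements):
    """customers: ids; orders: (order_id, customer_id) pairs; products: names."""
    elements = [
        {"CUST": [], "ORDERS": [], "PRODUCTS": list(products)}  # DUPLICATE
        for _ in range(num_elements)
    ]
    for cust_id in customers:                                   # DISTRIBUTE
        elements[element_for(cust_id, num_elements)]["CUST"].append(cust_id)
    for order_id, cust_id in orders:                            # COLOCATE
        # Hash the parent key so the order lands next to its customer.
        elements[element_for(cust_id, num_elements)]["ORDERS"].append(
            (order_id, cust_id))
    return elements


grid = place_rows(
    customers=range(1, 21),
    orders=[(1000 + n, 1 + n % 20) for n in range(60)],
    products=["widget", "gadget"],
    num_elements=4,
)
```

Because orders hash on the customer ID, a join between a customer and its orders never has to leave the element, and the product list is readable locally everywhere.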

TimesTen Scaleout Architecture Overview

[Diagram: data instances RS1_DSG1 … RS4_DSG2 (four pairs, one instance per pair in each of two columns) exchange 2PC traffic over the internal network; SQL arrives over the external network. Management instances MGMT1 and MGMT2 are labelled with SSH & SCP, the repository storage host REPO1 with MOUNT / SCP, and ZooKeeper with membership management.]

Architecture: Scaleout Instances

[Diagram: a data instance. Processes: TimesTen daemon, data store subdaemon(s), server daemon and server proxies, replication agent(s), grid workers. Database element(s): data tables, indexes, system tables (metadata, tables, indexes, views, sequences, …); locks, cursors, temp indexes, command cache; log buffer. Storage: checkpoint files, log files, epoch files. Direct mode applications and tools link the TimesTen Data Manager Library; client-server applications and tools use the TT Client. The instance is connected over the network to other data instances and to the management instance(s).]

• Management Instance
– Stores grid model
– Processes management functions
– Very similar to a ‘classic’ instance
– 1 or 2 per Scaleout grid

• Data Instance
– Hosts database elements
– Processes SQL and Txns
– As many as you want

High Availability
K-safety, Synchronous, All Replicas Active

• Hosts assigned to Data Space Groups
– DSG = rack, Cloud AD etc.
• Replica sets created automatically
– One replica per DSG
• All replicas are active for reads and writes
– All replicas are ‘equal’
• Always consistent
– 2 phase commit (synchronous)
– Reduced 2PC durable writes when K > 1
• Queries and transactions can span all replica sets
– Grid aware optimizer
– Grid aware SQL engine
– Sophisticated transaction manager

[Diagram: Data Space Group 1 holds A, B, C, D; Data Space Group 2 holds A’, B’, C’, D’. Each pair (A/A’ … D/D’) forms Replica Set 1 to Replica Set 4.]
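The "one replica per DSG" rule can be sketched by pairing up the members of each data space group. This is a hypothetical illustration: the function names are invented, the instance labels simply follow an RSn_DSGm pattern, and the real grid forms replica sets automatically.

```python
def build_replica_sets(instances_by_dsg):
    """Form replica sets by taking one instance from each data space group."""
    groups = list(instances_by_dsg.values())
    if len({len(g) for g in groups}) != 1:
        raise ValueError("every data space group needs the same number of instances")
    return [tuple(members) for members in zip(*groups)]


def database_available(replica_sets, failed):
    """All data stays reachable while each replica set keeps one live member."""
    return all(any(m not in failed for m in rs) for rs in replica_sets)


dsgs = {
    "DSG1": ["RS1_DSG1", "RS2_DSG1", "RS3_DSG1", "RS4_DSG1"],
    "DSG2": ["RS1_DSG2", "RS2_DSG2", "RS3_DSG2", "RS4_DSG2"],
}
replica_sets = build_replica_sets(dsgs)
k = len(dsgs)   # copies of each row: one per data space group
```

Losing one member in several different replica sets is survivable; losing both members of the same replica set is not.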

Elastic Scalability
Expand and shrink the database online, based on business needs

• Adding replica sets
– Prepare
  • Deploy new hosts/installations/instances
  • Typically one command per host
– Redistribute data when ready
  • One command
– Workload uses the new elements
  • Connections start to use new elements
• Removing replica sets
– Redistribute data
– Tear down old hosts

[Diagram: Data Space Group 1 and Data Space Group 2 hold Replica Set 1 to Replica Set 4 (A/A’ … D/D’); Replica Set 5 (E/E’) is added.]


Top Performance Tips for TimesTen
General

• Optimise data types
– Use native integer types where appropriate (TT_TINYINT, TT_SMALLINT, …)
– Inline versus out of line for variable length columns (VARCHAR2, NVARCHAR2, VARBINARY)
• Indexes and optimiser statistics
– Hash versus range indexes – use the index advisor!
– Keep optimizer stats up to date
• Optimise OS configuration
– Shared memory, semaphores
– Huge pages, locked memory
• Optimise database configuration
– Configure database parameters depending on workload, hardware etc.
– If using HDD storage, separate checkpoint files and transaction logs to avoid I/O contention
– Use huge pages (or lock database in memory if can’t use huge pages)

Top Performance Tips for TimesTen
Scaleout

• Use the fastest, lowest latency network you can get
– 10 GbE is absolute minimum
– Faster is better!
• Choose hardware wisely
– Fewer, larger hosts better than many small hosts
  • Fewer network hops
  • Leverages TimesTen’s excellent vertical scalability
• Optimise table distributions
– Distribution type, distribution keys (hash)
• Global versus local indexes
– Faster access to data
– Slower DML
– Tradeoffs!

Top Performance Tips for TimesTen
Applications

• Use parameterized SQL
– Facilitates use of (shared) prepared statements
• Prepare once, execute many times
– Hard parse >> soft parse >>>> no parse
• Bind once (ODBC and OCI)
• Minimise type conversions and character set conversions
• Leverage batch operations, especially for INSERT
– Multiple of 256 rows => fast path insert
• Prefer direct mode connectivity where you can
• When using client/server connectivity
– Use PL/SQL to reduce network round trips
– Use OCI ‘commit on success’ option where appropriate
• Keep transactions small/short
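Several of these tips (parameterized SQL, batching in multiples of 256 rows, short transactions) can be shown together. The sketch below uses Python's built-in sqlite3 module purely as a stand-in so that it runs anywhere; it is not a TimesTen driver, and the table, column and function names are hypothetical.

```python
import sqlite3
from itertools import islice

BATCH_ROWS = 256   # the slide's fast-path hint: batch in multiples of 256 rows


def batches(rows, size=BATCH_ROWS):
    """Yield successive lists of at most `size` rows."""
    it = iter(rows)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk


def load_orders(conn, rows):
    """Insert rows with one parameterized statement, one batch per transaction."""
    sql = "INSERT INTO orders (id, amount) VALUES (?, ?)"  # no literals in the SQL
    count = 0
    for chunk in batches(rows):
        conn.executemany(sql, chunk)   # same statement text, many parameter sets
        conn.commit()                  # keep each transaction small
        count += 1
    return count


def demo():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, amount INTEGER)")
    n_batches = load_orders(conn, ((i, i * 10) for i in range(1000)))
    n_rows = conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
    conn.close()
    return n_batches, n_rows
```

The same shape carries over to ODBC, OCI or JDBC: keep the statement text constant, bind parameters, and send rows in batches instead of one at a time.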


When to use TimesTen? Classic or Scaleout?
Primarily for OLTP, Scaleout useful for Analytics too

TimesTen Classic
• Very low, consistent latency
– Microsecond response times
• High throughput
– Millions of TPS on a commodity server
– 10s of millions of queries per second
• Single server or replicated
– Optional read-only subscribers
– Vertical scalability for writes
– Vertical and horizontal for reads
• Highly available (99.999% possible)
• Cache functionality for Oracle DB EE
– Read-only and read-write caching

TimesTen Scaleout
• Good latency
– Low millisecond response times
• Very high throughput
– 100s of millions of TPS
– Billions of queries per second
• Vertical and horizontal scalability
– For both reads and writes
• Easy HA (99.999% possible)
• Elastic scalability
– Add or remove database elements online


Summary

• TimesTen is a sophisticated relational In-Memory Database

• Two deployment modes: Classic and Scaleout

• Focus is primarily on OLTP type workloads

– Ultra low latency – Classic

– Massive throughput – Scaleout

• Standard SQL, PL/SQL and APIs

• Fully ACID

• Highly available

More Info…

• TimesTen OTN Portal (http://www.oracle.com/technetwork/database/database-technologies/timesten/overview/index.html)
– Product Information: presentations, use cases, whitepapers, FAQs, …
– Software Downloads
– Product Documentation
– Scaleout Demo / Learning VM download
• TimesTen GitHub Quickstart and Samples (https://github.com/oracle/oracle-timesten-samples)
• Contact me! ([email protected])
