Copyright © 2018, Oracle and/or its affiliates. All rights reserved. |
Oracle TimesTen In-Memory Database Architecture, Performance Tips, Use Cases
Chris Jenkins ([email protected]), Senior Director, In-Memory Technology
TimesTen Product Management
Safe Harbor Statement
The following is intended to outline our general product direction. It is intended for information purposes only, and may not be incorporated into any contract. It is not a commitment to deliver any material, code, or functionality, and should not be relied upon in making purchasing decisions. The development, release, and timing of any features or functionality described for Oracle’s products remains at the sole discretion of Oracle.
Best In-Memory Databases: For Both OLTP and Analytics
In-Memory for OLTP: Oracle TimesTen In-Memory Database
• Lightweight, highly-available IMDB
• Primary use case: Extreme OLTP
• Microsecond response time
• Millions of TPS on commodity hardware
In-Memory for Analytics: Oracle Database In-Memory Option
• Dual-Format In-Memory Database
• Primary use case: Real Time Analytics
• Billions of Rows/Sec scan rate
• Faster mixed-workload enterprise OLTP
– Fewer indexes needed to support analytics
Agenda
1. What is TimesTen?
2. TimesTen Classic Architecture
3. TimesTen Scaleout Architecture
4. Performance Tips & Tricks
5. When to use TimesTen?
6. Summary
Oracle TimesTen In-Memory Database
One product, two deployment modes
TimesTen Classic
• Replicated In-Memory Relational Database
• Highly Available
• Extremely low latency reads and writes
• Read scaling across multiple hosts
• Cache Groups (cache for Oracle DB EE)
[Diagram: applications read from and write to the Active database; applications can also read from the Standby and from the Subscribers]
TimesTen Scaleout
• Scale-Out In-Memory Relational Database
• Highly Available
• Extremely high throughput reads and writes
• Scales both reads and writes
[Diagram: Single System Image In-Memory Database]
TimesTen Classic
In-memory RDBMS
– Pure in-memory
– Entire database in RAM
– ACID compliant
– Standard SQL and APIs
Persistent and Recoverable
– Database persisted on local storage
– Automatic recovery after failure
Extremely Fast
– Low, consistent response time
– Very high throughput
– Excellent scalability
Highly Available
– Active-standby and multi-master replication
– Highly parallel, high throughput
– Async and Sync
– HA and DR
Architecture: Classic Instance
[Diagram: a TimesTen Classic instance. Direct mode applications and tools (application code linked with the TimesTen Data Manager library) access the in-memory database(s) directly; client-server applications and tools (application plus TT Client) connect over the network to the server daemon and its server proxies. Access paths are labelled "Millionths of a Second". The instance also contains the TimesTen daemon, data store subdaemon(s), admin/utility programs, log files, checkpoint files, cache agent(s) connected to an Oracle RDBMS, and replication agent(s) connected to a TimesTen replica DB. Inside the in-memory database: data tables, indexes and system tables; locks, cursors, temp indexes, command cache, …; the log buffer; metadata, tables, indexes, views, sequences, …]
• Installation
– An unzipped copy of the TimesTen software package
– Immutable
• Instance
– Created using ttInstanceCreate
– A runnable copy of the software
– Linked to an installation
– Includes configuration files
– Identified by TIMESTEN_HOME
– Set of processes
– Supports one or more databases
Application Connectivity
Two modes, functionally equivalent, supported for all APIs (ODBC, OCI, JDBC, ODP.NET, …)
Direct mode
• Apps run on same host as database
• Apps directly map database shared memory (via TT engine)
• No context switches, no IPC for database access
• Ultra low latency (in process direct memory access)
[Diagram: App 1 … App N on Host 1 attached to the database shared memory]
Client/server mode
• TCP/IP connections between apps and TT server processes
• TT server process is a multi-threaded direct mode app
• Each interaction involves 1 or more network round trips
• More overhead, higher latency than direct mode
[Diagram: App 1 … App N on Host 2 connected over the network to TT Server 1 … TT Server X on Host 1]
You can mix and match these modes as desired based on your requirements.
Memory Layout
Database is a single shared memory segment (separate, much smaller segment for PL/SQL)
TimesTen is a row store
• At least for now…
Use of huge pages
• Recommended unless database is very small
• Mandatory on Linux if segment size >= 256 GB
If not using huge pages
• Lock segment in memory
• MemoryLock=4
Consider NUMA effects…
[Diagram: Database Shared Memory Segment, not to scale; part of the segment is marked Persistent]
DbHdr (~20 MB)
Perm Region (PermSize)
• Tables: logical tuple pages, physical tuple pages, page directories
• Indexes: hash buckets, B-Tree nodes
• Metadata: system catalog tables, system views, sequences, system data
Temp Region (TempSize)
• Temp objects: tables, indexes
• Sort space
• Locks
• Connections
• Commit buffers
• …
Log Buffer (LogBufMB)
• Strand 1, Strand 2, …, Strand N
Persistence - Checkpointing
Dual checkpoint files, automatic fuzzy checkpoints
Two checkpoint files, dbname.ds0 and dbname.ds1
• Each is a full image of DbHdr + Perm
• Written to alternately by checkpoint operations
Perm region is divided into pages
• Variable size based on contents
• Two bits to track ‘dirty’ state wrt checkpoint files
Last checkpoint was to .ds1 at some earlier time T0
At time T1
• Page 0 is dirty wrt .ds0 and .ds1
• Page 2 is dirty wrt .ds0
Checkpoint occurs at time T2
• .ds0 is oldest file (previous checkpoint was to .ds1)
• DbHdr + pages 0 & 2 are written to .ds0 (in place update)
• Pages 0 & 2 dirty bits for .ds0 are cleared
At time T3
• Page 1 is modified (e.g. by a DML operation)
• Both dirty bits are set in the page
• .ds0 dirty: Pg1
• .ds1 dirty: Pg0, Pg1
Checkpoint occurs at time T4
• .ds1 is oldest file
• DbHdr + pages 0 and 1 are written to .ds1
• Dirty bit for .ds1 cleared in both pages
[Diagram: Perm Region pages with their (.ds0, .ds1) dirty bits at time T1: Pg0 = 1 1, Pg1 = 0 0, Pg2 = 1 0, …, PgN = 0 0]
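The T0 to T4 walkthrough can be replayed with a small toy model. This is only a sketch of the dirty-bit bookkeeping described on the slide, not TimesTen code: the class `DualCheckpoint`, its methods, and the choice of four pages are hypothetical, and the write to page 2 before T0 is an assumption made so that the model reaches the T1 state shown.

```python
class DualCheckpoint:
    """Toy model: each page keeps one dirty bit per checkpoint file."""

    def __init__(self, num_pages, last_written):
        # dirty[p] = [dirty wrt .ds0, dirty wrt .ds1]
        self.dirty = [[False, False] for _ in range(num_pages)]
        # Index (0 or 1) of the file written by the most recent checkpoint.
        self.last_written = last_written

    def modify(self, page):
        # A change makes the page stale in both checkpoint files.
        self.dirty[page] = [True, True]

    def dirty_pages(self, file_idx):
        return [p for p, bits in enumerate(self.dirty) if bits[file_idx]]

    def checkpoint(self):
        # Always write to the older file, i.e. not the one written last.
        target = 1 - self.last_written
        written = self.dirty_pages(target)
        for p in written:
            self.dirty[p][target] = False
        self.last_written = target
        return target, written


cp = DualCheckpoint(num_pages=4, last_written=0)
cp.modify(2)                      # assumed earlier change to page 2
t0 = cp.checkpoint()              # T0: checkpoint goes to .ds1
cp.modify(0)                      # T1: page 0 dirty wrt both files
t1_bits = [list(b) for b in cp.dirty]
t2 = cp.checkpoint()              # T2: .ds0 is oldest, pages 0 and 2 written
cp.modify(1)                      # T3: page 1 modified
t3_ds0, t3_ds1 = cp.dirty_pages(0), cp.dirty_pages(1)
t4 = cp.checkpoint()              # T4: .ds1 is oldest, pages 0 and 1 written
```

After T4, page 1 is still dirty with respect to .ds0, so it will be written again by the next checkpoint; every changed page ends up in both files.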
Persistence – Transaction Logging
High throughput and concurrency
Parallel log manager
• Multi-strand buffer (multiple ring buffers)
• Behaves as a single shared logical buffer
Log files
• Configurable maximum size
• Sequence number (64-bit)
• Old files purged by checkpointing
Highly concurrent
• Concurrent record post to each strand AND
• Flush buffer to disk
Asynchronous and synchronous operation
• Asynchronous (DurableCommits=0) is the default
• Synchronous (DurableCommits=1) is an option
Async/sync configurable at
• Database level
• Connection level
• Transaction level (application API)
Log usage
• Undo and Redo
• Replication, XLA, AWT caching & incremental backups
Log file purge criteria
• No records in file belong to an open transaction
• Changes for all records written to both checkpoint files
• Not required by replication, XLA, AWT or backups
Async
• Many records written at once
• Outside transaction path
• Decoupled from commit
Sync
• Fewer records written at once
• Part of commit
• Write through to media
• Commit blocks until I/O completion
• Group commit optimisation
[Diagram: the log buffer (Strand 1 … Strand N) of the in-memory database is flushed to the current log file]
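The three log file purge criteria combine as a simple conjunction. The sketch below only restates them as a predicate; `LogFileState`, its field names, and `can_purge` are hypothetical and do not correspond to any TimesTen structure or API.

```python
from dataclasses import dataclass


@dataclass
class LogFileState:
    """Hypothetical summary of one transaction log file."""
    has_open_txn_records: bool          # any record belongs to an open transaction
    in_ds0: bool                        # all changes already written to .ds0
    in_ds1: bool                        # all changes already written to .ds1
    needed_by: frozenset = frozenset()  # e.g. {"replication", "XLA", "AWT", "backup"}


def can_purge(f: LogFileState) -> bool:
    # A file is purgeable only when all three criteria hold.
    return (not f.has_open_txn_records) and f.in_ds0 and f.in_ds1 and not f.needed_by


old_file = LogFileState(has_open_txn_records=False, in_ds0=True, in_ds1=True)
held_file = LogFileState(False, True, True, frozenset({"replication"}))
half_checkpointed = LogFileState(False, True, False)
```

Here `can_purge(old_file)` is true, while `held_file` and `half_checkpointed` are retained. The second case illustrates why a lagging replication peer or XLA reader can cause log files to accumulate.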
Persistence – Recovery
Occurs on every ramLoad operation
Most recent checkpoint loaded into memory
• Physical operation
• Can use multiple reader threads (configurable)
Log replay
• From the corresponding checkpoint log record…
• …to end of log
• Mark any indexes that would be modified
Rollback any open transactions
• No commit marker seen
Drop and re-build marked indexes
• Index modifications are not redo logged
• Done in parallel (configurable)
Static checkpoint to oldest checkpoint file
Allow application connections
[Diagram: checkpoint files and log files are loaded into the database shared segment; open transactions are rolled back and marked indexes are dropped and rebuilt before applications connect]
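The load, replay, and rollback steps can be sketched in a few lines. This is a deliberately simplified model under stated assumptions: the database is a plain dict, log records are hypothetical tuples of the form `("update", txn, key, old, new)` or `("commit", txn)`, and index marking and rebuilding, parallelism, and the final static checkpoint are not modelled.

```python
def recover(checkpoint_image, log_records):
    """Rebuild state from a checkpoint image plus the log written after it."""
    db = dict(checkpoint_image)   # 1. load the most recent checkpoint
    committed = set()
    applied = []

    # 2. Redo: replay every logged change from the checkpoint to end of log.
    for rec in log_records:
        if rec[0] == "update":
            _, txn, key, old, new = rec
            db[key] = new
            applied.append(rec)
        elif rec[0] == "commit":
            committed.add(rec[1])

    # 3. Undo: roll back changes of transactions with no commit marker,
    #    newest first, using the logged old values.
    for _, txn, key, old, new in reversed(applied):
        if txn not in committed:
            if old is None:
                db.pop(key, None)   # the row did not exist before
            else:
                db[key] = old
    return db


image = {"a": 1, "b": 2}
log = [
    ("update", "T1", "a", 1, 10),
    ("update", "T2", "b", 2, 20),   # T2 never commits
    ("update", "T2", "c", None, 30),
    ("commit", "T1"),
]
recovered = recover(image, log)
```

In this example T1's change survives and both of T2's changes are undone, so `recovered` is `{"a": 10, "b": 2}`.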
Concurrency
Single level versioning for READ COMMITTED isolation
READ COMMITTED isolation (default)
• Readers don’t block writers and vice versa
• Readers don’t place any locks
INSERT/UPDATE/DELETE
• Creates a new (uncommitted) version of affected rows
• eXclusively locks the row(s)
Reader
• Reads old (committed) version(s)
Old version(s) deleted on COMMIT or ROLLBACK
• In reclaim phase
• New version(s) becomes the only version
SERIALIZABLE isolation
• Readers place Shared row locks
• Writers place eXclusive row locks
• Optimiser may choose table locks instead
– If many rows would be locked
• Can use a hint to prevent lock escalation
[Diagram: table MYTABLE (pKey, columnX) with rows 0 to 6 ("This is row 0" … "This is row 6"). An UPDATE of row 4 creates a new version "New row 4" while a READER still sees the committed version "This is row 4"]
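The row 4 example from the diagram can be mimicked with a minimal model of single level versioning. This is an illustrative sketch only: `VersionedRow` and its methods are hypothetical, a transaction is just a string label, and a writer reading its own uncommitted change is not modelled.

```python
class VersionedRow:
    """At most two versions: the committed one and one uncommitted one."""

    def __init__(self, value):
        self.committed = value
        self.pending = None
        self.writer = None   # transaction holding the exclusive row lock

    def read(self):
        # READ COMMITTED readers take no locks and see the committed version.
        return self.committed

    def write(self, txn, value):
        if self.writer not in (None, txn):
            raise RuntimeError("row is exclusively locked by another transaction")
        self.writer = txn
        self.pending = value

    def commit(self, txn):
        if self.writer == txn:
            self.committed = self.pending   # new version becomes the only version
            self.pending = None
            self.writer = None

    def rollback(self, txn):
        if self.writer == txn:
            self.pending = None             # uncommitted version is discarded
            self.writer = None


row4 = VersionedRow("This is row 4")
row4.write("T1", "New row 4")
before_commit = row4.read()   # reader is not blocked by the writer
row4.commit("T1")
after_commit = row4.read()
```

A second writer calling `row4.write("T2", ...)` while T1 is still open raises an error in this model; a real engine would instead make it wait for the lock or time out.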
Indexing
Hash Indexes, Lockless B-Trees
Hash Indexes
• Best performance for
- Full key equality lookups
- Full key equijoins
• Can’t be used for
- Range lookups
- Key prefix lookups
• Must be sized accurately
- Too small: poor performance and concurrency
- Too large: wastes memory
Memory optimized B-Tree Indexes
• Default index type
• Good all-round performance
• Self balancing
• Lockless design for high concurrency
– No locks or latches for reads
– Fine grained latching for writes
CREATE [UNIQUE] HASH INDEX MyHashIndex
    ON MyTable (somecol) PAGES = CURRENT;
CREATE [UNIQUE] INDEX MyRangeIndex
    ON MyTable (somecol);
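Why a hash index cannot serve range lookups while an ordered index can is easy to see with ordinary data structures. The sketch below uses a Python dict as a stand-in for a hash index and a sorted key list with `bisect` as a stand-in for an ordered (B-Tree style) index; it illustrates the access patterns only and says nothing about TimesTen's actual index implementation. The sample keys and the helper `range_lookup` are made up.

```python
import bisect

rows = {17: "q", 3: "a", 42: "z", 8: "c"}   # key -> row payload

# Hash-style access: the full key must be known, lookup is by equality.
hash_index = dict(rows)
exact = hash_index.get(17)

# Ordered access: keys are kept sorted, so a contiguous key range can be
# located with two binary searches.
sorted_keys = sorted(rows)


def range_lookup(lo, hi):
    """Return keys k with lo <= k <= hi, in key order."""
    i = bisect.bisect_left(sorted_keys, lo)
    j = bisect.bisect_right(sorted_keys, hi)
    return sorted_keys[i:j]


in_range = range_lookup(5, 20)
```

With the hash-style structure, answering the same range query would mean probing every possible key or scanning all entries, which is why the slide restricts hash indexes to full key equality lookups and equijoins.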
Single-Image Database with High Availability and Elasticity
Distributed, Shared Nothing, In-Memory Database
• Appears to applications as a single database
- Not as a sharded database
• Online scale-out and scale-in
- Data automatically redistributed
- Workload automatically uses new elements
• Built-in HA via multiple fully-active copies
– Copies automatically kept in sync
– Automatic client failover
• Parallel SQL execution
• Same features as TimesTen Classic
- Mostly…
[Diagram: database elements A, B, C, D and their copies A’, B’, C’, D’]
Scaleout Table Data Distribution Options
• DISTRIBUTE large tables by consistent hash
– Distribute CUSTOMER rows across all elements by hash of Customer ID
• COLOCATE child table rows with parent table row to maximize locality
– Place ORDERS rows in same element along with corresponding CUSTOMER row
• DUPLICATE small read-mostly tables on all elements for maximum locality
– Duplicate the PRODUCT list on all elements
[Diagram: servers hosting Element 1 … Element N; each element holds part of CUST (Distribute), the matching ORDERS rows (Colocate), and a full copy of PRODUCTS (Duplicate)]
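The three placement options can be sketched as three small functions. This is a toy illustration under explicit assumptions: a plain hash-modulo over a fixed `NUM_ELEMENTS` stands in for the consistent hash mentioned on the slide (real consistent hashing limits data movement when elements are added, which modulo does not), and all function names and the row shapes are hypothetical.

```python
import hashlib

NUM_ELEMENTS = 4   # assumed number of database elements


def element_for(key):
    """Map a distribution key to an element (hash-modulo stand-in)."""
    digest = hashlib.sha256(str(key).encode()).digest()
    return int.from_bytes(digest[:8], "big") % NUM_ELEMENTS


def place_customer(cust_id):
    # DISTRIBUTE: one element, chosen by hash of the customer ID.
    return [element_for(cust_id)]


def place_order(order):
    # COLOCATE: an order goes wherever its parent customer row lives.
    return [element_for(order["cust_id"])]


def place_product(product):
    # DUPLICATE: small read-mostly rows are stored on every element.
    return list(range(NUM_ELEMENTS))


order = {"order_id": 1001, "cust_id": 42}
same_element = place_order(order) == place_customer(42)
product_copies = place_product({"product_id": 7})
```

Because orders follow their customer, a join between CUSTOMER and ORDERS for one customer, and any lookup into PRODUCTS, can be answered inside a single element without a network hop.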
TimesTen Scaleout Architecture Overview
[Diagram: data instances RS1_DSG1 … RS4_DSG2 (replica sets 1 to 4, each spanning data space groups 1 and 2), linked by 2PC over the internal network; management instances MGMT1 and MGMT2; ZooKeeper for membership management; repository storage host REPO1; SQL arrives over the external network. Management links are labelled SSH & SCP, MOUNT / SCP, and SSH.]
Architecture: Scaleout Instances
[Diagram: a TimesTen Scaleout data instance. Direct mode applications and tools (application code linked with the TimesTen Data Manager library) and client-server applications and tools (application plus TT Client, connecting over the network to the server daemon and server proxies) access the database element(s); access paths are labelled "Millionths of a Second". The data instance also contains the TimesTen daemon, data store subdaemon(s), log files, checkpoint files, epoch files, replication agent(s), and grid workers, and is connected to the other data instances and to the management instance(s). Inside the element: data tables, indexes and system tables; locks, cursors, temp indexes, command cache, …; the log buffer; metadata, tables, indexes, views, sequences, …]
• Management Instance
– Stores grid model
– Processes management functions
– Very similar to a ‘classic’ instance
– 1 or 2 per Scaleout grid
• Data Instance
– Hosts database elements
– Processes SQL and Txns
– As many as you want
High Availability
K-safety, Synchronous, All Replicas Active
• Hosts assigned to Data Space Groups
– DSG = rack, Cloud AD etc.
• Replica sets created automatically
– One replica per DSG
• All replicas are active for reads and writes
– All replicas are ‘equal’
• Always consistent
– 2 phase commit (synchronous)
– Reduced 2PC durable writes when K > 1
• Queries and transactions can span all replica sets
– Grid aware optimizer
– Grid aware SQL engine
– Sophisticated transaction manager
[Diagram: Replica Sets 1 to 4 (A/A’, B/B’, C/C’, D/D’), each with one replica in Data Space Group 1 and one in Data Space Group 2]
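The rule "one replica per DSG" can be expressed as pairing up hosts across data space groups. The sketch below is a hypothetical illustration of that rule only: `build_replica_sets` is not a TimesTen function, the host names are taken from the diagram labels, and pairing hosts by list position is an assumption (the slide says only that replica sets are created automatically).

```python
def build_replica_sets(hosts_by_dsg):
    """Form replica sets that take exactly one host from each data space group."""
    groups = list(hosts_by_dsg.values())
    if len({len(g) for g in groups}) != 1:
        raise ValueError("every data space group needs the same number of hosts")
    # zip pairs the i-th host of every group into the i-th replica set.
    return [tuple(members) for members in zip(*groups)]


hosts_by_dsg = {
    "DSG1": ["A", "B", "C", "D"],
    "DSG2": ["A'", "B'", "C'", "D'"],
}
replica_sets = build_replica_sets(hosts_by_dsg)
copies_per_row = len(hosts_by_dsg)   # number of copies of each row
```

With two data space groups every row has two copies, and losing a whole group (for example one rack) still leaves one full copy of the data in the other.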
Elastic Scalability
Expand and shrink the database online, based on business needs
• Adding replica sets
  - Prepare
    - Deploy new hosts/installations/instances
    - Typically one command per host
  - Redistribute data when ready
    - One command
  - Workload uses the new elements
    - Connections start to use new elements
• Removing replica sets
  - Redistribute data
  - Tear down old hosts
[Diagram: Replica Sets 1 to 5 (A/A’ through E/E’) across Data Space Group 1 and Data Space Group 2]
Top Performance Tips for TimesTen
General
• Optimise data types
– Use native integer types where appropriate (TT_TINYINT, TT_SMALLINT, …)
– Inline versus out of line for variable length columns (VARCHAR2, NVARCHAR2, VARBINARY)
• Indexes and optimiser statistics
– Hash versus range indexes: use the index advisor!
– Keep optimizer stats up to date
• Optimise OS configuration
– Shared memory, semaphores
– Huge pages, locked memory
• Optimise database configuration
– Configure database parameters depending on workload, hardware etc.
– If using HDD storage, separate checkpoint files and transaction logs to avoid I/O contention
– Use huge pages (or lock database in memory if can’t use huge pages)
Top Performance Tips for TimesTen
Scaleout
• Use the fastest, lowest latency network you can get
– 10 GbE is absolute minimum
– Faster is better!
• Choose hardware wisely
– Fewer, larger hosts better than many small hosts
  • Fewer network hops
  • Leverages TimesTen’s excellent vertical scalability
• Optimise table distributions
– Distribution type, distribution keys (hash)
• Global versus local indexes
– Faster access to data
– Slower DML
– Tradeoffs!
Top Performance Tips for TimesTen
Applications
• Use parameterized SQL
– Facilitates use of (shared) prepared statements
• Prepare once, execute many times
– Hard parse >> soft parse >>>> no parse
• Bind once (ODBC and OCI)
• Minimise type conversions and character set conversions
• Leverage batch operations, especially for INSERT
– Multiple of 256 rows => fast path insert
• Prefer direct mode connectivity where you can
• When using client/server connectivity
– Use PL/SQL to reduce network round trips
– Use OCI ‘commit on success’ option where appropriate
• Keep transactions small/short
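Several of the application tips (parameterized SQL, reusing one statement, batching inserts in multiples of 256 rows, short transactions) fit into one small pattern. The sketch below uses Python's built-in `sqlite3` module purely as a stand-in so that it runs anywhere; it is not TimesTen, it does not reproduce TimesTen's fast path insert, and the table `t`, the helper `batches`, and the row counts are made up for illustration.

```python
import sqlite3
from itertools import islice

BATCH_ROWS = 256   # batch size taken from the "multiple of 256 rows" tip


def batches(rows, size):
    """Yield successive lists of at most `size` rows."""
    it = iter(rows)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, val TEXT)")

# One parameterized statement text, reused for every batch: values are
# bound as parameters instead of being concatenated into the SQL.
INSERT_SQL = "INSERT INTO t (id, val) VALUES (?, ?)"

rows = ((i, f"row {i}") for i in range(1024))
batch_count = 0
for chunk in batches(rows, BATCH_ROWS):
    conn.executemany(INSERT_SQL, chunk)   # one batched call per 256 rows
    conn.commit()                         # keep each transaction short
    batch_count += 1

row_count = conn.execute("SELECT COUNT(*) FROM t").fetchone()[0]
```

With an ODBC, JDBC, or OCI application the same shape applies: prepare the INSERT once, bind arrays of 256 rows (or a multiple), execute, and commit at sensible intervals.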
When to use TimesTen? Classic or Scaleout?
Primarily for OLTP, Scaleout useful for Analytics too
TimesTen Classic
• Very low, consistent latency
– Microsecond response times
• High throughput
– Millions of TPS on a commodity server
– 10s of millions of queries per second
• Single server or replicated
– Optional read-only subscribers
– Vertical scalability for writes
– Vertical and horizontal for reads
• Highly available (99.999% possible)
• Cache functionality for Oracle DB EE
– Read-only and read-write caching
TimesTen Scaleout
• Good latency
– Low millisecond response times
• Very high throughput
– 100s of millions of TPS
– Billions of queries per second
• Vertical and horizontal scalability
– For both reads and writes
• Easy HA (99.999% possible)
• Elastic scalability
– Add or remove database elements online
Summary
• TimesTen is a sophisticated relational In-Memory Database
• Two deployment modes: Classic and Scaleout
• Focus is primarily on OLTP type workloads
– Ultra low latency: Classic
– Massive throughput: Scaleout
• Standard SQL, PL/SQL and APIs
• Fully ACID
• Highly available
More Info…
• TimesTen OTN Portal (http://www.oracle.com/technetwork/database/database-technologies/timesten/overview/index.html)
– Product Information
  • Presentations, use cases, whitepapers, FAQs, …
– Software Downloads
– Product Documentation
– Scaleout Demo / Learning VM download
• TimesTen GitHub Quickstart and Samples (https://github.com/oracle/oracle-timesten-samples)
• Contact me! ([email protected])