Application Performance: Database-related Problems

Post on 18-Jan-2017

www.luxoft.com

APPLICATION PERFORMANCE: DATABASE-RELATED PROBLEMS
Evgeniy Khyst
26.04.2016

Application Performance: Database-related Problems

● Application performance;
● Common performance problems and their solutions;
● Database-related problems;
● Lock contention;
● Locking mechanism;
● Transaction isolation level;
● URL shortener example;
● Hi/Lo algorithms;
● Payment system example.

Application Performance

● Key performance metrics:
  - Request processing time;
  - Throughput.

● Poor performance:
  - Long time to process single requests;
  - Low number of requests processed per second.

Request Processing Time

Request processing time = 4 seconds

Throughput

Throughput = 1/4 req/sec = 15 req/min

Throughput

Throughput = 3/4 req/sec = 45 req/min

Throughput

Throughput = 10/4 req/sec = 150 req/min
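
The three throughput figures follow one formula. The sketch below assumes that the numerator is the number of requests processed in parallel and that each request takes the 4 seconds shown earlier; the class and method names are illustrative only.

```java
// Hypothetical helper; the class and method names are illustrative only.
class Throughput {
    /** Requests completed per minute when `parallel` requests run at once, each taking `seconds`. */
    static double perMinute(int parallel, double seconds) {
        return parallel / seconds * 60.0;
    }

    public static void main(String[] args) {
        System.out.println(perMinute(1, 4.0));   // 15.0 req/min
        System.out.println(perMinute(3, 4.0));   // 45.0 req/min
        System.out.println(perMinute(10, 4.0));  // 150.0 req/min
    }
}
```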

Common Performance Problems and Their Solutions

● Database-related problems;
● JVM performance problems;
● Application specific performance problems;
● Network-related problems.

Database-related Performance Problems

● Query execution time is too long;
● Too many queries per single business function;
● Database connection management problems.

Query Execution Time is Too Long

● Missing indexes;
● Slow SQL queries (sub-queries, too many JOINs, etc.);
● Slow SQL queries generated by ORM;
● Not optimal JDBC fetch size;
● Not parameterized statements for queries;
● Lack of proper data caching;
● Lock contention.

Missing Indexes

To find out which indexes to create, look at the query execution plan.

In Oracle Database this is done as follows:

    EXPLAIN PLAN FOR SELECT isbn FROM book;

    SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY());

TABLE ACCESS FULL

● A full table scan reads every row of the table sequentially and checks each row against the query condition;

● In most cases a full table scan is the slowest way to access a table;

● Create the missing indexes so that the database searches by index instead of performing a full table scan.

Slow SQL Queries

● Slow SQL queries (sub-queries, too many JOINs, etc.):
  Solution: Rewrite the query.

● Slow SQL queries generated by ORM:
  - JPQL/HQL and Criteria API queries are translated to SQL;
  Solutions:
  - Rewrite the JPQL/HQL or Criteria API queries;
  - Replace them with a plain SQL query.

Not Optimal JDBC Fetch Size

JDBC lets you specify the number of rows fetched per database round trip for a query; this number is called the fetch size.

Solutions:
● java.sql.Statement.setFetchSize(rows);
● hibernate.jdbc.fetch_size property.

Not Parameterized Statements for Queries

When a database receives an SQL statement, it:
● parses the statement and looks for syntax errors;
● generates the access plan (checks which indexes can be used, etc.);
● executes the statement.

Problem: Access plan generation takes CPU power.

Not Parameterized Statements for Queries

● The database caches the computed access plan;
● JEE application servers cache PreparedStatement instances.

Solution: Reusing the previous access plan or PreparedStatement saves CPU power.

Not Parameterized Statements for Queries

The entire statement text is the cache key.

    SELECT * FROM tbl WHERE name='a'

and

    SELECT * FROM tbl WHERE name='b'

will have different cache entries, because name='b' differs from the cached name='a'.

Not Parameterized Statements for Queries

    for (String name : names) {
        Statement stmt = conn.createStatement();
        // Concatenation: every distinct name produces a different SQL text
        // (and leaves the query open to SQL injection)
        ResultSet rs = stmt.executeQuery("SELECT * FROM tbl WHERE name = '" + name + "'");
        /* … */
    }

The cache won't be used; a new access plan is computed on each iteration.

    PreparedStatement ps = conn.prepareStatement("SELECT * FROM tbl WHERE name = ?");
    for (String name : names) {
        ps.setString(1, name);
        ResultSet rs = ps.executeQuery();
        /* … */
    }

The database reuses the access plan for the statement parameterized with '?'.
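
Why the statement text matters can be shown with a toy cache keyed by the full SQL string. This is only a sketch: PlanCache and its methods are hypothetical and do not model any real database or driver.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model of a statement cache keyed by the full SQL text.
// PlanCache and its methods are hypothetical, not a real driver or database API.
class PlanCache {
    private final Map<String, String> plans = new HashMap<>();
    private int misses = 0;

    /** Returns the cached plan for this SQL text, "computing" a new one on a miss. */
    String planFor(String sql) {
        return plans.computeIfAbsent(sql, key -> {
            misses++;                       // a new access plan had to be generated
            return "plan#" + misses;
        });
    }

    int misses() {
        return misses;
    }

    public static void main(String[] args) {
        String[] names = {"a", "b", "c"};

        PlanCache literal = new PlanCache();
        for (String name : names) {
            literal.planFor("SELECT * FROM tbl WHERE name = '" + name + "'");
        }

        PlanCache parameterized = new PlanCache();
        for (String name : names) {
            parameterized.planFor("SELECT * FROM tbl WHERE name = ?");
        }

        System.out.println("literal misses: " + literal.misses());              // 3
        System.out.println("parameterized misses: " + parameterized.misses());  // 1
    }
}
```

With literals every iteration adds a new cache entry; with the '?' placeholder all iterations share one entry.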

Lack of Proper Data Caching

Solutions:
● Enable ORM second-level cache;
● Enable ORM query cache;
● Implement custom cache.

Lock Contention

Operations wait a long time to obtain a lock because of high lock contention.

Solution: Revise application logic and implementation:
● Update asynchronously;
● Replace updates with inserts (inserts are not blocking).

Too Many Queries per Single Business Function

● Insert/update queries executed in a loop;
● "SELECT N+1" problem;
● Solution: reduce the number of calls hitting the database.

Insert/Update Queries Executed in a Loop

● Use JDBC batch (keep batch size less than 1000);
● hibernate.jdbc.batch_size property;
● Periodically flush changes and clear Session/EntityManager to control first-level cache size.

JDBC Batch Processing

    PreparedStatement preparedStatement = connection.prepareStatement(
        "UPDATE book SET title=? WHERE isbn=?");

    preparedStatement.setString(1, "Patterns of Enterprise Application Architecture");
    preparedStatement.setString(2, "007-6092019909");
    preparedStatement.addBatch();

    preparedStatement.setString(1, "Enterprise Integration Patterns");
    preparedStatement.setString(2, "978-0321200686");
    preparedStatement.addBatch();

    // Submits both updates to the database as one batch
    int[] affectedRecords = preparedStatement.executeBatch();

Periodic flush and clear with a Hibernate Session:

    for (int i = 0; i < 100000; i++) {
        Book book = new Book(.....);
        session.save(book);
        if (i > 0 && i % 20 == 0) { // 20, same as the JDBC batch size
            // flush a batch of inserts and release memory:
            session.flush();
            session.clear();
        }
    }

"SELECT N+1" Problem

● The first query selects root entities only, and each associated collection is then selected with an additional query.

● So the persistence provider generates N+1 SQL queries, where N is the number of root entities in the result list of the user's query.
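
The query count can be illustrated with a toy in-memory "database" that only counts calls. This is a sketch under stated assumptions: CountingDb, its methods and the sample data are hypothetical, and no JPA or Hibernate API is involved.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Toy in-memory "database" that only counts how many queries are issued.
// CountingDb and its method names are hypothetical, not a JPA or Hibernate API.
class CountingDb {
    private final Map<String, List<String>> booksByAuthor;
    int queries = 0;

    CountingDb(Map<String, List<String>> booksByAuthor) {
        this.booksByAuthor = booksByAuthor;
    }

    /** One query returning the root entities only. */
    List<String> findAuthors() {
        queries++;
        return new ArrayList<>(booksByAuthor.keySet());
    }

    /** One query per call, as a lazily loaded collection would issue. */
    List<String> findBooks(String author) {
        queries++;
        return booksByAuthor.get(author);
    }

    /** One query returning the roots together with their collections. */
    Map<String, List<String>> findAuthorsWithBooks() {
        queries++;
        return booksByAuthor;
    }

    public static void main(String[] args) {
        Map<String, List<String>> data = new LinkedHashMap<>();
        data.put("author-1", List.of("book-1"));
        data.put("author-2", List.of("book-2"));
        data.put("author-3", List.of("book-3"));

        CountingDb lazy = new CountingDb(data);
        for (String author : lazy.findAuthors()) {
            lazy.findBooks(author);          // one extra query per root entity
        }
        System.out.println("lazy loading: " + lazy.queries + " queries");  // 3 roots + 1 = 4

        CountingDb fetched = new CountingDb(data);
        fetched.findAuthorsWithBooks();
        System.out.println("single fetch: " + fetched.queries + " query"); // 1
    }
}
```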

"SELECT N+1" Problem

Solutions:
● Use a different fetching strategy or an entity graph;
● Make child entities aggregate roots and use DAO methods to fetch them:
  - Replace bidirectional one-to-many mapping with unidirectional;
● Enable second-level and query cache.

Reduce Number of Database Calls

Solutions:
● Use Hi/Lo algorithms;
● Enable ORM second-level cache;
● Enable ORM query cache;
● Implement custom cache.

Database Connection Management Problems

● Application is using too many DB connections:
  - Application does not close connections after use.
    Solution: Close all connections after use.
  - The DB cannot handle the number of connections the application uses.
    Solution: Use connection pooling.

● Application waits too long to get a connection from the pool.
  Solution: Increase pool size.

JVM Performance Problems

Excessive JVM garbage collection slows down the application.

Solutions:
● Analyze garbage collector logs:
  - Send GC data to a log file and enable GC log rotation:
    -Xloggc:gc.log -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=5 -XX:GCLogFileSize=1M -XX:+PrintGCTimeStamps

● Tune GC:
  - Use Garbage-First Collector: -XX:+UseG1GC

Application Specific Performance Problems

Resource-consuming computations:
● Algorithms with complexity O(N^2), O(2^N);
● Asymmetric RSA encryption;
● Bcrypt hashing during authentication;
● Etc.

Solution: Horizontal scalability. Increase the number of instances capable of processing requests and balance the load (create a cluster).

Network-related Problems

● Network latency;
● Not configured timeout:
  - mail.smtp.connectiontimeout: socket connection timeout. Default is infinite timeout.
  - mail.smtp.timeout: socket read timeout. Default is infinite timeout.

Reducing Lock Contention

● Database-related problems
  - Query execution time is too long
    • Lock contention

Solutions:
● Use Hi/Lo algorithms;
● Update asynchronously;
● Replace updates with inserts.

Locking Mechanism

Locks are mechanisms that prevent destructive interaction between transactions accessing the same resource.

In general, multi-user databases use some form of data locking to solve the problems associated with:
● data concurrency,
● consistency,
● integrity.

Isolation Levels vs Locks

● Transaction isolation level does not affect the locks that are acquired to protect data modifications.

● A transaction always gets an exclusive lock on any data it modifies and holds that lock until the transaction completes, regardless of the isolation level set for that transaction.

● For read operations transaction isolation levels primarily define the level of protection from the effects of modifications made by other transactions.

Preventable Read Phenomena

● Dirty reads - A transaction reads data that has been written by another transaction that has not been committed yet.

● Nonrepeatable reads - A transaction rereads data it has previously read and finds that another committed transaction has modified or deleted the data.

● Phantom reads - A transaction reruns a query returning a set of rows that satisfies a search condition and finds that another committed transaction has inserted additional rows that satisfy the condition.

Standard Transaction Isolation Levels

● Read uncommitted
● Read committed
● Repeatable reads
● Serializable

Isolation Levels vs Read Phenomena

Isolation level     Dirty reads     Nonrepeatable reads     Phantom reads
Read uncommitted    Possible        Possible                Possible
Read committed      Not possible    Possible                Possible
Repeatable reads    Not possible    Not possible            Possible
Serializable        Not possible    Not possible            Not possible

Default Isolation Level

Read committed is the default isolation level (for example, in Oracle Database).

Read Committed Isolation Level

In read committed, reads are not blocking.

Read Committed Isolation Level

Conflicting writes in read committed transactions.

Pessimistic and Optimistic Locking

Pessimistic and optimistic locking are concurrency control mechanisms.

Pessimistic locking is a strategy in which you lock the record when reading it and then modify it:

    SELECT name FROM tbl FOR UPDATE;
    UPDATE tbl SET name = 'new value';

Optimistic locking is a strategy in which you read the record together with its version number and then check that version when updating:

    SELECT name, version FROM tbl;

    UPDATE tbl SET name = 'new value', version = version + 1 WHERE version = :version;

Pessimistic and Optimistic Locking

● Pessimistic locking prevents lost updates and makes updates serial (FIFO), reducing throughput;
● Optimistic locking just prevents lost updates;
● If the version check in optimistic locking fails, the read and update queries should be re-executed;
● Optimistic locking reduces the time the lock is held and sometimes increases throughput.
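
The version check and the re-execution on failure can be sketched in memory. This is an illustration only: VersionedRow stands in for a single table row, its names are hypothetical, and it is neither JPA @Version handling nor real SQL.

```java
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.UnaryOperator;

// In-memory stand-in for one table row; not JPA @Version and not real SQL.
// VersionedRow and its members are hypothetical names used for illustration.
class VersionedRow {
    /** Immutable snapshot of the row: value plus version number. */
    static final class Snapshot {
        final String name;
        final long version;

        Snapshot(String name, long version) {
            this.name = name;
            this.version = version;
        }
    }

    private final AtomicReference<Snapshot> row;

    VersionedRow(String name) {
        row = new AtomicReference<>(new Snapshot(name, 0));
    }

    /** Counterpart of: SELECT name, version FROM tbl */
    Snapshot read() {
        return row.get();
    }

    /**
     * Counterpart of: UPDATE tbl SET name = :name, version = version + 1 WHERE version = :version.
     * Returns true if the row was updated, false if the version check failed.
     */
    boolean update(String newName, long expectedVersion) {
        Snapshot current = row.get();
        if (current.version != expectedVersion) {
            return false;
        }
        return row.compareAndSet(current, new Snapshot(newName, expectedVersion + 1));
    }

    /** Re-executes the read and the update until the version check succeeds. */
    void updateWithRetry(UnaryOperator<String> change) {
        while (true) {
            Snapshot seen = read();
            if (update(change.apply(seen.name), seen.version)) {
                return;
            }
        }
    }
}
```

An update that carries a stale version returns false instead of overwriting the newer value, which is how the lost update is avoided without holding a lock between the read and the write.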

URL Shortener Example

Requirements:
● Receives a URL and returns a "shortened" version;
● E.g. post "http://github.com" to "http://url-shortener/s/" and get back "http://url-shortener/s/2Bi";
● The shortened URL can be resolved to the original URL. E.g. "http://url-shortener/s/2Bi" will return "http://github.com";
● Shortened URLs that have not been accessed for longer than some specified amount of time should be deleted.

URL Shortener Example

● Each time a URL is submitted, a new record is inserted into the database;
● Insert operations do not introduce locks in the database;
● A database sequence is used for primary key generation;
● The Hi/Lo algorithm reduces the number of database hits to improve performance.

URL Shortener Example

● The original URL's primary key is converted to radix 62:
  - The radix 62 alphabet contains digits and lower- and upper-case letters;
  - 10000 in radix 10 = 2Bi in radix 62;
● The string identifying the original URL is converted back to radix 10 to get the primary key value, and the original URL can be found by ID.
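
The conversion in both directions can be sketched as follows. The class name is hypothetical, and the alphabet order (digits, then lower-case, then upper-case letters) is an assumption; it is the order under which 10000 maps to 2Bi.

```java
// Minimal radix-62 codec; the class name is hypothetical.
// Assumes the alphabet order digits, lower-case, upper-case, which is the
// order that maps 10000 to "2Bi" as in the slide's example.
class Radix62 {
    private static final String ALPHABET =
        "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ";

    /** Converts a non-negative primary key to its radix-62 string. */
    static String encode(long id) {
        if (id < 0) {
            throw new IllegalArgumentException("id must be non-negative");
        }
        if (id == 0) {
            return "0";
        }
        StringBuilder sb = new StringBuilder();
        while (id > 0) {
            sb.append(ALPHABET.charAt((int) (id % 62)));  // least significant digit first
            id /= 62;
        }
        return sb.reverse().toString();
    }

    /** Converts a radix-62 string back to the primary key. */
    static long decode(String s) {
        long id = 0;
        for (char c : s.toCharArray()) {
            int digit = ALPHABET.indexOf(c);
            if (digit < 0) {
                throw new IllegalArgumentException("invalid character: " + c);
            }
            id = id * 62 + digit;
        }
        return id;
    }

    public static void main(String[] args) {
        System.out.println(encode(10000));   // 2Bi
        System.out.println(decode("2Bi"));   // 10000
    }
}
```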

URL Shortener Example

E.g. URL "http://github.com/" shortened to "http://url-shortener/s/2Bi":
● A new record representing the "shortened" URL is inserted into the database with id 10000 for the original URL "http://github.com/";
● Id 10000 is converted to radix 62: 2Bi.

URL Shortener Example

● Each time a shortened URL is resolved, the last-view timestamp is updated in the database and the total-views column is incremented;

● These updates should be asynchronous so that lock contention does not reduce performance;

● Absence of update operations gives the application better scalability and throughput.

Update Asynchronously

● When a URL is resolved, a JMS message is sent to a queue;
● The application consumes messages from the queue and updates records in the database;
● During URL resolving there are no update operations.
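
The flow can be sketched in memory. Here a BlockingQueue stands in for the JMS queue and plain maps stand in for database tables; all names are hypothetical, and the consumer is called explicitly (rather than running on its own thread) to keep the example deterministic.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// In-memory sketch: a BlockingQueue stands in for the JMS queue and two maps
// stand in for database tables. All names here are hypothetical.
class AsyncViewCounter {
    private final Map<Long, String> urls = new HashMap<>();     // id -> original URL
    private final Map<Long, Integer> views = new HashMap<>();   // id -> total views
    private final BlockingQueue<Long> resolvedEvents = new LinkedBlockingQueue<>();

    void insert(long id, String url) {
        urls.put(id, url);
    }

    /** Request path: a read plus a queued message, no update of the record. */
    String resolve(long id) {
        String url = urls.get(id);
        if (url != null) {
            resolvedEvents.add(id);
        }
        return url;
    }

    /** Consumer side: applies the queued view events and returns how many were processed. */
    int drainEvents() {
        int processed = 0;
        Long id;
        while ((id = resolvedEvents.poll()) != null) {
            views.merge(id, 1, Integer::sum);
            processed++;
        }
        return processed;
    }

    int viewsOf(long id) {
        return views.getOrDefault(id, 0);
    }

    public static void main(String[] args) {
        AsyncViewCounter shortener = new AsyncViewCounter();
        shortener.insert(10000L, "http://github.com");
        shortener.resolve(10000L);
        shortener.resolve(10000L);
        System.out.println("views before consumer runs: " + shortener.viewsOf(10000L)); // 0
        shortener.drainEvents();
        System.out.println("views after consumer runs: " + shortener.viewsOf(10000L));  // 2
    }
}
```

The view counter is eventually consistent: it lags behind until the consumer has processed the queued messages.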

Hi/Lo Algorithms

Using a Hi/Lo algorithm lets different application nodes generate IDs without blocking each other.

Hi/Lo Algorithms

JPA mapping:

    @SequenceGenerator(name = "MY_SEQ", sequenceName = "MY_SEQ", allocationSize = 50)

allocationSize = N: fetch the next value from the database once in every N persist calls and increment the value locally (in memory) in between.

Sequence DDL:

    CREATE SEQUENCE MY_SEQ INCREMENT BY 50 START WITH 50;

INCREMENT BY should match allocationSize.
START WITH should be greater than or equal to allocationSize.
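
The block allocation idea can be sketched as below. This is a simplified illustration, not the persistence provider's actual optimizer: HiLoGenerator is a hypothetical class, an AtomicLong stands in for MY_SEQ, and each fetched sequence value is assumed to be the upper bound of a block of allocationSize ids.

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.LongSupplier;

// Simplified sketch of block allocation; this is not Hibernate's optimizer code.
// HiLoGenerator and the AtomicLong "sequence" are hypothetical stand-ins.
class HiLoGenerator {
    private final LongSupplier sequence;   // one call = one database round trip
    private final int allocationSize;
    private long next = 1;                 // next id to hand out
    private long upperBound = 0;           // last id of the current block; 0 forces a first fetch

    HiLoGenerator(LongSupplier sequence, int allocationSize) {
        this.sequence = sequence;
        this.allocationSize = allocationSize;
    }

    synchronized long nextId() {
        if (next > upperBound) {
            upperBound = sequence.getAsLong();        // e.g. 50, 100, 150, ...
            next = upperBound - allocationSize + 1;   // e.g. 1, 51, 101, ...
        }
        return next++;
    }

    public static void main(String[] args) {
        AtomicLong dbSequence = new AtomicLong(0);    // behaves like INCREMENT BY 50 START WITH 50
        HiLoGenerator node = new HiLoGenerator(() -> dbSequence.addAndGet(50), 50);
        for (int i = 0; i < 120; i++) {
            node.nextId();
        }
        // 120 ids needed only 3 sequence calls (blocks ending at 50, 100 and 150).
        System.out.println("sequence value: " + dbSequence.get());
    }
}
```

Two nodes that share the same sequence receive disjoint blocks, so they only touch the database once per block instead of once per id.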

Payment System Example

Requirements:
● Users can add funds to their accounts (add funds);
● Users can pay shops with funds from their accounts (payment);
● Users and shops can withdraw money from their accounts (withdraw funds);
● The account balance must always be up to date.

Simple Solution 1

● Store the account balance in a table and update it on each operation.

● Advantage: Simple

Simple Solution 1 - Data Model

Table ACCOUNT_BALANCE

ACCOUNT_ID BALANCE

Simple Solution 1 - Queries

UPDATE ACCOUNT_BALANCE SET BALANCE = BALANCE + :amount WHERE ACCOUNT_ID = :account

SELECT ACCOUNT_ID, BALANCE FROM ACCOUNT_BALANCE WHERE ACCOUNT_ID = :account

Simple Solution 1 - Problems

● Update operations introduce locks;
● During Christmas holidays users can make hundreds of payments simultaneously;
● Due to lock contention payments will be slow;
● The system has low throughput.

Simple Solution 2

● Do not store account balance at all;
● Store details of each transaction;
● Calculate balance dynamically based on transaction log;
● Advantages:
  - Still simple enough;
  - No update operations at all.

Simple Solution 2 - Data Model

Table TRANSACTION_LOG

TX_ID TX_TYPE TX_STATUS TX_DATE ACCOUNT_ID TX_AMOUNT

Simple Solution 2 - Queries

● Payment and withdrawal are 2-step operations:
  - Authorization step;
  - Fulfillment step;
● First, the authorization step is done in a separate transaction;
● Next, the balance check and the fulfillment step are done in another transaction.

Simple Solution 2 - Queries

    -- Authorization in new transaction
    INSERT INTO TRANSACTION_LOG(TX_ID, TX_TYPE, TX_STATUS, TX_DATE, ACCOUNT_ID, TX_AMOUNT)
    VALUES(:id, :type, 'AUTHORIZED', :date, :account, :amount)

    -- Balance check and fulfillment in new transaction
    SELECT ACCOUNT_ID, SUM(TX_AMOUNT) AS BALANCE
    FROM TRANSACTION_LOG
    WHERE ACCOUNT_ID = :account
    GROUP BY ACCOUNT_ID

    UPDATE TRANSACTION_LOG SET TX_STATUS = 'FULFILLED' WHERE TX_ID = :id

Simple Solution 2 - Problems

● Users can make thousands of transactions per day;
● During Christmas holidays users can make thousands of payments per hour;
● The number of transactions grows continuously;
● The more records in the TRANSACTION_LOG table, the slower the requests.

Better Solution

● Store yesterday's balance in a table;
● Update the account balance once a day in the background;
● Store details of each transaction;
● Calculate the balance dynamically from yesterday's balance and the transactions made today, taken from the transaction log.

Better Solution - Data Model

Table ACCOUNT_BALANCE

ACCOUNT_ID BALANCE_DATE BALANCE

Table TRANSACTION_LOG

TX_ID TX_TYPE TX_STATUS TX_DATE ACCOUNT_ID TX_AMOUNT

Better Solution - Queries

    -- Authorization in new transaction
    INSERT INTO TRANSACTION_LOG(TX_ID, TX_TYPE, TX_STATUS, TX_DATE, ACCOUNT_ID, TX_AMOUNT)
    VALUES(:id, :type, 'AUTHORIZED', :date, :account, :amount)

    -- Balance check and fulfillment in new transaction
    UPDATE TRANSACTION_LOG SET TX_STATUS = 'FULFILLED' WHERE TX_ID = :id

    -- Executed once a day at midnight
    UPDATE ACCOUNT_BALANCE
    SET BALANCE = BALANCE + :transactionLogSum, BALANCE_DATE = :lastTransactionLogDate
    WHERE ACCOUNT_ID = :account

Better Solution - Queries

    SELECT ACCOUNT_ID, BALANCE_DATE, BALANCE AS CACHED_BALANCE
    FROM ACCOUNT_BALANCE
    WHERE ACCOUNT_ID = :account

    SELECT ACCOUNT_ID, MAX(TX_DATE) AS LAST_TX_LOG_DATE, SUM(TX_AMOUNT) AS TX_LOG_SUM
    FROM TRANSACTION_LOG
    WHERE ACCOUNT_ID = :account AND TX_DATE > :balanceDate
    GROUP BY ACCOUNT_ID

    -- BALANCE = CACHED_BALANCE + TX_LOG_SUM
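
The balance formula can be sketched in memory as follows. The class and field names are hypothetical, and the sketch assumes that TX_AMOUNT is signed (payments and withdrawals stored as negative amounts), which the slides do not state.

```java
import java.math.BigDecimal;
import java.time.LocalDateTime;
import java.util.List;

// In-memory sketch of "cached balance + transactions logged after the balance date".
// Names are hypothetical; amounts are assumed to be signed (payments negative).
class BalanceCalculator {
    /** One row of TRANSACTION_LOG, reduced to the two columns the formula needs. */
    static final class Tx {
        final LocalDateTime date;
        final BigDecimal amount;

        Tx(LocalDateTime date, BigDecimal amount) {
            this.date = date;
            this.amount = amount;
        }
    }

    /** BALANCE = CACHED_BALANCE + SUM(TX_AMOUNT) over rows with TX_DATE > balanceDate. */
    static BigDecimal currentBalance(BigDecimal cachedBalance, LocalDateTime balanceDate, List<Tx> log) {
        BigDecimal txLogSum = BigDecimal.ZERO;
        for (Tx tx : log) {
            if (tx.date.isAfter(balanceDate)) {
                txLogSum = txLogSum.add(tx.amount);
            }
        }
        return cachedBalance.add(txLogSum);
    }

    public static void main(String[] args) {
        LocalDateTime balanceDate = LocalDateTime.of(2016, 4, 26, 0, 0);
        List<Tx> log = List.of(
            new Tx(LocalDateTime.of(2016, 4, 25, 10, 0), new BigDecimal("50.00")),   // already in cached balance
            new Tx(LocalDateTime.of(2016, 4, 26, 9, 0), new BigDecimal("-30.00")),   // payment made today
            new Tx(LocalDateTime.of(2016, 4, 26, 12, 0), new BigDecimal("5.00")));   // funds added today
        System.out.println(currentBalance(new BigDecimal("100.00"), balanceDate, log)); // 75.00
    }
}
```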

Better Solution - Advantages

● No balance updates during payment operations - no lock contention on the balance row;
● No locks - better throughput;
● The number of rows in the query with the SUM operation is limited (1 day);
● Constant query execution time.

THANK YOU