PostgreSQL and JDBC: striving for high performance

Post on 16-Apr-2017

© 2016 NetCracker Technology Corporation Confidential

PostgreSQL and JDBC: striving for top performance

Vladimir Sitnikov, PgConf 2016

About me

• Vladimir Sitnikov, @VladimirSitnikv
• Performance architect at NetCracker
• 10 years of experience with Java/SQL
• PgJDBC committer

Explain (analyze, buffers) PostgreSQL and JDBC

• Data fetch
• Data upload
• Performance
• Pitfalls

Intro

Fetch of a single row via primary key lookup takes 20ms. Localhost. Database is fully cached

A. Just fine
B. It should be 1sec
C. Kidding? Aim is 1ms!
D. 100us

Lots of small queries are a problem

Suppose a single query takes 10ms, then 100 of them would take a whole second *

* Your Captain

PostgreSQL frontend-backend protocol

• Simple query
  • 'Q' + length + query_text
• Extended query
  • Parse, Bind, Execute commands
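To make the simple-query framing concrete, here is a minimal sketch that builds the bytes of a Query message: the tag 'Q', a 4-byte big-endian length that counts itself plus the payload, and the null-terminated query text. The class and method names are invented for this illustration; a driver such as PgJDBC does this internally.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, for illustration only.
final class SimpleQueryMessage {
    static byte[] encode(String sql) {
        byte[] text = sql.getBytes(StandardCharsets.UTF_8);
        // The length field covers itself (4 bytes), the text, and the trailing zero byte.
        int length = 4 + text.length + 1;
        ByteBuffer buf = ByteBuffer.allocate(1 + length); // ByteBuffer is big-endian by default
        buf.put((byte) 'Q');   // message type tag
        buf.putInt(length);    // message length, excluding the tag
        buf.put(text);         // query_text
        buf.put((byte) 0);     // C-style string terminator
        return buf.array();
    }
}
```

For "SELECT 1" this yields 14 bytes: one tag byte, four length bytes holding 13, eight text bytes and the terminator.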

PostgreSQL frontend-backend protocol

Super extended query
https://github.com/pgjdbc/pgjdbc/pull/478
backend protocol wanted features

PostgreSQL frontend-backend protocol

Simple query
• Works well for one-time queries
• Does not support binary transfer

PostgreSQL frontend-backend protocol

Extended query
• Eliminates planning time
• Supports binary transfer

PreparedStatement

Connection con = ...;
PreparedStatement ps = con.prepareStatement("SELECT...");
...
ps.close();

Smoker’s approach to PostgreSQL

PARSE S_1 as ...; // con.prepareStatement
BIND/EXEC
DEALLOCATE // ps.close()
PARSE S_2 as ...;
BIND/EXEC
DEALLOCATE // ps.close()

Healthy approach to PostgreSQL

PARSE S_1 as ...;
BIND/EXEC
BIND/EXEC
BIND/EXEC
BIND/EXEC
BIND/EXEC
...
DEALLOCATE

Healthy approach to PostgreSQL

PARSE S_1 as ...; // once in a lifetime
BIND/EXEC // REST call
BIND/EXEC
BIND/EXEC // one more REST call
BIND/EXEC
BIND/EXEC
...
DEALLOCATE // “never” is the best
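On the Java side, the "parse once, bind/execute many times" pattern could look roughly like the sketch below. The table and column names are hypothetical, and whether the driver actually issues a single PARSE depends on its server-prepare settings; the code only shows that one PreparedStatement object serves many executions.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;

final class ReuseStatement {
    // Hypothetical table and column names, used only for illustration.
    private static final String SQL = "SELECT name FROM users WHERE id = ?";

    static List<String> namesFor(Connection con, int[] ids) throws SQLException {
        List<String> names = new ArrayList<>();
        // prepareStatement is called once for the whole loop.
        try (PreparedStatement ps = con.prepareStatement(SQL)) {
            for (int id : ids) {
                ps.setInt(1, id); // the bind type stays the same on every execution
                try (ResultSet rs = ps.executeQuery()) {
                    while (rs.next()) {
                        names.add(rs.getString(1));
                    }
                }
            }
        }
        return names;
    }
}
```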

Happiness closes no statements

Conclusion №1: in order to get top performance, you should not close statements
ps = con.prepareStatement(...);
ps.executeQuery();
ps = con.prepareStatement(...);
ps.executeQuery();
...

Unclosed statements in practice

@Benchmark
public Statement leakStatement() {
  return con.createStatement();
}

pgjdbc < 9.4.1202, -Xmx128m, OracleJDK 1.8u40
# Warmup Iteration 1: 1147,070 ns/op
# Warmup Iteration 2: 12101,537 ns/op
# Warmup Iteration 3: 90825,971 ns/op
# Warmup Iteration 4: <failure>
java.lang.OutOfMemoryError: GC overhead limit exceeded

Unclosed statements in practice

@Benchmark
public Statement leakStatement() {
  return con.createStatement();
}

pgjdbc >= 9.4.1202, -Xmx128m, OracleJDK 1.8u40
# Warmup Iteration 1: 30 ns/op
# Warmup Iteration 2: 27 ns/op
# Warmup Iteration 3: 30 ns/op
...

Statements in practice

• In practice, applications always close their statements
• PostgreSQL has no shared query cache
• Nobody wants to spend excessive time on planning

Server-prepared statements

What can we do about it?
• Wrap all the queries in PL/PgSQL
  • It helps, however we had 100500 SQL queries
• Teach JDBC to cache queries

Query cache in PgJDBC

• Query cache was implemented in 9.4.1202 (2015-08-27), see https://github.com/pgjdbc/pgjdbc/pull/319
• Is transparent to the application
• We did not bother considering PL/PgSQL again
• Server-prepare is activated after 5 executions (prepareThreshold)
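The idea of a driver-side query cache with a prepare threshold can be modelled in a few lines. The sketch below is a toy model, not PgJDBC's implementation: the class name, the LRU eviction policy, and the exact point at which the threshold trips are assumptions made for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Toy model of "cache by SQL text, server-prepare after N executions".
final class ToyStatementCache {
    private final int prepareThreshold;
    private final Map<String, Integer> executions;

    ToyStatementCache(final int maxQueries, int prepareThreshold) {
        this.prepareThreshold = prepareThreshold;
        // accessOrder = true makes the map iterate from least to most recently used.
        this.executions = new LinkedHashMap<String, Integer>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, Integer> eldest) {
                return size() > maxQueries; // evict the least recently used query
            }
        };
    }

    /** Records one execution; returns true once the query has reached the threshold. */
    boolean execute(String sql) {
        int count = executions.merge(sql, 1, Integer::sum);
        return count >= prepareThreshold;
    }
}
```

With a threshold of 5, the first four calls for a given SQL text return false and the fifth returns true; an evicted query starts counting from scratch.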

Where are the numbers?

• Of course, planning time depends on the query complexity
• We observed 20 ms+ planning time for OLTP queries: 10KiB query, 170-line explain
• Result is ~0 ms

Overheads

Generated queries are bad

• If a query is generated
  • It results in a brand new java.lang.String object
  • Thus you have to recompute its hashCode
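A small sketch of the difference, assuming a hypothetical `orders` table: a constant query is one shared String whose hash is computed once and then cached by the String itself, while a generated query is a fresh object on every call, so any cache keyed by SQL text has to hash and compare it again.

```java
final class QueryText {
    // Constant query text: the same String instance is reused on every call.
    static final String BY_ID = "SELECT * FROM orders WHERE id = ?";

    // Generated query text: every call builds a brand new String object,
    // equal in content to BY_ID but not the same instance.
    static String generatedById() {
        return new StringBuilder("SELECT * FROM ")
                .append("orders")
                .append(" WHERE id = ?")
                .toString();
    }
}
```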

Parameter types

If the type of bind value changes, you have to recreate the server-prepared statement
ps.setInt(1, 42);
...
ps.setNull(1, Types.VARCHAR);

It leads to DEALLOCATE PREPARE

Keep data type the same

Conclusion №1
• Even NULL values should be properly typed
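One way to keep the bind type stable is to route nullable values through a small helper, sketched below. The helper name is hypothetical; `setNull`, `setInt` and `Types.INTEGER` are standard JDBC.

```java
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.sql.Types;

final class TypedBinds {
    // Binds an optional integer so that NULL and non-NULL values share one parameter type.
    static void setNullableInt(PreparedStatement ps, int index, Integer value) throws SQLException {
        if (value == null) {
            ps.setNull(index, Types.INTEGER); // typed NULL, same type as setInt below
        } else {
            ps.setInt(index, value);
        }
    }
}
```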

Unexpected degradation

If using prepared statements, the response time gets 5,000 times slower. How’s that possible?

A. Bug
B. Feature
C. Feature
D. Bug

Unexpected degradation

https://gist.github.com/vlsi -> 01_plan_flipper.sql

select *
  from plan_flipper -- <- table
 where skewed = 0 -- 1M rows
   and non_skewed = 42 -- 20 rows

Unexpected degradation

https://gist.github.com/vlsi -> 01_plan_flipper.sql
0.1ms 1st execution
0.05ms 2nd execution
0.05ms 3rd execution
0.05ms 4th execution
0.05ms 5th execution
250 ms 6th execution

Unexpected degradation

• Who is to blame?
  • PostgreSQL may switch to a generic plan after 5 executions of a server-prepared statement
• What can we do about it?
  • Add +0, OFFSET 0, and so on
  • Pay attention to plan validation
  • Discuss the phenomenon on pgsql-hackers

Unexpected degradation

https://gist.github.com/vlsi -> 01_plan_flipper.sql
We just use +0 to forbid index usage on a bad column
select *
  from plan_flipper
 where skewed+0 = 0 -- ~ /*+no_index*/
   and non_skewed = 42

Explain explain explain explain

The rule of 6 explains:
prepare x(number) as select ...;
explain analyze execute x(42); -- 1ms
explain analyze execute x(42); -- 1ms
explain analyze execute x(42); -- 1ms
explain analyze execute x(42); -- 1ms
explain analyze execute x(42); -- 1ms
explain analyze execute x(42); -- 10 sec

Bugs everywhere

Decision problem

There’s a schema A with table X, and a schema B with table X. What is the result of select * from X?

A.X
B.X
C. Error
D. All of the above

Search_path

There’s a schema A with table X, and a schema B with table X. What is the result of select * from X?
• search_path determines the schema used
• server-prepared statements are not prepared for search_path changes, so crazy things might happen

Search_path can go wrong

• 9.1 will just use old OIDs and execute the “previous” query
• 9.2-9.5 might fail with a “cached plan must not change result type” error

To fetch or not to fetch

You are to fetch 1M rows 1KiB each, -Xmx128m
while (resultSet.next())
  resultSet.getString(1);

A. No problem
B. OutOfMemory
C. Must use LIMIT/OFFSET
D. autoCommit(false)

To fetch or not to fetch

• PgJDBC fetches all rows by default
• To fetch in batches, you need Statement.setFetchSize and connection.setAutoCommit(false)
• Default value is configurable via defaultRowFetchSize (9.4.1202+)
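The two requirements for batched fetching can be put together as in the sketch below. The class and method names are hypothetical, and the caller is assumed to own the transaction (commit or rollback afterwards); only standard JDBC calls are used.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

final class CursorFetch {
    // Reads the first column of every row without holding the whole result in memory.
    static long countRows(Connection con, String sql, int fetchSize) throws SQLException {
        con.setAutoCommit(false); // batched fetching needs an open transaction
        try (PreparedStatement ps = con.prepareStatement(sql)) {
            ps.setFetchSize(fetchSize); // rows requested from the server per round trip
            long rows = 0;
            try (ResultSet rs = ps.executeQuery()) {
                while (rs.next()) {
                    rs.getString(1);
                    rows++;
                }
            }
            return rows;
        }
    }
}
```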

fetchSize vs fetch time

[Chart: fetch time in ms (lower is faster) vs. fetchSize, 2000 rows of select int4, int4, int4, int4. fetchSize 10: 6.48 ms; 50: 2.28 ms; 100: 1.76 ms; 1000: 1.04 ms; 2000: 0.97 ms]

FetchSize is good for stability

Conclusion №2:
• For stability & performance reasons set defaultRowFetchSize >= 100
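A minimal sketch of setting that default through connection properties. The helper name is made up; `user`, `password` and `defaultRowFetchSize` are PgJDBC connection properties, and the value 100 is simply the lower bound suggested above.

```java
import java.util.Properties;

final class FetchSizeConfig {
    // Builds connection properties with a driver-wide default fetch size.
    static Properties connectionProperties(String user, String password) {
        Properties props = new Properties();
        props.setProperty("user", user);
        props.setProperty("password", password);
        props.setProperty("defaultRowFetchSize", "100");
        return props;
    }
}
```

The result would be passed to `DriverManager.getConnection(url, props)`; fetching in batches still needs autoCommit turned off.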

Data upload

For data uploads, use
• INSERT() VALUES()
• INSERT() SELECT ?, ?, ?
• INSERT() VALUES() executeBatch
• INSERT() VALUES(), (), () executeBatch
• COPY

Healthy batch INSERT

PARSE S_1 as ...;
BIND/EXEC
BIND/EXEC
BIND/EXEC
BIND/EXEC
BIND/EXEC
...
DEALLOCATE
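From the application's point of view, a batch INSERT of this shape is usually written with addBatch and executeBatch. The sketch below assumes each row is an {Integer, String, Integer} triple for the int4, varchar, int4 columns of batch_perf_test; the class name is hypothetical.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.List;

final class BatchInsert {
    static final String SQL = "insert into batch_perf_test(a, b, c) values(?, ?, ?)";

    // Each row is assumed to be {Integer a, String b, Integer c}.
    static int[] insertAll(Connection con, List<Object[]> rows) throws SQLException {
        try (PreparedStatement ps = con.prepareStatement(SQL)) {
            for (Object[] row : rows) {
                ps.setInt(1, (Integer) row[0]);
                ps.setString(2, (String) row[1]);
                ps.setInt(3, (Integer) row[2]);
                ps.addBatch(); // queue the row instead of executing it immediately
            }
            return ps.executeBatch(); // send the queued rows to the server
        }
    }
}
```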

TCP strikes back

JDBC is busy with sending queries, thus it has not started fetching responses yet

DB cannot fetch more queries since it is busy with sending responses

Batch INSERT in real life

PARSE S_1 as ...;
BIND/EXEC
BIND/EXEC
SYNC // flush & wait for the response
BIND/EXEC
BIND/EXEC
SYNC // flush & wait for the response
...

TCP deadlock avoidance

• PgJDBC adds SYNC to your nice batch operations
• The more SYNCs, the slower it performs

Horror stories

A single line patch makes insert batch 10 times faster:

https://github.com/pgjdbc/pgjdbc/pull/380

- static int QUERY_FORCE_DESCRIBE_PORTAL = 128;
+ static int QUERY_FORCE_DESCRIBE_PORTAL = 512;
...
// 128 has already been used
  static int QUERY_DISALLOW_BATCHING = 128;

Trust but always measure

• Java 1.8u40+
• Core i7 2.6 GHz
• Java Microbenchmark Harness (JMH)
• PostgreSQL 9.5

Queries under test: INSERT

pgjdbc/ubenchmark/InsertBatch.java

insert into batch_perf_test(a, b, c) values(?, ?, ?)

Queries under test: INSERT

pgjdbc/ubenchmark/InsertBatch.java

insert into batch_perf_test(a, b, c) values (?, ?, ?), (?, ?, ?), (?, ?, ?), (?, ?, ?), (?, ?, ?), (?, ?, ?), (?, ?, ?), (?, ?, ?), (?, ?, ?), ...;
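A multi-row statement of this shape is normally generated for a given number of rows. The helper below is a hypothetical sketch of that generation; since generated strings are new objects each time, an application would probably want to build the text once per row count and reuse it.

```java
final class MultiValuesSql {
    // Builds "insert ... values (?, ?, ?), (?, ?, ?), ..." for the requested number of rows.
    static String insertSql(int rows) {
        if (rows < 1) {
            throw new IllegalArgumentException("rows must be >= 1");
        }
        StringBuilder sql = new StringBuilder("insert into batch_perf_test(a, b, c) values ");
        for (int i = 0; i < rows; i++) {
            if (i > 0) {
                sql.append(", ");
            }
            sql.append("(?, ?, ?)");
        }
        return sql.toString();
    }
}
```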

Queries under test: COPY

pgjdbc/ubenchmark/InsertBatch.java

COPY batch_perf_test FROM STDIN
1 s1 1
2 s2 2
3 s3 3
...
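The data that follows COPY ... FROM STDIN is plain text: one line per row, columns separated by tabs, with backslash escapes and \N for NULL. The sketch below only builds that payload; the class name is hypothetical, and handing the string to the driver (PgJDBC exposes COPY through org.postgresql.copy.CopyManager) is left out because it needs a live connection.

```java
import java.util.List;

final class CopyPayload {
    // Escapes one value for COPY text format; null becomes \N.
    static String escape(Object value) {
        if (value == null) {
            return "\\N";
        }
        String s = value.toString();
        StringBuilder out = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); i++) {
            char ch = s.charAt(i);
            switch (ch) {
                case '\\': out.append("\\\\"); break;
                case '\t': out.append("\\t"); break;
                case '\n': out.append("\\n"); break;
                case '\r': out.append("\\r"); break;
                default: out.append(ch);
            }
        }
        return out.toString();
    }

    // Joins columns with tabs and terminates every row with a newline.
    static String toCopyText(List<Object[]> rows) {
        StringBuilder sb = new StringBuilder();
        for (Object[] row : rows) {
            for (int i = 0; i < row.length; i++) {
                if (i > 0) {
                    sb.append('\t');
                }
                sb.append(escape(row[i]));
            }
            sb.append('\n');
        }
        return sb.toString();
    }
}
```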

Queries under test: hand-made structs

pgjdbc/ubenchmark/InsertBatch.java

insert into batch_perf_test
select * from unnest('{"(1,s1,1)","(2,s2,2)", "(3,s3,3)"}'::batch_perf_test[])

You’d better use batch, your C.O.

[Chart: insert time in ms (lower is faster) vs. the number of inserted rows (16, 128, 1024) for Insert, Batch, Struct and Copy; columns int4, varchar, int4. Two bar labels survive: 216 and 128]

COPY is good

[Chart: insert time in ms (lower is faster) vs. the number of inserted rows (16, 128, 1024) for Batch, Struct and Copy; columns int4, varchar, int4; y-axis from 0 to 2.5 ms]

COPY is bad for small batches

[Chart: time in ms (lower is faster) for an insert of 1024 rows vs. batch size in rows (1, 4, 8, 16, 128) for Batch, Struct and Copy; y-axis from 0 to 25 ms]

Final thoughts

• PreparedStatement is our hero
• Remember to EXPLAIN ANALYZE at least six times, a blue moon is a plus
• Don’t forget +0 and OFFSET 0

Questions?

Vladimir Sitnikov, PgConf 2016
