Reaching 1 billion rows / second
Hans-Jürgen Schönig
www.postgresql-support.de




Reaching a milestone


Traditional PostgreSQL limitations

- Traditionally:
  - We could only use 1 CPU core per query
  - Scaling was possible by running more than one query at a time
  - Usually hard to do


PL/Proxy: The traditional way to do it

- PL/Proxy is a stored procedure language to scale out to shards
- Worked nicely for OLTP workloads
- Somewhat usable for analytics
- A LOT of manual work


PL/Proxy: The future

- Still ok for OLTP
- Certainly not the way to scale out in the future
  - Too much manual work
  - Not transparent
  - Not cool enough


On the app level

- Doing scaling on the app level:
  - A lot of manual work
  - Not cool enough
  - Needs a lot of development
  - Why use a database if work is still manual?
- Solving things on the app level is certainly not an option


The goal


Our goal on the PostgreSQL level

- Import massive amounts of data
- Run typical aggregates
- Process 1 billion rows in less than a second
- Scale out to as many nodes as needed


Coming up with a data structure

- We tried to keep that simple:

node=# \d t_demo
              Table "public.t_demo"
 Column |  Type   | Collation | Nullable | Default
--------+---------+-----------+----------+---------
 id     | serial  |           | not null |
 grp    | integer |           |          |
 data   | real    |           |          |
Indexes:
    "idx_id" btree (id)
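A table like this could be created and filled with synthetic data roughly as follows. The slides do not show the load step, so the data-generation method (10 groups, random payload via generate_series) is an illustrative assumption:

```sql
-- Sketch of the demo table; the INSERT is an assumed way to generate test data.
CREATE TABLE t_demo (
    id   serial,
    grp  integer,
    data real
);

-- Generate 100 million synthetic rows: 10 groups, random payload.
INSERT INTO t_demo (grp, data)
SELECT x % 10, random()
FROM generate_series(1, 100000000) AS x;

CREATE INDEX idx_id ON t_demo (id);
```

Building the index after the bulk load, rather than before, keeps the import fast.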


The query

SELECT grp, count(data)
FROM t_demo
GROUP BY 1;


Single server performance


Tweaking a simple server

- The main questions are:
  - How much can we expect from a single server?
  - How well does it scale with many CPUs?
  - How far can we get?


PostgreSQL parallelism

- Parallel queries have been added in PostgreSQL 9.6
  - It can do a lot
  - It is by far not feature complete yet
- The number of workers will be determined by the PostgreSQL optimizer
  - We do not want that
  - We want ALL cores to be at work


Adjusting CPU core usage

- Usually the number of processes per scan is derived from the size of the table

test=# SHOW min_parallel_relation_size;
 min_parallel_relation_size
----------------------------
 8MB
(1 row)

- One worker is added each time the table size triples


Overruling the planner

- We could never have enough data to make PostgreSQL go for 16 or 32 cores
  - Even if the value is set to a couple of kilobytes
- The default mechanism can be overruled:

test=# ALTER TABLE t_demo
           SET (parallel_workers = 32);
ALTER TABLE
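The per-table setting alone is not enough: the planner is still capped by the instance-wide worker limits, so those have to be raised as well. A hedged sketch (the concrete values are illustrative assumptions, not from the talk):

```sql
-- Allow a single Gather node to use up to 32 workers in this session.
SET max_parallel_workers_per_gather = 32;

-- In postgresql.conf the global pools must be large enough too, e.g.:
--   max_worker_processes = 64
--   max_parallel_workers = 64   -- this parameter exists from PostgreSQL 10 on
```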


Making full use of cores

- How well does PostgreSQL scale on a single box?
- For the next test we assume that I/O is not an issue
  - If I/O does not keep up, CPU does not make a difference
  - Make sure that data can be read fast enough
- Observation: 1 SSD might not be enough to feed a modern Intel chip


Single node scalability (1)


Single node scalability (2)

- We used a 16 core box here
- As you can see, the query scales up nicely
- Beyond 16 cores hyperthreading kicks in
  - We managed to gain around 18%


Single node scalability (3)

- On a single Google VM we could reach close to 40 million rows / second
- For many workloads this is already more than enough
- Rows / sec will of course depend on the type of query


Moving on to many nodes


The basic system architecture (1)

- We want to shard data to as many nodes as needed
- For the demo: place 100 million rows on each node
  - We do so to eliminate the I/O bottleneck
  - In case I/O happens we can always compensate using more servers
- Use parallel queries on each shard
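One way such a setup could be wired together is postgres_fdw with foreign tables inheriting from a local parent on the coordinator. The server names, hosts, and credentials below are illustrative assumptions:

```sql
-- On the coordinator: one foreign server per shard.
CREATE EXTENSION IF NOT EXISTS postgres_fdw;

CREATE SERVER shard1 FOREIGN DATA WRAPPER postgres_fdw
    OPTIONS (host 'shard1.example.com', dbname 'node');
CREATE USER MAPPING FOR CURRENT_USER SERVER shard1
    OPTIONS (user 'postgres');

-- Empty parent table; each shard is attached as an inherited foreign table.
CREATE TABLE t_demo (id serial, grp integer, data real);

CREATE FOREIGN TABLE t_demo_shard1 ()
    INHERITS (t_demo)
    SERVER shard1
    OPTIONS (table_name 't_demo');
-- ... repeat for shard2, shard3, and so on.
```

A query against the parent then appends scans of all shards, which is the plan shape shown on the next slide.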


Testing with two nodes (1)

explain SELECT grp, COUNT(data) FROM t_demo GROUP BY 1;

Finalize HashAggregate
  Group Key: t_demo.grp
  ->  Append
        ->  Foreign Scan (partial aggregate)
        ->  Foreign Scan (partial aggregate)
        ->  Partial HashAggregate
              Group Key: t_demo.grp
              ->  Seq Scan on t_demo


Testing with two nodes (2)

- Throughput doubles as long as partial results are small
- The planner pushes down stuff nicely
- Linear increases are necessary to scale to 1 billion rows


Preconditions to make it work (1)

- postgres_fdw uses cursors on the remote side
  - cursor_tuple_fraction has to be set to 1 to improve the planning process
  - Set fetch_size to a large value
- That is the easy part
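These two knobs could be set roughly as follows; the fetch_size value and the server name `shard1` are illustrative assumptions:

```sql
-- On each shard: plan cursor queries for full retrieval,
-- not just for returning the first rows quickly.
ALTER SYSTEM SET cursor_tuple_fraction = 1.0;
SELECT pg_reload_conf();

-- On the coordinator: fetch large batches per round trip
-- instead of the postgres_fdw default of 100 rows.
ALTER SERVER shard1 OPTIONS (ADD fetch_size '50000');
```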


Preconditions to make it work (2)

- We have to make sure that all remote nodes work at the same time
- This requires "parallel append and async fetching"
  - All queries are sent to the many nodes in parallel
  - Data can be fetched in parallel


Preconditions to make it work (3)

- This would not have been possible without substantial work done in PostgreSQL recently
- Traditionally joins had to be done BEFORE aggregation
  - This is a showstopper for distributed aggregation because all the data has to be fetched from the remote host before aggregation
- Kyotaro Horiguchi fixed this, which made our work possible
  - This was a HARD task!


Preconditions to make it work (4)

- Easy tasks:
  - Aggregates have to be implemented to handle partial results coming from shards
  - Code is simple and available as an extension


Parallel execution on shards is now possible

- Dissect the aggregation
- Send partial queries to the shards in parallel
- Perform parallel execution on the shards
- Add up the data on the main node
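For count() the dissection is straightforward: each shard computes a partial count per group, and the main node sums the partials. Conceptually it looks like this (a hedged sketch of the idea, not the actual rewritten plan; `shard_results` is a hypothetical name for the union of all shard outputs):

```sql
-- What each shard executes (the partial aggregate):
--   SELECT grp, count(data) AS partial_cnt FROM t_demo GROUP BY grp;

-- What the main node conceptually does with the shard results:
SELECT grp, sum(partial_cnt) AS count
FROM shard_results            -- hypothetical: union of all partial results
GROUP BY grp;
```

The same decomposition works for sum() and avg() (the latter by shipping sum and count separately); only the combine step on the main node differs per aggregate.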


Final results

node=# SELECT grp, count(data) FROM t_demo GROUP BY 1;
 grp |   count
-----+-----------
   0 | 320000000
   1 | 320000000
 ...
   9 | 320000000
(10 rows)

Planning time: 0.955 ms
Execution time: 2910.367 ms


Hardware used

- We used 32 boxes (16 cores each) on Google
- Data was in memory
- Adding more servers is EASY


Future ideas

- JIT compilation will speed up execution
- More parallelism for more executor nodes
- General speedups (tuple deforming, etc.)
- In the future FEWER cores will be needed to achieve similar results


Contact us

Cybertec Schönig & Schönig GmbH
Hans-Jürgen Schönig
Gröhrmühlgasse 26
A-2700 Wiener Neustadt

www.postgresql-support.de

Follow us on Twitter: @PostgresSupport
