Наш ответ uber'у / Александр Коротков (postgres professional)

31
Наш ответ Uber’у Александр Коротков Postgres Professional

Upload: ontico

Post on 21-Jan-2018

127 views

Category:

Engineering


1 download

TRANSCRIPT

Page 1: Наш ответ Uber'у / Александр Коротков (Postgres Professional)

НашответUber’уАлександрКоротковPostgresProfessional

Page 2: Наш ответ Uber'у / Александр Коротков (Postgres Professional)

Russian developers of PostgreSQL:Alexander Korotkov, Teodor Sigaev, Oleg Bartunov

▶ Speakers at PGCon, PGConf: 20+ talks▶ GSoC mentors▶ PostgreSQL commi ers (1+1 in progress)▶ Conference organizers▶ 50+ years of expertship: development, audit, consul ng▶ Postgres Professional co-founders

PostgreSQL CORE▶ Locale support▶ PostgreSQL extendability:

GiST(KNN), GIN, SP-GiST▶ Full Text Search (FTS)▶ NoSQL (hstore, jsonb)▶ Indexed regexp search▶ Create AM & Generic WAL▶ Table engines (WIP)

Extensions▶ intarray▶ pg_trgm▶ ltree▶ hstore

▶ plantuner▶ jsquery▶ RUM▶ imgsmlr

Alexander Korotkov Наш ответ Uber’у 2 / 31

Page 3: Наш ответ Uber'у / Александр Коротков (Postgres Professional)

Disclaimer

▶ I’m NOT a MySQL expert. I didn’t even touch MySQL since 2011...▶ This talk express my own opinion, not PostgreSQL communityposi on, not even Postgres Professional official posi on.

▶ Uber’s guys knows be er which database they should use.

Alexander Korotkov Наш ответ Uber’у 3 / 31

Page 4: Наш ответ Uber'у / Александр Коротков (Postgres Professional)

What happened?

▶ Uber migrated from MySQL to PostgreSQL in 2012.▶ Uber migrated from PostgreSQL to MySQL in 2016.▶ PostgreSQL to MySQL migra on made a log of buzz in PostgreSQLcommunity.

Alexander Korotkov Наш ответ Uber’у 4 / 31

Page 5: Наш ответ Uber'у / Александр Коротков (Postgres Professional)

Why did it happen?

▶ Uber migrated from MySQL to PostgreSQL for “a bunch of reasons,but one of the most important was availability of PostGIS” ¹

▶ Uber migrated from PostgreSQL to MySQL “some of the drawbacksthey found with Postgres” ²

¹https://www.yumpu.com/en/document/view/53683323/migrating-uber-from-mysql-to-postgresql²https://eng.uber.com/mysql-migration/

Alexander Korotkov Наш ответ Uber’у 5 / 31

Page 6: Наш ответ Uber'у / Александр Коротков (Postgres Professional)

Uber’s complaints to PostgreSQL

Uber claims following “PostgreSQL limita ons”:▶ Inefficient architecture for writes▶ Inefficient data replica on▶ Issues with table corrup on▶ Poor replica MVCC support▶ Difficulty upgrading to newer releases

Alexander Korotkov Наш ответ Uber’у 6 / 31

Page 7: Наш ответ Uber'у / Александр Коротков (Postgres Professional)

PostgreSQL’s vs. InnoDB’s storage formats

Alexander Korotkov Наш ответ Uber’у 7 / 31

Page 8: Наш ответ Uber'у / Александр Коротков (Postgres Professional)

PostgreSQL storage format

▶ Both primary and secondary indexes point to loca on (blkno, offset) of tuple (rowversion) in the heap.

▶ When tuple is moved to another loca on, all corresponding index tuples should beinserted to the indexes.

▶ Heap contains both live and dead tuples.▶ VACUUM cleans up dead tuples and corresponding index tuples in a bulk manner.

Alexander Korotkov Наш ответ Uber’у 8 / 31

Page 9: Наш ответ Uber'у / Александр Коротков (Postgres Professional)

Update in PostgreSQL

▶ New tuple is inserted to the heap, previous tuple is marked asdeleted.

▶ Index tuples poin ng to new tuple are inserted to all indexes.

Alexander Korotkov Наш ответ Uber’у 9 / 31

Page 10: Наш ответ Uber'у / Александр Коротков (Postgres Professional)

Heap-Only-Tuple (HOT) PostgreSQL

▶ When no indexed columns are updated and new version of row canfit the same page, then HOT is used and only heap is updated.

▶ Microvacuum can be used to free required space in the page for HOT.

Alexander Korotkov Наш ответ Uber’у 10 / 31

Page 11: Наш ответ Uber'у / Александр Коротков (Postgres Professional)

Update in MySQL

▶ Table rows are placed in the primary index itself. Updates are performed in-place.Old version of rows are placed to special segment (undo log).

▶ When secondary indexed column is updated, then new index tuple is inserted whileprevious index tuple is marked as deleted.

Alexander Korotkov Наш ответ Uber’у 11 / 31

Page 12: Наш ответ Uber'у / Александр Коротков (Postgres Professional)

Updates: InnoDB in comparison with PostgreSQL

Pro:▶ Update of few indexed columns is cheaper.▶ Update, which don’t touch indexed columns, doesn’t depend on pagefree space in the page

Cons:▶ Update of majority of indexed columns is more expensive.▶ Secondary index scan is slower.▶ Primary key update is disaster.

Alexander Korotkov Наш ответ Uber’у 12 / 31

Page 13: Наш ответ Uber'у / Александр Коротков (Postgres Professional)

Uber example for write-amplifica on in PostgreSQL

CREATE TABLE users (id SERIAL PRIMARY KEY,first TEXT,last TEXT,birth_year INTEGER);

CREATE INDEX ix_users_first_last ON users (first, last);CREATE INDEX ix_users_birth_year ON users (birth_year);

UPDATE users SET birth_year = 1986 WHERE id = 1;

1. Write the new row tuple to the tablespace2. Update the primary key index to add a record for the new tuple3. Update the (first, last) index to add a record for the new tuple4. Update the birth_year index to add a record for the new tuple5. Previous ac ons are protected by WAL log.

Alexander Korotkov Наш ответ Uber’у 13 / 31

Page 14: Наш ответ Uber'у / Александр Коротков (Postgres Professional)

Uber example for write-amplifica on: MySQL vs. PostgreSQL

Alexander Korotkov Наш ответ Uber’у 14 / 31

Page 15: Наш ответ Uber'у / Александр Коротков (Postgres Professional)

Uber example for write-amplifica on: MySQL vs. PostgreSQL

PostgreSQL1. Write the new row tuple to the

tablespace2. Insert new tuple to primary key index3. Insert new tuple to (first, last) index4. Insert new tuple to birth_year index5. Previous ac ons are protected by WAL

log.

MySQL1. Update row in-place2. Write old version of row to the rollback

segment3. Insert new tuple to birth_year index4. Mark old tuple of birth_year index as

obsolete5. Previous ac ons are protected by

innodb log6. Write update record to binary log

Assuming we have replica on turned on

Alexander Korotkov Наш ответ Uber’у 15 / 31

Page 16: Наш ответ Uber'у / Александр Коротков (Postgres Professional)

Pending patches: WARM (write-amplifica on reduc onmethod)

▶ Behaves like HOT, but works also when some of index columns areupdated.

▶ New index tuples are inserted only for updated index columns.https://www.postgresql.org/message-id/flat/20170110192442.ocws4pu5wjxcf45b%40alvherre.pgsql

Alexander Korotkov Наш ответ Uber’у 16 / 31

Page 17: Наш ответ Uber'у / Александр Коротков (Postgres Professional)

Pending patches: indirect indexes

▶ Indirect indexes are indexes which points to primary key value instead of pointer toheap.

▶ Indirect index is not updates un l corresponding column is updated.

https://www.postgresql.org/message-id/[email protected] Korotkov Наш ответ Uber’у 17 / 31

Page 18: Наш ответ Uber'у / Александр Коротков (Postgres Professional)

Ideas: RDS (recently dead store)

▶ Recently dead tuples (deleted but visible for some transac ons) aredisplaced into special storage: RDS.

▶ Heap tuple headers are le in the heap.Alexander Korotkov Наш ответ Uber’у 18 / 31

Page 19: Наш ответ Uber'у / Александр Коротков (Postgres Professional)

Idea: undo log

▶ Displace old version of rows to undo log.▶ New index tuples are inserted only for updated index columns. Old index tuples are

marked as expired.▶ Move row to another page if new version doesn’t fit the page.

https://www.postgresql.org/message-id/flat/CA%2BTgmoZS4_CvkaseW8dUcXwJuZmPhdcGBoE_

GNZXWWn6xgKh9A%40mail.gmail.comAlexander Korotkov Наш ответ Uber’у 19 / 31

Page 20: Наш ответ Uber'у / Александр Коротков (Postgres Professional)

Idea: pluggable table engines

Owns▶ Ways to scan and modify tables.▶ Access methods implementa ons.

Shares▶ Transac ons, snapshots.▶ WAL.

https://www.pgcon.org/2016/schedule/events/920.en.htmlAlexander Korotkov Наш ответ Uber’у 20 / 31

Page 21: Наш ответ Uber'у / Александр Коротков (Postgres Professional)

Types of replica on

▶ Statement-level – stream wri ng queries to the slave.▶ Row-level – stream updated rows to the slave.▶ Block-level – stream blocks and/or block deltas to the slave.

Alexander Korotkov Наш ответ Uber’у 21 / 31

Page 22: Наш ответ Uber'у / Александр Коротков (Postgres Professional)

Replica on types in PostgreSQL vs. MySQL

Replica on Type MySQL PostgreSQLStatement-level buil n pgPool-IIRow-level buil n pgLogical

LondisteSlony ...

Block-level N/A buil n

Alexander Korotkov Наш ответ Uber’у 22 / 31

Page 23: Наш ответ Uber'у / Александр Коротков (Postgres Professional)

Uber’s replica on comparison

▶ Uber compares MySQL replica on versus PostgreSQL replica on.▶ Actually, Uber compares MySQL row-level replica on versusPostgreSQL block-level replica on.

▶ That happened because that me PostgreSQL had buil n block-levelreplica on, but didn’t have buil n row-level replica on.Simultaneously, MySQL had buil n row-level replica on, but didn’thave buil n block-level replica on.

Alexander Korotkov Наш ответ Uber’у 23 / 31

Page 24: Наш ответ Uber'у / Александр Коротков (Postgres Professional)

Uber’s complaints to PostgreSQL block-level replica on

▶ Replica on stream transfers all the changes at block-level including“write-amplifica on”. Thus, it requires very high-bandwidth channel.In turn, that makes geo-distributed replica on harder.

▶ There are MVCC limita ons for read-only requires on replica. Apply ofVACUUM changes conflicts with read-only queries which could seethe data VACUUM is going to delete.

Alexander Korotkov Наш ответ Uber’у 24 / 31

Page 25: Наш ответ Uber'у / Александр Коротков (Postgres Professional)

Is row-level replica on superior over block-level replica on?

Alibaba works on adding block-level replica on to InnoDB. Zhai Weixiang,database developer from Alibaba considers following advantages ofblock-level replica on: ³

▶ Be er performance: higher throughput and lower response me▶ Write less data (turn off binary log and g d), and only one fsync to make

transac on durable▶ Less recovery me

▶ Replica on▶ Less replica on latency▶ Ensure data consistency (most important for some sensi ve clients)

³https://www.percona.com/live/data-performance-conference-2016/sessions/physical-replication-based-innodb

Alexander Korotkov Наш ответ Uber’у 25 / 31

Page 26: Наш ответ Uber'у / Александр Коротков (Postgres Professional)

Replica read-only query MVCC conflict with VACUUM

Possible op ons:▶ Delay the replica on,▶ Cancel read-only query on replica,▶ Provide a feedback to master about row versions which could be demanded.

Undo log would do be er, we wouldn’t have to choose...Alexander Korotkov Наш ответ Uber’у 26 / 31

Page 27: Наш ответ Uber'у / Александр Коротков (Postgres Professional)

More about replica on and write-amplifica on

MySQL row-level replica on

PostgreSQL row-level replica on (pgLogical)

Alexander Korotkov Наш ответ Uber’у 27 / 31

Page 28: Наш ответ Uber'у / Александр Коротков (Postgres Professional)

Major version upgrade with pg_upgrade

Alexander Korotkov Наш ответ Uber’у 28 / 31

Page 29: Наш ответ Uber'у / Александр Коротков (Postgres Professional)

Major version upgrade with pgLogical

https://www.depesz.com/2016/11/08/major-version-upgrading-with-minimal-downtime/Alexander Korotkov Наш ответ Uber’у 29 / 31

Page 30: Наш ответ Uber'у / Александр Коротков (Postgres Professional)

Other Uber notes

▶ PostgreSQL 9.2 had data corrup on bug. It was fixed long me ago. Since that mePostgreSQL automated tests system was significantly improved to evade such bugs in future.

▶ pread is faster than seek + read. Thats really gives 1.5% accelera on on read-onlybenchmark. ⁴

▶ PostgreSQL advises to setup rela vely small shared_buffers and rely on OS cache, while“InnoDB storage engine implements its own LRU in something it calls the InnoDB bufferpool”. PostgreSQL also implements its own LRU in something it calls the shared buffers. Andyou can setup any shared buffers size.

▶ PostgreSQL uses mul process model. So, connec on is more expensive since unless youuse pgBouncer or other external connec on pool.

⁴https://www.postgresql.org/message-id/flat/a86bd200-ebbe-d829-e3ca-0c4474b2fcb7%40ohmu.fiAlexander Korotkov Наш ответ Uber’у 30 / 31

Page 31: Наш ответ Uber'у / Александр Коротков (Postgres Professional)

Thank you for a en on!

Alexander Korotkov Наш ответ Uber’у 31 / 31