mv unmasked.w.code.march.2013

91
1 © 2013 EnterpriseDB. All rights reserved. MVCC Unmasked March 2013 This presentation explains how Multiversion Concurrency Control (MVCC) is implemented in Postgres, and highlights optimizations which minimize the downsides of MVCC. http://momjian.us/presentations

Upload: enterprisedb

Post on 14-Jan-2015

557 views

Category:

Documents


0 download

DESCRIPTION

 

TRANSCRIPT

Page 1: Mv unmasked.w.code.march.2013

1 © 2013 EnterpriseDB. All rights reserved.

MVCC Unmasked!

March 2013

This presentation explains how Multiversion Concurrency Control (MVCC) is implemented in Postgres, and highlights optimizations which minimize the downsides of MVCC.

http://momjian.us/presentations

Page 2: Mv unmasked.w.code.march.2013

Why Unmask MVCC?!u  Predict concurrent query behavior u  Manage MVCC performance effects u  Understand storage space reuse

2 © 2013 EnterpriseDB. All rights reserved.

Page 3: Mv unmasked.w.code.march.2013

u  Overview of EnterpriseDB – who are we and how are we involved in the PostgreSQL community

u  Introduction to MVCC u  (MVCC Implementation) u  MVCC Clean-up Requirements & Behavior u  Available PostgreSQL Resources

3 © 2013 EnterpriseDB. All rights reserved.

Outline!

Page 4: Mv unmasked.w.code.march.2013

Who is EnterpriseDB?!

4

u  Largest commercial entity behind PostgreSQL!

u  Enables open-source cost benefits along with the products, resources and support of a full commercial database company!

u  Enhanced Postgres Plus products for enterprise-class applications:!•  Security!•  Performance!•  Oracle compatibility!•  Monitoring, management & replication tools!

u  Global provider for all your PostgreSQL needs!!

© 2013 EnterpriseDB. All rights reserved.

Page 5: Mv unmasked.w.code.march.2013

What is MVCC?! Multiversion Concurrency Control (MVCC) allows Postgres to offer high concurrency even during significant database read/write activity. MVCC specifically offers behavior where “readers never block writers, and writers never block readers.” This presentation explains how MVCC is implemented in Postgres and highlights optimizations which minimize the downsides of MVCC.

5 © 2013 EnterpriseDB. All rights reserved.

Page 6: Mv unmasked.w.code.march.2013

Which other databases support MVCC?!u  Oracle u  DBs (partial) u  MySQL with InnoDB u  Informix u  Firebird u  MSSQL (optional, disabled by default)

6 © 2013 EnterpriseDB. All rights reserved.

Page 7: Mv unmasked.w.code.march.2013

MVCC Behavior!

7 © 2013 EnterpriseDB. All rights reserved.

Cre 40 Exp

Cre 40 Exp 47

Cre 64 Exp 78

Cre 78 Exp

INSERT

DELETE

UPDATE

old (delete)

new (insert)

Page 8: Mv unmasked.w.code.march.2013

MVCC Snapshots!MVCC snapshots which tuples are visible for SQL statements. A snapshot is recorded at the start of each SQL statement in READ COMMITTED transaction isolation mode, and at transaction start in SERIALIZABLE transaction isolation mode. In fact, it is frequency of taking new snapshots that controls the transactions isolation behavior. When a new snapshot is taken, the following information is gathered:

•  The highest-numbered committed transaction !•  The transaction numbers currently executing!

Using this snapshot information, Postgres can determine if a transaction’s actions should be visible to an executing statement.

8 © 2013 EnterpriseDB. All rights reserved.

Page 9: Mv unmasked.w.code.march.2013

MVCC Snapshots Determine Row Visibility!

9 © 2013 EnterpriseDB. All rights reserved.

Visible Invisible Invisible

Cre 30 Exp Cre 50 Exp Cre 110 Exp

Create-Only

Cre 30 Exp 80

Cre 30 Exp 75

Cre 30 Exp 110

Invisible Visible Visible

Sequential Scan

Snapshot The highest-numbered committed transaction: 100 Open Transactions: 25, 50, 75 For simplicity, assume all other transactions are committed.

Internally, the creation xid is stored in the system column ‘xmin’, and expire in ‘xmax’.

Page 10: Mv unmasked.w.code.march.2013

Confused Yet?!Source code comment in src/backend/utils/time/tqal.c:

10 © 2013 EnterpriseDB. All rights reserved.

Mao [Mike Olson] says 17 march 1993: the tests in this routine are correct; if you think they’re not, you’re wrong, and you should think about it again. i know, it happened to me.

Page 11: Mv unmasked.w.code.march.2013

Implementation Details! All queries were generated on an unmodified version of Postgres. The contrib module pageinspect was installed to show internal heap page information and pg_freespacemap was installed to show free space map information.

11 © 2013 EnterpriseDB. All rights reserved.

Page 12: Mv unmasked.w.code.march.2013

Setup!

12 © 2013 EnterpriseDB. All rights reserved.

Page 13: Mv unmasked.w.code.march.2013

Setup!

13 © 2013 EnterpriseDB. All rights reserved.

Page 14: Mv unmasked.w.code.march.2013

Insert Using Xmin!

14 © 2013 EnterpriseDB. All rights reserved.

All the queries used in this presentation are available at http://momjian.us/main/writings/pgsql/mvcc.sql!

Page 15: Mv unmasked.w.code.march.2013

Just Like INSERT!

15 © 2013 EnterpriseDB. All rights reserved.

Cre 40 Exp

Cre 40 Exp 47

Cre 64 Exp 78

Cre 78 Exp

INSERT

DELETE

UPDATE

old (delete)

new (insert)

Page 16: Mv unmasked.w.code.march.2013

DELETE USING Xmax!

16 © 2013 EnterpriseDB. All rights reserved.

Page 17: Mv unmasked.w.code.march.2013

DELETE Using Xmax!

17 © 2013 EnterpriseDB. All rights reserved.

Page 18: Mv unmasked.w.code.march.2013

Just Like DELETE!

18 © 2013 EnterpriseDB. All rights reserved.

Cre 40 Exp

Cre 40 Exp 47

Cre 64 Exp 78

Cre 78 Exp

INSERT

DELETE

UPDATE

old (delete)

new (insert)

Page 19: Mv unmasked.w.code.march.2013

UPDATE Using Xmin and Xmax!

19 © 2013 EnterpriseDB. All rights reserved.

Page 20: Mv unmasked.w.code.march.2013

UPDATE Using Xmin and Xmax!

20 © 2013 EnterpriseDB. All rights reserved.

Page 21: Mv unmasked.w.code.march.2013

Just Like UPDATE!

21 © 2013 EnterpriseDB. All rights reserved.

Cre 40 Exp

Cre 40 Exp 47

Cre 64 Exp 78

Cre 78 Exp

INSERT

DELETE

UPDATE

old (delete)

new (insert)

Page 22: Mv unmasked.w.code.march.2013

Aborted Transaction IDs Remain!

22 © 2013 EnterpriseDB. All rights reserved.

Page 23: Mv unmasked.w.code.march.2013

Aborted IDs can remain because Transaction Status is recorded centrally!

Pg_clog XID Status flags

23 © 2013 EnterpriseDB. All rights reserved.

0 0 0 1 0 0 1 0

1 0 1 0 0 0 0 0

1 0 1 0 0 1 0 0

0 0 0 0 0 0 1 1

0 0 0 1 0 1 1 0

1 0 1 0 0 0 1 0

1 0 1 0 0 0 0 0

1 0 0 1 0 0 1 0

028

020

024

012

016

008

000

004

Transaction roll back marks the transaction ID as aborted. All sessions will ignore such transactions; it is not necessary to revisit each row to undo the transaction

Page 24: Mv unmasked.w.code.march.2013

Row Locks Using Xmax!

24 © 2013 EnterpriseDB. All rights reserved.

Page 25: Mv unmasked.w.code.march.2013

Row Locks Using Xmax!

25 © 2013 EnterpriseDB. All rights reserved.

Page 26: Mv unmasked.w.code.march.2013

Multi-Statement Transactions! Multi-statement transactions require extra tracking because each statement has its own visibility rules. For example, a cursor’s contents must remain unchanged even if later statements in the same transaction modify rows. Such tracking is implemented using system command id columns cmin/cmax, which is internally actually is a single column.

26 © 2013 EnterpriseDB. All rights reserved.

Page 27: Mv unmasked.w.code.march.2013

INSERT Using Cmin!

27 © 2013 EnterpriseDB. All rights reserved.

Page 28: Mv unmasked.w.code.march.2013

INSERT Using Cmin!

28 © 2013 EnterpriseDB. All rights reserved.

Page 29: Mv unmasked.w.code.march.2013

DELETE Using Cmin!

29 © 2013 EnterpriseDB. All rights reserved.

Page 30: Mv unmasked.w.code.march.2013

DELETE Using Cmin!

30 © 2013 EnterpriseDB. All rights reserved.

Page 31: Mv unmasked.w.code.march.2013

DELETE Using Cmin! A cursor had to be used because the rows were created and deleted in this transaction and therefore never visible outside this transaction.

31 © 2013 EnterpriseDB. All rights reserved.

Page 32: Mv unmasked.w.code.march.2013

Update Using Cmin!

32 © 2013 EnterpriseDB. All rights reserved.

Page 33: Mv unmasked.w.code.march.2013

Update Using Cmin!

33 © 2013 EnterpriseDB. All rights reserved.

Page 34: Mv unmasked.w.code.march.2013

Modifying Rows from Different Transactions!

34 © 2013 EnterpriseDB. All rights reserved.

Page 35: Mv unmasked.w.code.march.2013

Modifying Rows from Different Transactions!

35 © 2013 EnterpriseDB. All rights reserved.

Page 36: Mv unmasked.w.code.march.2013

Modifying Rows from Different Transactions!

36 © 2013 EnterpriseDB. All rights reserved.

Page 37: Mv unmasked.w.code.march.2013

Combo Command Id!

Because cmin and cmax are internally a single system column, it is impossible to simply record the status of a row that is created and expired in the same multi-statement transaction. For that reason, a special combo command id is created that references a local memory hash that contains the actual cmin and cmax values.

37 © 2013 EnterpriseDB. All rights reserved.

Page 38: Mv unmasked.w.code.march.2013

Update Using Combo Command Ids!

38 © 2013 EnterpriseDB. All rights reserved.

Page 39: Mv unmasked.w.code.march.2013

UPDATE Using Combo Command Ids!

39 © 2013 EnterpriseDB. All rights reserved.

Page 40: Mv unmasked.w.code.march.2013

Update Using Combo Command Ids!

40 © 2013 EnterpriseDB. All rights reserved.

Page 41: Mv unmasked.w.code.march.2013

Update Using Combo Command Ids! The last query uses /contrib/pageinspect, which allows visibility of internal heap page structures and all stored rows, including those not visible in the current snapshot. (Bit 0x0020 is internally called HEAP_COMBOCID.)

41 © 2013 EnterpriseDB. All rights reserved.

Page 42: Mv unmasked.w.code.march.2013

MVCC Implementation Summary! xmin: creation transaction number, set by INSERT and

UPDATE xmax: expire transaction number, set by UPDATE and

DELETE; also used for explicit row locks cmin/cmax: used to identify the command number that created or

expired the tuple; also used to store combo command ids when the tuple is created and expired in the same transaction, and for explicit row locks

42 © 2013 EnterpriseDB. All rights reserved.

Page 43: Mv unmasked.w.code.march.2013

Traditional Cleanup Requirements! Traditional single-row-version (non-MVCC) database systems require storage space cleanup: u  deleted rows u  rows created by aborted transactions

43 © 2013 EnterpriseDB. All rights reserved.

Page 44: Mv unmasked.w.code.march.2013

MVCC Cleanup Requirements! MVCC has additional cleanup requirements: u  The creation of a new row during UPDATE (rather than

replacing the existing row); the storage space taken by the old row must eventually be recycled.

u  The delayed cleanup of deleted rows (cleanup cannot occur until there are no transactions for which the row is visible)

Postgres handles both traditional and MVCC-specific cleanup requirements.

44 © 2013 EnterpriseDB. All rights reserved.

Page 45: Mv unmasked.w.code.march.2013

Cleanup Behavior! Fortunately, Postgres cleanup happens automatically: u  On-demand cleanup of a single heap page during row access,

specifically when a page is accessed by SELECT, UPDATE and DELETE

u  In bulk by an autovacuum processes that runs in the background

Cleanup can also be initiated manually by VACUUM.

45 © 2013 EnterpriseDB. All rights reserved.

Page 46: Mv unmasked.w.code.march.2013

Aspects of Cleanup! Cleanup involves recycling space taken by several entities: u  Heap tuples/rows (the largest) u  Heap item pointers (the smallest) u  Index entries

46 © 2013 EnterpriseDB. All rights reserved.

Page 47: Mv unmasked.w.code.march.2013

Internal Heap Page!

47 © 2013 EnterpriseDB. All rights reserved.

Page 48: Mv unmasked.w.code.march.2013

Indexes Points to Items, Not Tuples!

48 © 2013 EnterpriseDB. All rights reserved.

Page 49: Mv unmasked.w.code.march.2013

Heap Tuple Space Recycling! Indexes prevent item pointers from being recycled.

49 © 2013 EnterpriseDB. All rights reserved.

Page 50: Mv unmasked.w.code.march.2013

VACUUM Later Recycle Items! VACUUM performs index cleanup, then can mark “dead” items as “unused”.

50 © 2013 EnterpriseDB. All rights reserved.

Page 51: Mv unmasked.w.code.march.2013

Cleanup of Deleted Rows!

51 © 2013 EnterpriseDB. All rights reserved.

Page 52: Mv unmasked.w.code.march.2013

Cleanup Deleted Rows!

52 © 2013 EnterpriseDB. All rights reserved.

Page 53: Mv unmasked.w.code.march.2013

Cleanup of Deleted Rows! In normal, multi-user usage, cleanup might have been delayed because other open transactions in the same database might still need to view the expired rows. However, the behavior would be the same, just delays.

53 © 2013 EnterpriseDB. All rights reserved.

Page 54: Mv unmasked.w.code.march.2013

Cleanup of Deleted Rows!

54 © 2013 EnterpriseDB. All rights reserved.

Page 55: Mv unmasked.w.code.march.2013

Same as this Slide!

55 © 2013 EnterpriseDB. All rights reserved.

Page 56: Mv unmasked.w.code.march.2013

Cleanup of Deleted Rows!

56 © 2013 EnterpriseDB. All rights reserved.

Page 57: Mv unmasked.w.code.march.2013

Same as this Slide!

57 © 2013 EnterpriseDB. All rights reserved.

Page 58: Mv unmasked.w.code.march.2013

Free Space Map (FSM)! VACUUM also updates the free space map (FSM), which records pages containing significant free space. This information is used to provide target pages for INSERTs and some UPDATEs (those crossing page boundaries). Single-page vacuum does not update the free space map.

58 © 2013 EnterpriseDB. All rights reserved.

Page 59: Mv unmasked.w.code.march.2013

Another Free Space Map Example!

59 © 2013 EnterpriseDB. All rights reserved.

Page 60: Mv unmasked.w.code.march.2013

Another Free Space Map Example!

60 © 2013 EnterpriseDB. All rights reserved.

Page 61: Mv unmasked.w.code.march.2013

Another Free Space Map Example!

61 © 2013 EnterpriseDB. All rights reserved.

Page 62: Mv unmasked.w.code.march.2013

VACUUM Also Removes End-of-File Pages!

62 © 2013 EnterpriseDB. All rights reserved.

VACUUM FULL shrinks the table file to its minimum size, but requires an exclusive table lock

Page 63: Mv unmasked.w.code.march.2013

Optimized Single-Page Cleanup of Old UPDATE Rows!

The storage space taken by old UPDATE tuples can be reclaimed just like deleted rows. However, certain UPDATE rows can even have their items reclaimed, i.e. it is possible to reuse certain old UPDATE items, rath than making them as “dead” and requiring VACUUM to reclaim them after removing referencing index entries. Specifically, such item reuse is possible with special HOT update (head-only tuple) chains, where the chain is on a single heap page and all indexed values in the chain are identical.

63 © 2013 EnterpriseDB. All rights reserved.

Page 64: Mv unmasked.w.code.march.2013

Single-Page Cleanup of HOT UPDATE Rows!

HOT update items can be freed (marked “unused”) if they are in the middle of the chain, i.e. not at the beginning or end of the chain. At the head of the chain is a special “Redirect” item pointers that are referenced by indexes; this is possible because all indexed values are identical in a HOT/redirect chain. Index creation with HOT chains is complex because the chains might contain inconsistent values for the newly indexed columns. This is handled by indexing just the end of the HOT chain and allowing the index to be used only by transactions that start after the index has been created. (Specifically, post-index-creation transactions cannot see the inconsistent HOT chain values due to MVCC visibility rules; they only see the end of the chain.)

64 © 2013 EnterpriseDB. All rights reserved.

Page 65: Mv unmasked.w.code.march.2013

Initial Single-Row State!

65 © 2013 EnterpriseDB. All rights reserved.

Page 66: Mv unmasked.w.code.march.2013

UPDATE Adds a New Row!

No index entry added because indexes only point to the head of the HOT chain.

66 © 2013 EnterpriseDB. All rights reserved.

Page 67: Mv unmasked.w.code.march.2013

Redirect Allows Indexes To Remain Valid!

67 © 2013 EnterpriseDB. All rights reserved.

Page 68: Mv unmasked.w.code.march.2013

UPDATE Replaces Another Old Row!

68 © 2013 EnterpriseDB. All rights reserved.

Page 69: Mv unmasked.w.code.march.2013

All OLD UPDATE Row Versions Eventually Removed!

This cleanup was performed by another operation on the same page.

69 © 2013 EnterpriseDB. All rights reserved.

Page 70: Mv unmasked.w.code.march.2013

Cleanup of Old Updated Rows!

70 © 2013 EnterpriseDB. All rights reserved.

Page 71: Mv unmasked.w.code.march.2013

Cleanup of Old Updated Rows!

71 © 2013 EnterpriseDB. All rights reserved.

Page 72: Mv unmasked.w.code.march.2013

Cleanup of Old Updated Rows! No index entry added because indexes only point to the head of the HOT chain.

72 © 2013 EnterpriseDB. All rights reserved.

Page 73: Mv unmasked.w.code.march.2013

Same as this Slide!

73 © 2013 EnterpriseDB. All rights reserved.

Page 74: Mv unmasked.w.code.march.2013

Cleanup of Old Updated Rows!

74 © 2013 EnterpriseDB. All rights reserved.

Page 75: Mv unmasked.w.code.march.2013

Same as this Slide!

75 © 2013 EnterpriseDB. All rights reserved.

Page 76: Mv unmasked.w.code.march.2013

Cleanup of Old Updated Rows!

76 © 2013 EnterpriseDB. All rights reserved.

Page 77: Mv unmasked.w.code.march.2013

Same as this Slide!

77 © 2013 EnterpriseDB. All rights reserved.

Page 78: Mv unmasked.w.code.march.2013

VACUUM Does Not Remove the Redirect!

78 © 2013 EnterpriseDB. All rights reserved.

Page 79: Mv unmasked.w.code.march.2013

Cleanup Using Manual VACUUM!

79 © 2013 EnterpriseDB. All rights reserved.

Page 80: Mv unmasked.w.code.march.2013

Cleanup Using Manual VACUUM!

80 © 2013 EnterpriseDB. All rights reserved.

Page 81: Mv unmasked.w.code.march.2013

The Indexed UPDATE Problem! The updating of any indexed columns prevents the use of “redirect” items because the chain must be usable by all indexes, i.e. a redirect/HOT UPDATE cannot require additional index entries due to an indexed value change. In such cases, item pointers can only be marked as “dead”, like DELETE does. No previously shown UPDATE queries modified indexed columns.

81 © 2013 EnterpriseDB. All rights reserved.

Page 82: Mv unmasked.w.code.march.2013

Index mvcc_demo Column!

82 © 2013 EnterpriseDB. All rights reserved.

Page 83: Mv unmasked.w.code.march.2013

UPDATE of an Indexed Column!

83 © 2013 EnterpriseDB. All rights reserved.

Page 84: Mv unmasked.w.code.march.2013

UPDATE of an Indexed Column!

84 © 2013 EnterpriseDB. All rights reserved.

Page 85: Mv unmasked.w.code.march.2013

UPDATE of an Indexed Column!

85 © 2013 EnterpriseDB. All rights reserved.

Page 86: Mv unmasked.w.code.march.2013

UPDATE of an Indexed Column!

86 © 2013 EnterpriseDB. All rights reserved.

Page 87: Mv unmasked.w.code.march.2013

UPDATE of an Indexed Column!

87 © 2013 EnterpriseDB. All rights reserved.

Page 88: Mv unmasked.w.code.march.2013

Cleanup Summary! Cleanup is possible only when there are no active transactions for which the tuples are visible. HOT items are UPDATE chains that span a single page and contain identical indexed column values. In normal usage, single-page cleanup performs the majority of the unnecessary index entries, and updates the free space map (FSM).

88 © 2013 EnterpriseDB. All rights reserved.

Page 89: Mv unmasked.w.code.march.2013

Conclusion! All the queries used in this presentation are available at: http://momjian.us/main/writings/pgsql/mvcc.sql http://momjian.us/main/presentations/

89 © 2013 EnterpriseDB. All rights reserved.

Page 90: Mv unmasked.w.code.march.2013

Postgres Resources!u  Upcoming training and events

•  Postgres training, NY – March 21st !–  http://pgday.nycpug.org/training/ !

•  PGDay, NY – March 22nd!

–  http://pgday.nycpug.org/!•  PGCon, Ottawa, CA – May 21-24!

–  http://www.pgcon.org/2013/!

u  EnterpriseDB live and on-demand training •  http://www.enterprisedb.com/store/products/dba-training !

u  Resources & Community •  Whitepapers, datasheets, podcasts, videos, community!•  http://www.enterprisedb.com/resources-community !

u  Postgres products, support and services •  [email protected] or visit www.enterprisedb.com !

Page 91: Mv unmasked.w.code.march.2013

What are your Questions??!

© 2013 EnterpriseDB. All rights reserved.