Inside PostgreSQL Shared Memory
DESCRIPTION
This presentation is for people who want to understand how PostgreSQL shares information among processes using shared memory. Topics covered include the internal data page format, usage of the shared buffers, locking methods, and various other shared memory data structures.
TRANSCRIPT
![Page 1: Inside PostgreSQL Shared Memory](https://reader034.vdocuments.mx/reader034/viewer/2022042518/554d886eb4c905390c8b5316/html5/thumbnails/1.jpg)
Inside PostgreSQL Shared Memory
BRUCE MOMJIAN, ENTERPRISEDB
January, 2009
Abstract
POSTGRESQL is an open-source, full-featured relational database. This presentation gives an overview of the shared memory structures used by Postgres.
Creative Commons Attribution License http://momjian.us/presentations
![Page 2: Inside PostgreSQL Shared Memory](https://reader034.vdocuments.mx/reader034/viewer/2022042518/554d886eb4c905390c8b5316/html5/thumbnails/2.jpg)
Outline
1. File storage format
2. Shared memory creation
3. Shared buffers
4. Row value access
5. Locking
6. Other structures
![Page 3: Inside PostgreSQL Shared Memory](https://reader034.vdocuments.mx/reader034/viewer/2022042518/554d886eb4c905390c8b5316/html5/thumbnails/3.jpg)
File System /data
[Diagram: three Postgres processes and the /data directory]
![Page 4: Inside PostgreSQL Shared Memory](https://reader034.vdocuments.mx/reader034/viewer/2022042518/554d886eb4c905390c8b5316/html5/thumbnails/4.jpg)
File System /data/base
[Diagram: three Postgres processes and the /data directory, which contains:]
/pg_clog
/pg_multixact
/pg_subtrans
/pg_tblspc
/pg_xlog
/global
/pg_twophase
/base
![Page 5: Inside PostgreSQL Shared Memory](https://reader034.vdocuments.mx/reader034/viewer/2022042518/554d886eb4c905390c8b5316/html5/thumbnails/5.jpg)
File System /data/base/db
[Diagram: three Postgres processes and the database directories under /data/base:]
/16385 (production)
/1 (template1)
/17982 (devel)
/16821 (test)
/21452 (marketing)
![Page 6: Inside PostgreSQL Shared Memory](https://reader034.vdocuments.mx/reader034/viewer/2022042518/554d886eb4c905390c8b5316/html5/thumbnails/6.jpg)
File System /data/base/db/table
[Diagram: three Postgres processes and the table files under /data/base/16385:]
/24692 (customer)
/27214 (order)
/25932 (product)
/25952 (employee)
/27839 (part)
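The directory slides above suggest that a table's file can be located from three pieces: the data directory, the database's number, and the table's file number. A minimal sketch of that path construction follows; `relation_path` and its parameters are hypothetical names chosen for illustration, not PostgreSQL functions.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: build "<datadir>/base/<database number>/<file number>"
 * as laid out on the slides. Returns 0 on success, -1 if buf is too small.
 */
static int
relation_path(char *buf, size_t buflen, const char *datadir,
              unsigned int db_number, unsigned int file_number)
{
    int n;

    assert(buf != NULL && datadir != NULL);
    n = snprintf(buf, buflen, "%s/base/%u/%u", datadir, db_number, file_number);

    /* snprintf returns the length it needed; a larger value means truncation. */
    return (n >= 0 && (size_t) n < buflen) ? 0 : -1;
}
```

With the numbers from the slide, `relation_path(buf, sizeof buf, "/data", 16385, 24692)` fills `buf` with the path of the customer table in the production database.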
![Page 7: Inside PostgreSQL Shared Memory](https://reader034.vdocuments.mx/reader034/viewer/2022042518/554d886eb4c905390c8b5316/html5/thumbnails/7.jpg)
File System Data Pages
[Diagram: three Postgres processes and the file /data/base/16385/24692, drawn as a row of 8k pages]
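Because the file is a row of equal-sized 8k pages, finding a page is simple arithmetic. The sketch below assumes the 8k page size shown on the slide; the function names are made up for this example.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SIZE_BYTES 8192    /* the 8k page size shown on the slide */

/* Byte offset within the table file at which page number page_no starts. */
static int64_t
page_offset(uint32_t page_no)
{
    return (int64_t) page_no * PAGE_SIZE_BYTES;
}

/* Number of the page that contains a given byte offset of the file. */
static uint32_t
page_of_offset(int64_t offset)
{
    assert(offset >= 0);
    return (uint32_t) (offset / PAGE_SIZE_BYTES);
}
```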
![Page 8: Inside PostgreSQL Shared Memory](https://reader034.vdocuments.mx/reader034/viewer/2022042518/554d886eb4c905390c8b5316/html5/thumbnails/8.jpg)
Data Pages
[Diagram: one 8K page of /data/base/16385/24692, laid out as Page Header, Item, Item, Item, Tuple, Tuple, Tuple, Special]
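The page layout in the diagram can be illustrated with a toy page in which item entries grow forward from the header while tuple bytes are packed backward from the special area. This is only a sketch: every `Toy...` name is hypothetical, the header is far smaller than PostgreSQL's real page header, and tuple alignment is ignored.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define TOY_PAGE_SIZE 8192

typedef struct ToyItem { uint16_t off; uint16_t len; } ToyItem;

typedef struct ToyPageHeader {
    uint16_t lower;     /* first free byte after the item array */
    uint16_t upper;     /* first byte of stored tuple data */
    uint16_t special;   /* first byte of the special area */
} ToyPageHeader;

typedef union ToyPage {
    ToyPageHeader hdr;
    unsigned char bytes[TOY_PAGE_SIZE];
} ToyPage;

static void
toy_page_init(ToyPage *p, uint16_t special_size)
{
    assert(special_size <= TOY_PAGE_SIZE - sizeof(ToyPageHeader));
    memset(p, 0, sizeof *p);
    p->hdr.lower = (uint16_t) sizeof(ToyPageHeader);
    p->hdr.special = (uint16_t) (TOY_PAGE_SIZE - special_size);
    p->hdr.upper = p->hdr.special;
}

/* Store a tuple; returns its 1-based item number, or 0 if the page is full. */
static int
toy_page_add(ToyPage *p, const void *tuple, uint16_t len)
{
    ToyItem item;

    if ((size_t) (p->hdr.upper - p->hdr.lower) < sizeof(ToyItem) + len)
        return 0;

    /* Tuple data is placed just below the previous tuple (or the special area). */
    p->hdr.upper = (uint16_t) (p->hdr.upper - len);
    memcpy(p->bytes + p->hdr.upper, tuple, len);

    /* The item entry is appended right after the existing items. */
    item.off = p->hdr.upper;
    item.len = len;
    memcpy(p->bytes + p->hdr.lower, &item, sizeof item);
    p->hdr.lower = (uint16_t) (p->hdr.lower + sizeof item);

    return (int) ((p->hdr.lower - sizeof(ToyPageHeader)) / sizeof(ToyItem));
}

/* Look up a tuple by item number; sets *len and returns a pointer into the page. */
static const void *
toy_page_get(const ToyPage *p, int itemno, uint16_t *len)
{
    ToyItem item;

    assert(itemno >= 1 &&
           sizeof(ToyPageHeader) + (size_t) itemno * sizeof(ToyItem) <= p->hdr.lower);
    memcpy(&item,
           p->bytes + sizeof(ToyPageHeader) + (size_t) (itemno - 1) * sizeof(ToyItem),
           sizeof item);
    *len = item.len;
    return p->bytes + item.off;
}
```

The indirection through the item array is the point of the design: a tuple is addressed by (page, item number), so its bytes can move within the page without changing that address.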
![Page 9: Inside PostgreSQL Shared Memory](https://reader034.vdocuments.mx/reader034/viewer/2022042518/554d886eb4c905390c8b5316/html5/thumbnails/9.jpg)
File System Block Tuple
[Diagram: the same 8K page of /data/base/16385/24692, with one Tuple picked out]
![Page 10: Inside PostgreSQL Shared Memory](https://reader034.vdocuments.mx/reader034/viewer/2022042518/554d886eb4c905390c8b5316/html5/thumbnails/10.jpg)
File System Tuple
hoff − length of tuple header
infomask − tuple flags
natts − number of attributes
ctid − tuple id (page / item)
cmax − destruction command id
xmin − creation transaction id
xmax − destruction transaction id
cmin − creation command id
bits − bit map representing NULLs
OID − object id of tuple (optional)
Tuple
Header Value Value Value Value Value Value Value
int4in('9241')
textout() 'Martin'
![Page 11: Inside PostgreSQL Shared Memory](https://reader034.vdocuments.mx/reader034/viewer/2022042518/554d886eb4c905390c8b5316/html5/thumbnails/11.jpg)
Tuple Header C Structures
typedef struct HeapTupleFields
{
    TransactionId t_xmin;       /* inserting xact ID */
    TransactionId t_xmax;       /* deleting or locking xact ID */

    union
    {
        CommandId     t_cid;    /* inserting or deleting command ID, or both */
        TransactionId t_xvac;   /* VACUUM FULL xact ID */
    } t_field3;
} HeapTupleFields;

typedef struct HeapTupleHeaderData
{
    union
    {
        HeapTupleFields  t_heap;
        DatumTupleFields t_datum;
    } t_choice;

    ItemPointerData t_ctid;     /* current TID of this or newer tuple */

    /* Fields below here must match MinimalTupleData! */

    uint16      t_infomask2;    /* number of attributes + various flags */

    uint16      t_infomask;     /* various flag bits, see below */

    uint8       t_hoff;         /* sizeof header incl. bitmap, padding */

    /* ^ - 23 bytes - ^ */

    bits8       t_bits[1];      /* bitmap of NULLs -- VARIABLE LENGTH */

    /* MORE DATA FOLLOWS AT END OF STRUCT */
} HeapTupleHeaderData;
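The "23 bytes" comment in the struct can be checked with `offsetof`. The sketch below rebuilds the header from stand-in types (all `Toy...` names are hypothetical, and the field widths are assumptions: 32-bit transaction and command IDs, a 6-byte item pointer) and then estimates `t_hoff` by adding the NULL bitmap and rounding up to an assumed 8-byte alignment. It ignores the optional OID.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef struct { uint16_t bi_hi, bi_lo; } ToyBlockId;
typedef struct { ToyBlockId ip_blkid; uint16_t ip_posid; } ToyItemPointer;

typedef struct {
    uint32_t t_xmin;
    uint32_t t_xmax;
    union { uint32_t t_cid; uint32_t t_xvac; } t_field3;
} ToyHeapFields;

typedef struct {
    int32_t  datum_len;
    int32_t  datum_typmod;
    uint32_t datum_typeid;
} ToyDatumFields;

typedef struct {
    union { ToyHeapFields t_heap; ToyDatumFields t_datum; } t_choice;
    ToyItemPointer t_ctid;
    uint16_t t_infomask2;
    uint16_t t_infomask;
    uint8_t  t_hoff;
    uint8_t  t_bits[1];     /* start of the variable-length NULL bitmap */
} ToyTupleHeader;

/* Size of the fixed part of the header, i.e. everything before t_bits. */
static size_t
toy_fixed_header_size(void)
{
    return offsetof(ToyTupleHeader, t_bits);
}

/* Header length including bitmap and padding, assuming 8-byte alignment. */
static size_t
toy_hoff(int natts, bool has_nulls)
{
    size_t len = toy_fixed_header_size();

    assert(natts >= 0);
    if (has_nulls)
        len += (size_t) (natts + 7) / 8;
    return (len + 7) & ~(size_t) 7;
}
```

On common platforms the fixed part comes out at 23 bytes, so a tuple without NULLs starts its values at byte 24.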
![Page 12: Inside PostgreSQL Shared Memory](https://reader034.vdocuments.mx/reader034/viewer/2022042518/554d886eb4c905390c8b5316/html5/thumbnails/12.jpg)
Shared Memory Creation
[Diagram: the postmaster and two postgres processes, each with its own Program (Text), Data, and Stack, and each attached to Shared Memory; the postgres processes are created from the postmaster by fork()]
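The effect of fork() on the two kinds of memory can be demonstrated in a few lines. This sketch is not PostgreSQL's own allocation code: it swaps in an anonymous shared `mmap` region as the shared memory, and `toy_fork_demo` is a hypothetical name. The child writes to both a private variable and the shared region; only the shared write is visible to the parent.

```c
#define _DEFAULT_SOURCE     /* for MAP_ANONYMOUS on some C libraries */
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Forks a child that stores `value` in a private variable and in shared
 * memory. Returns what the parent then reads from shared memory (or -1 on
 * error) and stores what it reads from its private variable in *private_seen.
 */
static int
toy_fork_demo(int value, int *private_seen)
{
    int   private_copy = 0;
    int   seen = -1;
    int   status;
    pid_t pid;
    int  *shared;

    assert(private_seen != NULL);
    shared = mmap(NULL, sizeof(int), PROT_READ | PROT_WRITE,
                  MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (shared == MAP_FAILED)
        return -1;
    *shared = 0;

    pid = fork();
    if (pid == 0)
    {
        private_copy = value;   /* changes only the child's own data */
        *shared = value;        /* changes memory both processes map */
        _exit(0);
    }
    if (pid > 0 && waitpid(pid, &status, 0) == pid)
        seen = *shared;

    *private_seen = private_copy;
    munmap(shared, sizeof(int));
    return seen;
}
```

Calling `toy_fork_demo(42, &p)` on a POSIX system is expected to return 42 while leaving `p` at 0, which mirrors the diagram: Data and Stack are per-process, Shared Memory is common.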
![Page 13: Inside PostgreSQL Shared Memory](https://reader034.vdocuments.mx/reader034/viewer/2022042518/554d886eb4c905390c8b5316/html5/thumbnails/13.jpg)
Shared Memory
Shared Buffers
Proc Array
PROC
Multi−XACT Buffers
Two−Phase Structs
Subtrans Buffers
CLOG Buffers
XLOG Buffers
Shared Invalidation
Lightweight Locks
Lock Hashes
Auto Vacuum
Btree Vacuum
Free Space Map
Buffer Descriptors
Background Writer
Synchronized Scan
Semaphores
Statistics
LOCK
PROCLOCK
![Page 14: Inside PostgreSQL Shared Memory](https://reader034.vdocuments.mx/reader034/viewer/2022042518/554d886eb4c905390c8b5316/html5/thumbnails/14.jpg)
Shared Buffers
[Diagram: Postgres processes use read() and write() to move 8k pages between the file /data/base/16385/24692 and the Shared Buffers; each 8K page is laid out as Page Header, Item, Item, Item, Tuple, Tuple, Tuple, Special]
Buffer Descriptors
LWLock − for page changes
Pin Count − prevent page replacement
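The role of the pin count can be sketched with a toy descriptor array: a buffer may only be chosen for replacement while nobody has it pinned. The names below are hypothetical, the replacement policy is a plain first-fit scan chosen for brevity, and the locking that would protect the descriptors is left out.

```c
#include <assert.h>
#include <stddef.h>

typedef struct ToyBufferDesc {
    int page_no;    /* which file page this buffer currently holds */
    int pin_count;  /* number of users currently relying on the page */
} ToyBufferDesc;

/* A user pins a buffer before looking at its page. */
static void
toy_pin(ToyBufferDesc *buf)
{
    assert(buf != NULL);
    buf->pin_count++;
}

/* The user unpins the buffer when it no longer needs the page. */
static void
toy_unpin(ToyBufferDesc *buf)
{
    assert(buf != NULL && buf->pin_count > 0);
    buf->pin_count--;
}

/* Index of the first buffer that may be replaced, or -1 if all are pinned. */
static int
toy_find_victim(const ToyBufferDesc *descs, int nbuffers)
{
    for (int i = 0; i < nbuffers; i++)
        if (descs[i].pin_count == 0)
            return i;
    return -1;
}
```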
![Page 15: Inside PostgreSQL Shared Memory](https://reader034.vdocuments.mx/reader034/viewer/2022042518/554d886eb4c905390c8b5316/html5/thumbnails/15.jpg)
HeapTuples
[Diagram: a Postgres process holds a HeapTuple, a C pointer to a Tuple inside an 8K page in Shared Buffers. The tuple header fields, the Header and Value layout, and the int4in('9241') and textout() 'Martin' examples repeat the File System Tuple slide.]
![Page 16: Inside PostgreSQL Shared Memory](https://reader034.vdocuments.mx/reader034/viewer/2022042518/554d886eb4c905390c8b5316/html5/thumbnails/16.jpg)
Finding A Tuple Value in C
Datum
nocachegetattr(HeapTuple tuple,
               int attnum,
               TupleDesc tupleDesc,
               bool *isnull)
{
    HeapTupleHeader tup = tuple->t_data;
    Form_pg_attribute *att = tupleDesc->attrs;

    {
        int i;

        /*
         * Note - This loop is a little tricky.  For each non-null attribute,
         * we have to first account for alignment padding before the attr,
         * then advance over the attr based on its length.  Nulls have no
         * storage and no alignment padding either.  We can use/set
         * attcacheoff until we reach either a null or a var-width attribute.
         */
        off = 0;
        for (i = 0;; i++)       /* loop exit is at "break" */
        {
            if (HeapTupleHasNulls(tuple) && att_isnull(i, bp))
                continue;       /* this cannot be the target att */

            if (att[i]->attlen == -1)
                off = att_align_pointer(off, att[i]->attalign, -1,
                                        tp + off);
            else
                /* not varlena, so safe to use att_align_nominal */
                off = att_align_nominal(off, att[i]->attalign);

            if (i == attnum)
                break;

            off = att_addlength_pointer(off, att[i]->attlen, tp + off);
        }
    }

    return fetchatt(att[attnum], tp + off);
}
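The loop in the excerpt can be tried out in isolation with a reduced version that handles fixed-width attributes only. This is a sketch under that assumption: the `Toy`/`toy_` names are hypothetical, variable-length attributes and the offset cache are left out, and alignments are assumed to be powers of two.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef struct ToyAttr { int len; int align; } ToyAttr;    /* fixed-width only */

/* Round off up to the next multiple of align (a power of two). */
static size_t
toy_align(size_t off, int align)
{
    return (off + (size_t) align - 1) & ~((size_t) align - 1);
}

/* A NULL bitmap pointer means the tuple has no NULLs at all. */
static bool
toy_isnull(int i, const uint8_t *bits)
{
    return bits != NULL && !(bits[i >> 3] & (1 << (i & 0x07)));
}

/* Offset of attribute attnum within the tuple's value area. */
static size_t
toy_attr_offset(const ToyAttr *att, int attnum, const uint8_t *bits)
{
    size_t off = 0;

    assert(!toy_isnull(attnum, bits));  /* a NULL attribute has no storage */
    for (int i = 0;; i++)               /* loop exit is at "break" */
    {
        if (toy_isnull(i, bits))
            continue;                   /* no storage, no padding */

        off = toy_align(off, att[i].align);
        if (i == attnum)
            break;
        off += (size_t) att[i].len;
    }
    return off;
}
```

For attributes of widths 4, 1, 2, and 8 (each aligned to its width), the third attribute sits at offset 6 when nothing is NULL, and at offset 4 when the second attribute is NULL, since a NULL takes neither space nor padding.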
![Page 17: Inside PostgreSQL Shared Memory](https://reader034.vdocuments.mx/reader034/viewer/2022042518/554d886eb4c905390c8b5316/html5/thumbnails/17.jpg)
Value Access in C
#define fetch_att(T,attbyval,attlen) \
( \
    (attbyval) ? \
    ( \
        (attlen) == (int) sizeof(int32) ? \
            Int32GetDatum(*((int32 *)(T))) \
        : \
        ( \
            (attlen) == (int) sizeof(int16) ? \
                Int16GetDatum(*((int16 *)(T))) \
            : \
            ( \
                AssertMacro((attlen) == 1), \
                CharGetDatum(*((char *)(T))) \
            ) \
        ) \
    ) \
    : \
    PointerGetDatum((char *) (T)) \
)
![Page 18: Inside PostgreSQL Shared Memory](https://reader034.vdocuments.mx/reader034/viewer/2022042518/554d886eb4c905390c8b5316/html5/thumbnails/18.jpg)
Test And Set Lock
Can Succeed Or Fail
[Diagram: the lock word holds 0 or 1, and the caller exchanges a 1 into it]
Success: Was 0 on exchange
Failure: Was 1 on exchange (lock already taken)
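The exchange described on this slide can be written portably with C11 atomics. This is a sketch with hypothetical `toy_` names, not PostgreSQL's implementation: it follows the slide's convention that the value found in the lock word decides the outcome, so 0 means the lock was acquired and 1 means it was already taken.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

typedef atomic_uchar toy_slock_t;

/*
 * Atomically store 1 in the lock word and return what was there before:
 * 0 means the caller now holds the lock, 1 means it was already taken.
 */
static int
toy_tas(toy_slock_t *lock)
{
    assert(lock != NULL);
    return (int) atomic_exchange(lock, 1);
}

/* Release the lock by storing 0 back into the lock word. */
static void
toy_tas_release(toy_slock_t *lock)
{
    assert(lock != NULL);
    atomic_store(lock, 0);
}
```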
![Page 19: Inside PostgreSQL Shared Memory](https://reader034.vdocuments.mx/reader034/viewer/2022042518/554d886eb4c905390c8b5316/html5/thumbnails/19.jpg)
Test And Set Lock
x86 Assembler
static __inline__ int
tas(volatile slock_t *lock)
{
    register slock_t _res = 1;

    /*
     * Use a non-locking test before asserting the bus lock.  Note that the
     * extra test appears to be a small loss on some x86 platforms and a small
     * win on others; it's by no means clear that we should keep it.
     */
    __asm__ __volatile__(
        "   cmpb    $0,%1   \n"
        "   jne     1f      \n"
        "   lock            \n"
        "   xchgb   %0,%1   \n"
        "1: \n"
:       "+q"(_res), "+m"(*lock)
:
:       "memory", "cc");
    return (int) _res;
}
![Page 20: Inside PostgreSQL Shared Memory](https://reader034.vdocuments.mx/reader034/viewer/2022042518/554d886eb4c905390c8b5316/html5/thumbnails/20.jpg)
Spin Lock
Always Succeeds
[Diagram: the caller exchanges a 1 into the lock word, which holds 0 or 1]
Success: Was 0 on exchange
Failure: Was 1 on exchange (lock already taken); retry after a sleep of increasing duration
Spinlocks are designed for short-lived locking operations, like access to control structures. They are not to be used to protect code that makes kernel calls or other heavy operations.
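A spin lock of this kind is a test-and-set retried in a loop, with a growing sleep between failed attempts. The sketch below uses C11 `atomic_flag` and POSIX `nanosleep`; the names and the delay values (1 microsecond doubling up to 1 millisecond) are arbitrary assumptions, and it sleeps after every failed attempt rather than first spinning for a while.

```c
#define _POSIX_C_SOURCE 200809L     /* for nanosleep */
#include <assert.h>
#include <stdatomic.h>
#include <time.h>

typedef atomic_flag toy_spinlock;

/* Loop until the test-and-set succeeds; returns how many times it slept. */
static int
toy_spin_acquire(toy_spinlock *lock)
{
    long       delay_ns = 1000;         /* first sleep: 1 microsecond */
    const long max_delay_ns = 1000000;  /* cap: 1 millisecond */
    int        sleeps = 0;

    assert(lock != NULL);
    while (atomic_flag_test_and_set(lock))
    {
        /* Lock already taken: sleep, then retry with a longer delay. */
        struct timespec ts = { .tv_sec = 0, .tv_nsec = delay_ns };

        nanosleep(&ts, NULL);
        sleeps++;
        if (delay_ns < max_delay_ns)
            delay_ns *= 2;
    }
    return sleeps;
}

/* Release the lock so that another caller's test-and-set can succeed. */
static void
toy_spin_release(toy_spinlock *lock)
{
    assert(lock != NULL);
    atomic_flag_clear(lock);
}
```

A lock is declared as `toy_spinlock lock = ATOMIC_FLAG_INIT;` and the code between acquire and release should stay as short as the slide advises.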
![Page 21: Inside PostgreSQL Shared Memory](https://reader034.vdocuments.mx/reader034/viewer/2022042518/554d886eb4c905390c8b5316/html5/thumbnails/21.jpg)
Light Weight Locks
[Diagram: the shared memory structures listed on the Shared Memory slide]
Sleep On Lock
Lightweight locks attempt to acquire the lock, and go to sleep on a semaphore if the lock request fails. Spinlocks control access to the lightweight lock control structure.
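The relationship between the two lock types can be sketched as a small control structure whose fields are only touched while a spinlock is held. Everything here is hypothetical naming, and two simplifications are assumed: the lock offers shared and exclusive modes, and the sleep on a semaphore is left out, so a `false` return marks the point where a real backend would go to sleep.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

typedef enum { TOY_LW_SHARED, TOY_LW_EXCLUSIVE } ToyLWMode;

typedef struct ToyLWLock {
    atomic_bool mutex;      /* spinlock guarding the fields below */
    bool        exclusive;  /* held in exclusive mode? */
    int         shared;     /* number of shared holders */
} ToyLWLock;

static void toy_lw_spin(atomic_bool *m)   { while (atomic_exchange(m, true)) { } }
static void toy_lw_unspin(atomic_bool *m) { atomic_store(m, false); }

/* Returns true if the lock was granted, false if the caller would sleep. */
static bool
toy_lw_try_acquire(ToyLWLock *lock, ToyLWMode mode)
{
    bool granted;

    toy_lw_spin(&lock->mutex);
    if (mode == TOY_LW_EXCLUSIVE)
    {
        granted = !lock->exclusive && lock->shared == 0;
        if (granted)
            lock->exclusive = true;
    }
    else
    {
        granted = !lock->exclusive;
        if (granted)
            lock->shared++;
    }
    toy_lw_unspin(&lock->mutex);
    return granted;
}

/* Give back a lock previously granted in the given mode. */
static void
toy_lw_release(ToyLWLock *lock, ToyLWMode mode)
{
    toy_lw_spin(&lock->mutex);
    if (mode == TOY_LW_EXCLUSIVE)
    {
        assert(lock->exclusive);
        lock->exclusive = false;
    }
    else
    {
        assert(lock->shared > 0);
        lock->shared--;
    }
    toy_lw_unspin(&lock->mutex);
}
```

The spinlock is held only for the few instructions that inspect and update the structure, which is the short-lived use that spinlocks are meant for.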
![Page 22: Inside PostgreSQL Shared Memory](https://reader034.vdocuments.mx/reader034/viewer/2022042518/554d886eb4c905390c8b5316/html5/thumbnails/22.jpg)
Database Object Locks
PROC PROCLOCK LOCK
Lock Hashes
![Page 23: Inside PostgreSQL Shared Memory](https://reader034.vdocuments.mx/reader034/viewer/2022042518/554d886eb4c905390c8b5316/html5/thumbnails/23.jpg)
Proc
Proc Array
PROC
used used used empty empty empty
![Page 24: Inside PostgreSQL Shared Memory](https://reader034.vdocuments.mx/reader034/viewer/2022042518/554d886eb4c905390c8b5316/html5/thumbnails/24.jpg)
Other Shared Memory Structures
Shared Buffers
Proc Array
PROC
Multi−XACT Buffers
Two−Phase Structs
Subtrans Buffers
CLOG Buffers
XLOG Buffers
Shared Invalidation
Lightweight Locks
Lock Hashes
Auto Vacuum
Btree Vacuum
Free Space Map
Buffer Descriptors
Background Writer
Synchronized Scan
Semaphores
Statistics
LOCK
PROCLOCK
![Page 25: Inside PostgreSQL Shared Memory](https://reader034.vdocuments.mx/reader034/viewer/2022042518/554d886eb4c905390c8b5316/html5/thumbnails/25.jpg)
Conclusion
Pink Floyd: Wish You Were Here