What happens when you execute a transaction in PostgreSQL?
The block will be fetched to shared buffers, executed there and upon commit, the dirty buffer will leave to disk again, but how does that happen?
The answer is here.
PostgreSQL is an ORDBMS software, Let’s ignore O for time being (we will have a separate post on O alone), as an RDBMS, PostgreSQL has to support ACID properties (we will have a seperate post to discuss ACID properties), but for now,
For example.,
1 2 3 4 |
begin; update A; update B; end; |
Here, until my transaction is complete except the session on which the query is fired, all the other sessions should see the old values only(Isolation) and both A and B should run together as they are there in a single transaction block(Atomicity).
PostgreSQL achieves this with MVCC architecture by implementing a concept called snapshots.
Buffer Life Cycle in PostgreSQL
- Pin and take Exclusive-lock on buffer containing the data page to be modified.
- Start the critical section which ensures that any error occur till End of critical section should be a PANIC as buffers might contain unlogged changes.
- Apply changes to buffer.
- Mark the buffer as dirty which ensures bgwriter (checkpoint) will flush this page and marking buffer dirty before writing log record ensures less contention for content latch of buffer.
- Build a record to be inserted in transaction log buffer.
- Update the Page with LSN which will be used by bgwriter or flush operation of page to ensure that corresponding log is flushed from buffer.
- End Critical section.
- Unlock and unpin the buffer.
As discussed, throughout the above life cycle PostgreSQL maintains a snapshot.
What happens when you execute below query in PostgreSQL?
1 2 3 4 5 |
begin; insert into a values('6','DEVOPS'); end; |
Begin marks the beginning of the query and as it is utility command, PostgreSQL will simply parses the begin and waits for the next command to start.
As soon as PostgreSQL gets, insert command, it creates a snapshot with xmin, xmax, and *xip details.
The following function is used for that.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 |
GetSnapshotData(Snapshot snapshot) { ProcArrayStruct *arrayP = procArray; TransactionId xmin; TransactionId xmax; TransactionId globalxmin; int index; int count = 0; int subcount = 0; bool suboverflowed = false; TransactionId replication_slot_xmin = InvalidTransactionId; TransactionId replication_slot_catalog_xmin = InvalidTransactionId; .. } |
The returned snapshot includes xmin (lowest still-running xact ID), xmax (highest completed xact ID + 1), and a list of running xact IDs in the range xmin <= xid < xmax.
It is used as follows:
- All xact IDs < xmin are considered finished.
- All xact IDs >= xmax are considered still running.
- For an xact ID xmin <= xid < xmax, consult list to see whether it is considered running or not.
When you commit the transaction, Update the Page with LSN which will be used by bgwriter or flush operation of page to ensure that corresponding log is flushed from buffer.
Example.,
Consider the following table
1 2 3 4 5 6 7 8 9 |
postgres=# SELECT *, xmin, xmax FROM a; id | dept | sal | xmin | xmax ----+------+-----+------+------ 3 | EEE | 100 | 592 | 0 4 | IT | 100 | 592 | 0 5 | ECE | 100 | 592 | 0 2 | CSE | 200 | 594 | 0 1 | DBA | 200 | 648 | 649 (5 rows) |
Now, Let me insert a record into table.
1 2 3 4 5 |
postgres=# begin; BEGIN postgres=# insert into a values('6','DEVOPS'); INSERT 0 1 |
I have not committed my transaction and the following snapshot created in my database.
2020-02-19 14:35:01.626 IST [18632] DEBUG: snapshot of 1+0 running transaction ids (lsn 0/948C0A8 oldest xid 650 latest complete 649 next xid 651)
My table is now got the xid 650 as its xmin.
1 2 3 4 5 6 7 8 9 |
postgres=# SELECT *, xmin, xmax FROM a; id | dept | sal | xmin | xmax ----+--------+-----+------+------ 3 | EEE | 100 | 592 | 0 4 | IT | 100 | 592 | 0 5 | ECE | 100 | 592 | 0 2 | CSE | 200 | 594 | 0 1 | DBA | 200 | 648 | 649 6 | DEVOPS | | 650 | 0 |
when I ran two transactions at a time and rollback the first initiated transaction, this is what happened.
Session A (xid 653)
1 2 3 4 |
postgres=# begin; BEGIN postgres=# insert into a values('10','Ns'); INSERT 0 1 |
Session B (xid 654)
1 2 3 4 |
postgres=# begin; BEGIN postgres=# insert into a values('11','PS'); INSERT 0 1 |
Log generated was,
snapshot of 0+0 running transaction ids (lsn 0/948C698 oldest xid 653 latest complete 652 next xid 653)
snapshot of 1+0 running transaction ids (lsn 0/948C718 oldest xid 653 latest complete 652 next xid 654)
Now after I commit xid 654 and rollback xid 653, My table was like
1 2 3 4 5 6 7 8 9 10 11 12 |
postgres=# SELECT *, xmin, xmax FROM a; id | dept | sal | xmin | xmax ----+--------+-----+------+------ 3 | EEE | 100 | 592 | 0 4 | IT | 100 | 592 | 0 5 | ECE | 100 | 592 | 0 2 | CSE | 200 | 594 | 0 1 | DBA | 200 | 648 | 649 6 | DEVOPS | | 650 | 0 7 | SEVEN | | 651 | 0 7 | SEVEN | | 652 | 0 9 | NINE | | 654 | 0 |
The transaction 653 gone from here. This is how my PostgreSQL achieves transaction isolation and transaction atomicity (though this post is not about ACID properties)
There are many new parameters that have come here, as I can’t discuss them in one post, please look up this space for more detailed understanding of some technical terms that I used here.
Thank you for giving your valuable time to read the above information. I hope the content served your purpose in reaching out to the blog.
Suggestions for improvement of the blog are highly appreciable. Please contact us for any information/suggestions/feedback.
If you want to be updated with all our articles
please follow us on Facebook | Twitter
Please subscribe to our newsletter.