Happy day everyone!!
As we all know, PostgreSQL highly interacts with the operating system for the operations that it does on the database.
Now, before we deep dive into the nitty gritty of the case study that I have done with the Operating System’s “system calls” and “kill signals” on PostgreSQL background processes, let’s understand what’s there inside.
Table of contents
1. Linux basics to understand PostgreSQL processes.
2. PostgreSQL and Linux correlation.
3. Case study.
Table of Contents
Linux basics to understand PostgreSQL processes.
What is the process ?
A process is an executing instance of a program. It carries out tasks within the operating system.
Each process is allocated a unique number, called the process identifier (PID). It’s an integer between 2 and 32,768. When a process is started, the numbers restart from 2, and the number 1 is typically reserved for the init process. Process #1 manages other processes.
Process Table
The process table describes all the processes that are currently loaded. The ps command shows the processes. By default, it shows only processes that maintain a connection with a terminal, a console, a serial line, or a pseudo terminal.
To see all processes, we use -e option and -f to get full information.
(ps -ef) Example .,
postgres 28056 1 0 21:51 pts/0 00:00:01 /usr/local/pgsql/bin/postgres -D /u02/pgsql/data
Here., 1 is init process.
28056 is the child process.
System Call
A system call sometimes referred to as a kernel call, is a request in a Unix-like operating system made via a software interrupt by an active process for a service performed by the kernel.
fork() system call
We can create a new process by calling fork(). This system call duplicates the current process, creating a new entry in process table with many of the same attributes as the current process. nothing but, the newly created process will be the child of the calling process.
The PPID for the newly created process would be the parent’s PID.
Signals
The signal is a notification, a message sent by either operating system or some application to our program.
i) A signal may be sent from the kernel to a process
ii)From a process to another process
iii)From a process to itself.
Signal typically alerts a process to some event depending on the signal type.
Linux kernel implements about 30 signals. Each signal identified by a number, from 1 to 31.
Example., SIGKILL or signal number 9 tells the program that someone tries to kill it.
Now, Let’s correlate Linux processes and signals with the PostgreSQL database.
PostgreSQL and Linux correlation
When you start your PostgreSQL, the first process to start is your postmaster process. postmaster then creates multiple background processes.
If you check the relationships between processes with the pstree command, you can see that the Postmaster process is the parent process of all processes. When the postmaster starts it needs to know the location of the data area. The location must be specified by the -D option or the PGDATA environment variable
1 2 3 4 5 6 7 8 |
[postgres@asristgdb ~]$ ps -ef | grep postgres postgres 7913 1 0 23:11 pts/3 00:00:00 /usr/local/pgsql/bin/postgres -D /u02/pgsql/data postgres 7914 7913 0 23:11 ? 00:00:00 postgres: logger process postgres 7916 7913 0 23:11 ? 00:00:00 postgres: checkpointer process postgres 7917 7913 0 23:11 ? 00:00:00 postgres: writer process postgres 7918 7913 0 23:11 ? 00:00:00 postgres: wal writer process postgres 7919 7913 0 23:11 ? 00:00:00 postgres: autovacuum launcher process postgres 7920 7913 0 23:11 ? 00:00:00 postgres: stats collector process |
Here, the process “7913” is your postmaster process. It then created multiple child processes called PostgreSQL background processes.
As you can see, the PPID for all the background processes is your PID(7913) of the postmaster process.
Postmaster process is also responsible for creating a backend process for the client connection.
Also read, how a PostgreSQL connection is established
Connection request:
1 2 3 4 5 6 7 8 9 |
[postgres@asristgdb ~]$ psql -d testdb -U testuser psql (9.6.8) Type "help" for help. testdb=# Process Output: ps -ef | grep postgres (output truncated) postgres 7994 7913 0 23:11 ? 00:00:00 postgres: testuser testdb [local] idle |
we can also see the processes list with pstree
1 2 3 4 5 6 7 8 |
[postgres@asristgdb ~]$ pstree -p 7913 postgres(7913)¬postgres(7914) postgres(7916) postgres(7917) postgres(7918) postgres(7919) postgres(7920) [postgres@asristgdb ~]$ |
PostgreSQL and OS Signals
PostgreSQL uses the following Operating system signals when you make any call on the database processes.
1. ABRT
2. INT
3. QUIT
4. TERM
5. HUP
6. USR1
7. USR2
The following table describes the usage of each signal.
SIGNAL | USAGE |
---|---|
INT | external interrupt, usually initiated by the user., at client side it is the results of a Control-C which normally cancels a running program |
TERM | The SIGTERM signal is a generic signal used to cause program termination, equivalent to KILL PID.This is used for graceful termination of a process |
QUIT | It is a more forceful request. It shall terminate ungraceful, still cleaning up resources that absolutely need cleanup, but may not delete temporary files.This signal is generated when a user presses Ctrl+\ |
ABRT | SIGABRT causes abnormal program termination |
HUP | It is a hang up request, it is used to tell the process to reinitialize itself |
USR1 | User defined signal 1 |
USR2 | User defined signal 2 |
Case study
I sent all the signals that are mentioned in the above table to the postmaster and the other background processes and captured the result.
The following table shows the behavior of the system when I sent the KILL signals to the postmaster process.
Syntax: pg_ctl kill SIGNALNAME PID
Example., /usr/local/pgsql/bin/pg_ctl kill INT 20958
where 20958 is the PID for postmaster process.
SIGNAL | POSTMASTER PROCESS | LOG |
---|---|---|
INT | Fast Shutdown | LOG: received fast shutdown request LOG: aborting any active transactions LOG: autovacuum launcher shutting down LOG: shutting down LOG: checkpoint starting: shutdown immediate LOG: checkpoint complete: wrote 0 buffers (0.0%); 0 transaction log file(s) added, 0 removed, 0 recycled; write=0.026 s, sync=0.000 s, total=0.044 s; sync files=0, longest=0.000 s, average=0.000 s; distance=0 kB, estimate=0 kB LOG: database system is shut down |
TERM | Smart Shutdown | LOG: received smart shutdown request LOG: autovacuum launcher shutting down LOG: shutting down LOG: checkpoint starting: shutdown immediate LOG: checkpoint complete: wrote 0 buffers (0.0%); 0 transaction log file(s) added, 0 removed, 0 recycled; write=0.028 s, sync=0.000 s, total=0.048 s; sync files=0, longest=0.000 s, average=0.000 s; distance=0 kB, estimate=0 kB LOG: database system is shut down |
QUIT | Immediate Shutdown | LOG: received immediate shutdown request WARNING: terminating connection because of crash of another server process DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory. HINT: In a moment you should be able to reconnect to the database and repeat your command. LOG: database system is shut down |
ABRT | Forced termination by signal 6 | No Log during shutdown Log while startup: database system was interrupted; last known up at 2019-01-31 20:46:15 IST database system was not properly shut down; automatic recovery in progress invalid record length at 41/53969448: wanted 24, got 0 redo is not required checkpoint starting: end-of-recovery immediate checkpoint complete: wrote 0 buffers (0.0%); 0 transaction log file(s) added, 0 removed, 0 recycled; write=0.027 s, sync=0.000 s, total=0.030 s; sync files=0, longest=0.000 s, average=0.000 s; distance=0 kB, estimate=0 kB MultiXact member wraparound protections are now enabled database system is ready to accept connections autovacuum launcher started |
HUP | Reload configuration file | LOG: received SIGHUP, reloading configuration files |
USR1 | NA | NA |
USR2 | NA | NA |
I have also passed the same signals to the other background processes and the results are here.
Signal | Writer process | Wal writer process | Autovacuum launcher process | Checkpointer Process | stats collector Process |
---|---|---|---|---|---|
INT | Detraction of the running transaction | Detraction of the running transaction, process restart | Detraction of the running transaction | Request of the checkpoint execution | |
TERM | Normal exit and restart | Normal exit and restart | Normal exit and restart | ||
QUIT | force termination of all processes | force termination of all processes | force termination of all processes | force termination of all processes | |
ABRT | Forced termination by signal 6 | Forced termination by signal 6 | Forced termination by signal 6 | Forced termination by signal 6 | |
HUP | Reload configuration file | Reload configuration file | Reload configuration file | Reload configuration file | Reload configuration file |
USR1 | NA | NA | NA | NA | |
USR2 | NA | NA | NA | NA |
Bonus: Why usage of kill -9 is a crime?
As discussed above, kill -9 not only terminates the single process but also reinitiates all the processes out there.
Postmaster then resets each and every background process after any process termination by KILL -9.
Let’s understand the concept with an example.,
1 2 3 4 5 6 7 8 9 |
[postgres@postgreshelp ~]$ ps -ef | grep postgres postgres 7913 1 0 Feb01 ? 00:00:01 /usr/local/pgsql/bin/postgres -D /u02/pgsql/data postgres 7914 7913 0 Feb01 ? 00:00:00 postgres: logger process postgres 7916 7913 0 Feb01 ? 00:00:00 postgres: checkpointer process postgres 7917 7913 0 Feb01 ? 00:00:05 postgres: writer process postgres 7918 7913 0 Feb01 ? 00:00:03 postgres: wal writer process postgres 7919 7913 0 Feb01 ? 00:00:06 postgres: autovacuum launcher process postgres 7920 7913 0 Feb01 ? 00:00:03 postgres: stats collector process postgres 20568 7913 0 20:28 ? 00:00:00 postgres: testuser testdb [local] idle |
Here, 20568 is the process ID for client connection which is in an idle state.
Let’s kill the session.
1 2 |
[postgres@asristgdb ~]$ kill -9 20568 [postgres@asristgdb ~]$ |
the log was captured as
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 |
LOG: server process (PID 20568) was terminated by signal 9: Killed LOG: terminating any other active server processes WARNING: terminating connection because of crash of another server process DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory. HINT: In a moment you should be able to reconnect to the database and repeat your command. LOG: all server processes terminated; reinitializing LOG: database system was interrupted; last known up at 2019-02-01 23:11:41 IST LOG: database system was not properly shut down; automatic recovery in progress LOG: invalid record length at 41/E0D53F60: wanted 24, got 0 LOG: redo is not required LOG: checkpoint starting: end-of-recovery immediate LOG: checkpoint complete: wrote 0 buffers (0.0%); 0 transaction log file(s) added, 0 removed, 0 recycled; write=0.026 s, sync=0.000 s, total=0.029 s; sync files=0, longest=0.000 s, average=0.000 s; distance=0 kB, estimate=0 kB LOG: MultiXact member wraparound protections are now enabled LOG: database system is ready to accept connections LOG: autovacuum launcher started |
the postmaster process restarted all the mandatory background processes as well.
we can confirm it with ps command.
1 2 3 4 5 6 |
[postgres@asristgdb ~]$ ps -ef | grep postgres postgres 20730 7913 0 20:30 ? 00:00:00 postgres: checkpointer process postgres 20731 7913 0 20:30 ? 00:00:00 postgres: writer process postgres 20732 7913 0 20:30 ? 00:00:00 postgres: wal writer process postgres 20733 7913 0 20:30 ? 00:00:00 postgres: autovacuum launcher process postgres 20734 7913 0 20:30 ? 00:00:00 postgres: stats collector process |
A new process ID is created for each process.
Sometimes the reset may lead to corruption or errors.
So, it is highly recommended not to use KILL -9 on any PostgreSQL process including the idle process.
Instead, we can use pg_cancel_backend(pidint) or pg_terminate_backend(pidint) signals (SIGINT or SIGTERM respectively) to backend processes identified by process ID.
Takeaways:
- Postmaster is the parent process for each and every PostgreSQL process.
- Postmaster restarts the other background processes if they are down for any reason.
- Postmaster creates a new process for every client request.
- Do not use kill -9 in your environment.
Have you used kill -9 in your environment and suffered serious damage?
Let us know in the comments.
Words from postgreshelp
Thank you for giving your valuable time to read the above information. I hope the content served your purpose in reaching out to the blog.
Suggestions for improvement of the blog are highly appreciable. Please contact us for any information/suggestions/feedback.
If you want to be updated with all our articles
please follow us on Facebook | Twitter
Please subscribe to our newsletter.
Nice Article and more informative.
Thank you very much
Please do subscribe and read upcoming posts, need all your feedback
Nice article , great info which helped me a lot to understand production issue(DB was in recovery mode after doing kill -9 pid for an update statement) in my environment.
Thank you for your valuable feedback.
Please do subscribe and read upcoming posts, need all your feedback
Please post more articles on postgres…