Case study: Operating system kill signals on PostgreSQL

Happy day everyone!!

As we all know, PostgreSQL highly interacts with the operating system for the operations that it does on the database.

Now, before we deep dive into the nitty gritty of the case study that I have done with the Operating System’s “system calls” and “kill signals” on PostgreSQL background processes, let’s understand what’s there inside.

Table of contents

1. Linux basics to understand PostgreSQL processes.
2. PostgreSQL and Linux correlation.
3. Case study.

Linux basics to understand PostgreSQL processes.

What is the process ?

A process is an executing instance of a program. It carries out tasks within the operating system.

Each process is allocated a unique number, called the process identifier (PID). It’s an integer between 2 and 32,768. When a process is started, the numbers restart from 2, and the number 1 is typically reserved for the init process. Process #1 manages other processes.

Process Table

The process table describes all the processes that are currently loaded. The ps command shows the processes. By default, it shows only processes that maintain a connection with a terminal, a console, a serial line, or a pseudo terminal.

To see all processes, we use -e option and -f to get full information.

(ps -ef) Example .,

postgres 28056 1 0 21:51 pts/0 00:00:01 /usr/local/pgsql/bin/postgres -D /u02/pgsql/data

Here., 1 is init process.
28056 is the child process.

System Call

A system call sometimes referred to as a kernel call, is a request in a Unix-like operating system made via a software interrupt by an active process for a service performed by the kernel.

fork() system call

We can create a new process by calling fork(). This system call duplicates the current process, creating a new entry in process table with many of the same attributes as the current process. nothing but, the newly created process will be the child of the calling process.

The PPID for the newly created process would be the parent’s PID.

Signals

The signal is a notification, a message sent by either operating system or some application to our program.

i) A signal may be sent from the kernel to a process
ii)From a process to another process
iii)From a process to itself.

Signal typically alerts a process to some event depending on the signal type.

Linux kernel implements about 30 signals. Each signal identified by a number, from 1 to 31.

Example., SIGKILL or signal number 9 tells the program that someone tries to kill it.

Now, Let’s correlate Linux processes and signals with the PostgreSQL database.

PostgreSQL and Linux correlation

When you start your PostgreSQL, the first process to start is your postmaster process. postmaster then creates multiple background processes.

If you check the relationships between processes with the pstree command, you can see that the Postmaster process is the parent process of all processes. When the postmaster starts it needs to know the location of the data area. The location must be specified by the -D option or the PGDATA environment variable

Here, the process “7913” is your postmaster process. It then created multiple child processes called PostgreSQL background processes.

As you can see, the PPID for all the background processes is your PID(7913) of the postmaster process.

Postmaster process is also responsible for creating a backend process for the client connection.

Also read, how a PostgreSQL connection is established

Connection request:

we can also see the processes list with pstree

PostgreSQL and OS Signals

PostgreSQL uses the following Operating system signals when you make any call on the database processes.

1. ABRT
2. INT
3. QUIT
4. TERM
5. HUP
6. USR1
7. USR2

The following table describes the usage of each signal.

SIGNALUSAGE
INTexternal interrupt, usually initiated by the user., at client side it is the results of a Control-C which normally cancels a running program
TERMThe SIGTERM signal is a generic signal used to cause program termination, equivalent to KILL PID.This is used for graceful termination of a process
QUITIt is a more forceful request. It shall terminate ungraceful, still cleaning up resources that absolutely need cleanup, but may not delete temporary files.This signal is generated when a user presses Ctrl+\
ABRTSIGABRT causes abnormal program termination
HUPIt is a hang up request, it is used to tell the process to reinitialize itself
USR1User defined signal 1
USR2User defined signal 2

Case study

I sent all the signals that are mentioned in the above table to the postmaster and the other background processes and captured the result.

The following table shows the behavior of the system when I sent the KILL signals to the postmaster process.

Syntax: pg_ctl kill SIGNALNAME PID

Example., /usr/local/pgsql/bin/pg_ctl kill INT 20958

where 20958 is the PID for postmaster process.

SIGNALPOSTMASTER PROCESSLOG
INTFast ShutdownLOG: received fast shutdown request
LOG: aborting any active transactions
LOG: autovacuum launcher shutting down
LOG: shutting down
LOG: checkpoint starting: shutdown immediate
LOG: checkpoint complete: wrote 0 buffers (0.0%); 0 transaction log file(s) added, 0 removed, 0 recycled; write=0.026 s, sync=0.000 s, total=0.044 s; sync files=0, longest=0.000 s, average=0.000 s; distance=0 kB, estimate=0 kB
LOG: database system is shut down
TERMSmart Shutdown LOG: received smart shutdown request
LOG: autovacuum launcher shutting down
LOG: shutting down
LOG: checkpoint starting: shutdown immediate
LOG: checkpoint complete: wrote 0 buffers (0.0%); 0 transaction log file(s) added, 0 removed, 0 recycled; write=0.028 s, sync=0.000 s, total=0.048 s; sync files=0, longest=0.000 s, average=0.000 s; distance=0 kB, estimate=0 kB
LOG: database system is shut down
QUITImmediate ShutdownLOG: received immediate shutdown request
WARNING: terminating connection because of crash of another server process
DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
HINT: In a moment you should be able to reconnect to the database and repeat your command.
LOG: database system is shut down
ABRTForced termination by signal 6No Log during shutdown
Log while startup:
database system was interrupted; last known up at 2019-01-31 20:46:15 IST
database system was not properly shut down; automatic recovery in progress
invalid record length at 41/53969448: wanted 24, got 0
redo is not required
checkpoint starting: end-of-recovery immediate
checkpoint complete: wrote 0 buffers (0.0%); 0 transaction log file(s) added, 0 removed, 0 recycled; write=0.027 s, sync=0.000 s, total=0.030 s; sync files=0, longest=0.000 s, average=0.000 s; distance=0 kB, estimate=0 kB
MultiXact member wraparound protections are now enabled
database system is ready to accept connections
autovacuum launcher started
HUPReload configuration fileLOG: received SIGHUP, reloading configuration files
USR1NANA
USR2NANA

I have also passed the same signals to the other background processes and the results are here.

SignalWriter processWal writer processAutovacuum launcher processCheckpointer Processstats collector
Process
INTDetraction of the running transactionDetraction of the running transaction, process restart Detraction of the running transactionRequest of the checkpoint execution
TERMNormal exit and restartNormal exit and restartNormal exit and restart
QUITforce termination of all processesforce termination of all processesforce termination of all processesforce termination of all processes
ABRTForced termination by signal 6Forced termination by signal 6Forced termination by signal 6Forced termination by signal 6
HUPReload configuration fileReload configuration fileReload configuration fileReload configuration fileReload configuration file
USR1NANANANA
USR2NANANANA

Bonus: Why usage of kill -9 is a crime?

As discussed above, kill -9 not only terminates the single process but also reinitiates all the processes out there.

Postmaster then resets each and every background process after any process termination by KILL -9.

Let’s understand the concept with an example.,

Here, 20568 is the process ID for client connection which is in an idle state.

Let’s kill the session.

the log was captured as

the postmaster process restarted all the mandatory background processes as well.

we can confirm it with ps command.

A new process ID is created for each process.

Sometimes the reset may lead to corruption or errors.

So, it is highly recommended not to use KILL -9 on any PostgreSQL process including the idle process.

Instead, we can use pg_cancel_backend(pidint) or pg_terminate_backend(pidint) signals (SIGINT or SIGTERM respectively) to backend processes identified by process ID.

Takeaways:

  1. Postmaster is the parent process for each and every PostgreSQL process.
  2. Postmaster restarts the other background processes if they are down for any reason.
  3. Postmaster creates a new process for every client request.
  4. Do not use kill -9 in your environment.

Have you used kill -9 in your environment and suffered serious damage?

Let us know in the comments.

Words from postgreshelp

Thank you for giving your valuable time to read the above information. I hope the content served your purpose in reaching out to the blog.
Suggestions for improvement of the blog are highly appreciable. Please contact us for any information/suggestions/feedback.

If you want to be updated with all our articles

please follow us on Facebook Twitter
Please subscribe to our newsletter.

 

 

This Post Has 5 Comments

  1. Mahesh S

    Nice Article and more informative.

    1. Naresh

      Thank you very much
      Please do subscribe and read upcoming posts, need all your feedback

  2. vahini B

    Nice article , great info which helped me a lot to understand production issue(DB was in recovery mode after doing kill -9 pid for an update statement) in my environment.

    1. naresh

      Thank you for your valuable feedback.
      Please do subscribe and read upcoming posts, need all your feedback

  3. Rajesh

    Please post more articles on postgres…

Leave a Reply