Table of Contents
Context – 1
Whether you’re a novice embarking on a career as a database administrator or an experienced professional from another RDBMS technology, People often ask me about how to kickstart the PostgreSQL administration journey or take it to the next level
Context – 2
Typical use cases that we hear from the PostgreSQL community or the requirement from your manager
- I want to set up PostgreSQL on my on-premise.
- Can you help me with PostgreSQL configuration best practices?.
- Can you help me migrate my Oracle/SQL Server database to PostgreSQL?
- I am pushing my data from Source PostgreSQL to Destination with Fivetran’s integration and its slow
- Can you analyze the possibilities of using pgvector in our environment?
- I need multi-master at any cost, explore the pitfalls.
- How can I integrate my PostgreSQL with external tools or the resources available to enhance or scale up my business?
- I want to migrate my workload from on-premise to AWS Cloud…no..no..Azure., no GCP. You tell me where should I go?
and many more.
Not everyone gets the opportunity to work on all these case studies. My friend once said, “If not every birthday, celebrate your alternate birthday with a different team during the initial phase/years of your career. The more projects you change the more you learn because not every project follows the same set of tools.”
If I am not switching quite often then (1) How can I get the list of case studies and ensure myself updated with the latest happenings in PostgreSQL and (2) what shall I do now?
In this post, we are going to understand PostgreSQL resources, hands-on exercises, and recommended focal points settings the foundation for an effective PostgreSQL roadmap. The blog post concentrates only on the software consumer’s point of view, not the creator’s.
Download the presentation (containing simulations) from here
Break it down
The beginning:
Before learning PostgreSQL administration assume that it is just any other software that you install on your computer to get the work done.
For example., to create your resume, you need MS Word and you get it with MS Office software.
You first INSTALL the MS Office software, probably at a default location (C:/programfiles/MSOffice/bin/word.exe) to CREATE your Word document, and when done you save it, probably at (D:/MyDocs/MyDocument.doc).
Just to summarize, you install MS Office at[1] and create a document at[2]
C:/programfiles/MSOffice/bin/word.exe[1] D:/MyDocs/MyDocument.doc[2]
and your PostgreSQL is no exception. You need PostgreSQL software to be INSTALLED to CREATE your databases.
For example, when I check if PostgreSQL is running on my virtual machine, it returned
/usr/pgsql-16/bin/postgres -D /u01/pgsql/16
When you compare both,
C:/programfiles/MSOffice/bin/word.exe D:/MyDocs/MyDocument.doc
/usr/pgsql-16/bin/postgres -D /u01/pgsql/16
The left-hand side section is my installation location and the right-hand side section is my database files location.
If you want to learn MS Word, you just need to learn the tab bar. Go to Home bar -> Learn what is Format Painter, what is B, what is I, what is U, and then jump to Insert tab and so on.
When you are done with all tabs, that completes the Learning Word.
The same is applicable to PostgreSQL.
As I mentioned above when you install PostgreSQL you get two important locations, one, the INSTALLATION location consisting of binaries, and two, the DATABASE Files location consisting of a bunch of files and subfolders.
The learning approach is simple.,
Go to both locations, for example., Installation location
[postgres@lab01 ~]$ cd /usr/pgsql-16/bin
[postgres@lab01 bin]$ ls -lrt
total 13212
-rwxr-xr-x. 1 root root 9617 Jun 29 18:16 postgresql-16-setup
-rwxr-xr-x. 1 root root 2170 Jun 29 18:16 postgresql-16-check-db-dir
-rwxr-xr-x. 1 root root 93168 Jun 29 18:16 vacuumdb
-rwxr-xr-x. 1 root root 84520 Jun 29 18:16 reindexdb
-rwxr-xr-x. 1 root root 731352 Jun 29 18:16 psql
-rwxr-xr-x. 1 root root 9651400 Jun 29 18:16 postgres
-rwxr-xr-x. 1 root root 171792 Jun 29 18:16 pg_upgrade
..
..
-rwxr-xr-x. 1 root root 84296 Jun 29 18:16 createdb
-rwxr-xr-x. 1 root root 75936 Jun 29 18:16 clusterdb
Database files location:
[postgres@lab01 bin]$ cd /u01/pgsql/16
[postgres@lab01 16]$
[postgres@lab01 16]$ ls -lrt
total 68
drwx------. 4 postgres postgres 36 Jul 12 19:00 pg_multixact
..
drwx------. 2 postgres postgres 6 Jul 12 19:00 pg_tblspc
..
drwx------. 2 postgres postgres 6 Jul 12 19:00 pg_replslot
-rw-------. 1 postgres postgres 29697 Jul 12 19:00 postgresql.conf
-rw-------. 1 postgres postgres 88 Jul 12 19:00 postgresql.auto.conf
drwx------. 3 postgres postgres 60 Jul 12 19:00 pg_wal
drwx------. 2 postgres postgres 18 Jul 12 19:00 pg_subtrans
drwx------. 5 postgres postgres 33 Jul 12 19:00 base
..
drwx------. 4 postgres postgres 68 Jul 12 19:05 pg_logical
For example., I have a tool called psql, let me get the help command of that tool.
[postgres@lab01 15]$ psql --help
psql is the PostgreSQL interactive terminal.
Usage:
psql [OPTION]... [DBNAME [USERNAME]]
General options:
-c, --command=COMMAND run only single command (SQL or internal) and exit
-d, --dbname=DBNAME database name to connect to (default: "postgres")
-f, --file=FILENAME execute commands from file, then exit
-l, --list list available databases, then exit
-v, --set=, --variable=NAME=VALUE
set psql variable NAME to VALUE
(e.g., -v ON_ERROR_STOP=1)
....
....
Here, what is -c doing, how can I use it?
[postgres@lab01 15]$ psql -c "select * from emp";
id | sal
----+-----
1 | 100
(1 row)
[postgres@lab01 15]$
Repeat for all tools. Then you have a data directory.
Ask yourself.
- What is postgresql.conf file is doing there?
- what happens if I lose that?
- what is there inside it?
- what is wal_log_hints in postgresql.conf file and how can I test it functonality?
- what is that pg_wal folder? can I move that folder to a different mount point?
fig: showing disk activity on the second mount point when there is an update operation
Repeat for all files and folders.
Do I have dictionary views to view the information?
Yes, many. Break it down.
- Create charts or cheat sheets that make it easy for you to learn.
- Create a file that gets select * from <dictionary view> information and run it., then analyze the result..slowly.
How can I learn Cloud?
Boring, same stuff. Let’s say, you wanted to create an AWS instance, you have multiple ways to do that and not everyone will have the provision to try all.
You, as a learner, try everything. again.,
This is an excerpt from Create instance, questions are
- Why is the standard class, and why not the Memory-optimized class?
- Do I really get 4750 Mbps?
- How can I test that I am actually getting 3000 IOPS?
- What activities contribute to IOPS?
Extra’s
A couple of questions include.
- How can I add a node to a 3-node etcd cluster?
- What happens if my PGPOOL instance is down?
- How much connection establishment time I am saving with my pgBouncer, Do I really need it?
- what’s the below snippet doing?
[root@test01 tf-project]# terraform apply
postgresql_database.myfist_db: Refreshing state... [id=myfirst_db]
Terraform used the selected providers to generate the following execution plan. Resource actions are indicated with the following symbols:
+ create
Terraform will perform the following actions:
# postgresql_database.myfist_db will be created
+ resource "postgresql_database" "myfist_db" {
+ allow_connections = true
+ connection_limit = -1
+ encoding = (known after apply)
+ id = (known after apply)
+ is_template = (known after apply)
+ lc_collate = (known after apply)
+ lc_ctype = (known after apply)
+ name = "myfirst_db"
+ owner = (known after apply)
+ tablespace_name = (known after apply)
+ template = (known after apply)
}
Plan: 1 to add, 0 to change, 0 to destroy.
What resources do I have to take my learnings to the next level?
Refer to the presentation for the resource links and simulations.
It’s not a simple thing to describe everything in one blog post.
For example., when it comes to application tuning, we have 40+ nodes may be, like quick sort, seq scan, etc.,
People say,
- If there is a sub-query, it should be executed first provided that the values it pass to the outer column have too many distinct rows.
- We cannot start with the join operation first.
- Do not start with <, >, like kind of operators and so on and so forth.
but the thing is to have a checklist and try to ask questions every time you are learning something new in PostgreSQL like “Why do I need CTE over sub query or vice-versa, what difference it makes, and how it is influencing my optimizer?”
The best checklist to begin with would be PostgreSQL DBA roadmap and awesome-postgres
Happy learning.
Words from postgreshelp
Thank you for giving your valuable time to read the above information. I hope the content served your purpose in reaching out the blog.
Suggestions for improvement of the blog are highly appreciable. Please contact us for any information/suggestion/feedback.
If you want to be updated with all our articles
please follow us on Facebook | Twitter
Please subscribe to our newsletter.