Production Digest
postgreshelp.com
~6 min read
Autovacuum doesn't tell you when it's losing. This query does. I run it every Monday morning: it takes 2 seconds and has saved me from XID wraparound emergencies twice.
SELECT psu.schemaname || '.' || psu.relname AS table_name,
       psu.n_dead_tup,
       age(pc.relfrozenxid) AS xid_age,
       2000000000 - age(pc.relfrozenxid) AS xids_until_wraparound,
       psu.last_autovacuum
FROM pg_stat_user_tables psu
JOIN pg_class pc ON pc.oid = psu.relid
WHERE age(pc.relfrozenxid) > 500000000
   OR psu.n_dead_tup > 1000000
   OR psu.last_autovacuum < now() - interval '7 days'
ORDER BY xid_age DESC;

-- xids_until_wraparound < 200M -> start watching closely
-- xids_until_wraparound < 50M  -> manual VACUUM FREEZE, now
Add it to your weekly DBA checklist. Alert at 200M XID age. XID wraparound has caused full-database emergency shutdowns at well-monitored companies, all because this one number wasn't on anyone's dashboard.
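If a table shows up in the danger zone, the fix is a freeze, not a panic. A minimal sketch of the response (the table name is illustrative):

-- Freeze the flagged table now; unlike VACUUM FULL, plain VACUUM FREEZE
-- does not take an AccessExclusiveLock, so reads and writes continue
VACUUM (FREEZE, VERBOSE) big_orders;

-- Cluster-wide sanity check: oldest unfrozen XID per database
SELECT datname, age(datfrozenxid) AS xid_age
FROM pg_database
ORDER BY xid_age DESC;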
You filter on status = 'active' AND country = 'US'. The planner estimates 48% of rows match. The actual figure is 0.75%. So it chose a sequential scan on a 200GB table, and your query runs for 4 minutes instead of 8 seconds.
This isn't a bug. It's the planner assuming column independence: multiplying selectivities as if status and country have no relationship. On most real-world data, that assumption is wrong. Active users skew toward certain regions. High-value orders cluster by country. The planner has no idea unless you tell it.
-- Tell the planner these columns are correlated
CREATE STATISTICS stat_status_country (dependencies)
    ON status, country FROM orders;

-- Recalculate stats to activate it
ANALYZE orders;

-- Verify the extended stats were created
SELECT stxname, stxkeys, stxkind FROM pg_statistic_ext;

-- Check if the planner is now using them
SELECT e.stxname, d.stxdependencies
FROM pg_statistic_ext e
JOIN pg_statistic_ext_data d ON e.oid = d.stxoid
WHERE e.stxname = 'stat_status_country';
Extended statistics are not created automatically: not by autovacuum, not by ANALYZE, not by anything. You have to identify the correlated columns yourself and create them manually. On a 6TB table with skewed status+country distribution, this single command took one of our worst queries from a 4-minute sequential scan to an 8-second index scan. The data hadn't changed. The schema hadn't changed. The planner just finally knew the truth.
→ Reproducible simulation on GitHub (PG18 output included)
Candidates to check in your schema: any column pair where one value implies another. order_status + payment_method. region + warehouse_id. user_type + subscription_plan. If you've ever wondered why the planner picked a terrible plan on a filtered query, this is often why.
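A quick way to vet a candidate pair is to compare the planner's estimate with reality. A sketch using this article's orders example (the table, filter values, and row counts shown are illustrative):

EXPLAIN (ANALYZE, TIMING OFF)
SELECT * FROM orders
WHERE status = 'active' AND country = 'US';

-- Compare the planner's guess to the measured count in the scan node:
--   Seq Scan on orders (cost=... rows=480000 ...) (actual ... rows=7500 ...)
-- A wide gap between "rows=" and "actual rows=" on a multi-column filter
-- is the signature of the independence assumption failing.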
VACUUM FULL is not "a stronger VACUUM"

Someone in our PostgreSQL community ran VACUUM FULL on a large orders table this week to "properly clean it up." They were still waiting hours later. The table was locked for reads and writes the entire time.
What VACUUM FULL actually does: acquires AccessExclusiveLock, rewrites the entire table to a new file on disk, and doubles your I/O temporarily. Duration depends on table size, disk speed, and server load; it can range from minutes to many hours. No reads. No writes. No graceful degradation while it runs.
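You can watch that lock from a second session while the rewrite runs. A minimal sketch against the system catalogs:

-- Sessions currently holding or waiting on AccessExclusiveLock
SELECT l.relation::regclass AS table_name, l.pid, l.granted, a.query
FROM pg_locks l
JOIN pg_stat_activity a ON a.pid = l.pid
WHERE l.mode = 'AccessExclusiveLock'
  AND l.relation IS NOT NULL;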
Before reaching for VACUUM FULL, run this first:
-- Step 1: Check if you actually have a bloat problem worth fixing
SELECT relname,
       pg_size_pretty(pg_total_relation_size(relid)) AS size,
       round(100.0 * n_dead_tup / nullif(n_live_tup + n_dead_tup, 0), 1) AS dead_pct
FROM pg_stat_user_tables
WHERE n_dead_tup > 10000
ORDER BY dead_pct DESC;

-- Real output from pgbench test environment:
--      relname      |  size  | dead_pct
-- ------------------+--------+----------
--  pgbench_accounts | 299 MB |     50.0
Context matters. A 10MB table with 50% bloat wastes 5MB; autovacuum will handle it on its next pass, nothing to do. A 300MB table with 50% bloat is worth investigating but still not necessarily a reason to run VACUUM FULL. Ask: is this causing actual query slowdowns? Is autovacuum keeping up? Only reach for aggressive tooling when bloat is impacting performance and the table is large enough that the wasted space matters operationally.
-- Step 2: If bloat IS impacting performance on a large table,
-- use pg_repack, which rebuilds online with no AccessExclusiveLock:
--   pg_repack --table orders --jobs 2 mydb

-- But try a regular VACUUM first; it is often enough on its own
VACUUM (VERBOSE, ANALYZE) orders;
The real question before any bloat operation: is autovacuum configured correctly, or is it just falling behind? Fix the root cause first; aggressive bloat removal on top of a misconfigured autovacuum is treating the symptom, not the problem.
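A starting point for that root-cause check, sketched with illustrative values rather than universal recommendations:

-- Default trigger: vacuum runs once dead tuples exceed
-- autovacuum_vacuum_threshold + 20% of the table (scale_factor 0.2),
-- which is far too lazy for very large, hot tables
SHOW autovacuum_vacuum_scale_factor;

-- Tighten it per-table instead of cluster-wide
ALTER TABLE orders SET (
    autovacuum_vacuum_scale_factor = 0.02,  -- trigger at ~2% dead tuples
    autovacuum_vacuum_cost_delay   = 1      -- throttle vacuum less (ms)
);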
An interview with Masao Fujii, PostgreSQL committer at NTT Data, on why long transactions are more dangerous than most DBAs realise. The part that stuck with me: MVCC keeps dead tuple versions around as long as any transaction, even an idle one, holds an old snapshot. One forgotten connection left open overnight can prevent VACUUM from reclaiming anything on your busiest tables. His recommendation: set idle_in_transaction_session_timeout in production and treat it as a safety net, not a nice-to-have. I've added it to our standard server config after reading this. Worth 10 minutes of your time if you've ever been surprised by table bloat that autovacuum "should have caught."
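A sketch of that safety net; the timeout value is illustrative, so pick one that fits your application's transaction patterns:

-- Terminate sessions that sit idle inside an open transaction too long
ALTER SYSTEM SET idle_in_transaction_session_timeout = '10min';
SELECT pg_reload_conf();

-- Who is holding a snapshot open right now, and for how long?
SELECT pid, usename, now() - state_change AS idle_for, left(query, 60) AS last_query
FROM pg_stat_activity
WHERE state = 'idle in transaction'
ORDER BY idle_for DESC;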
A rigorous head-to-head benchmark across 17 test cases using JSONB in PostgreSQL 18 against MongoDB 7. The headline: PostgreSQL won 9 of 17. It dominated single-document reads, deletes, and array containment queries on smaller documents. MongoDB won on sorted pagination (4× faster) and aggregate queries on large complex documents. The storage difference was the most surprising finding: MongoDB stores data 2× more compactly thanks to built-in compression. My take: if your team is debating whether to add MongoDB for a JSON-heavy service, read this first. The answer is almost always "stay with PostgreSQL, add a GIN index."
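That closing advice in one sketch (table and column names are hypothetical):

-- jsonb_path_ops builds a smaller, faster index for @> containment,
-- at the cost of not supporting key-existence operators like ? and ?|
CREATE INDEX idx_events_payload ON events USING GIN (payload jsonb_path_ops);

-- The kind of query this accelerates:
SELECT * FROM events WHERE payload @> '{"status": "shipped"}';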
PostgreSQL's built-in JIT uses LLVM, which has millisecond-to-second compilation startup, making it useless for transactional workloads where queries run in microseconds. pg_jitter is a new alternative JIT provider using lighter backends (sljit, AsmJit, MIR) that compile in microseconds instead. It supports PG14-18 and could make JIT practical for mixed OLTP+analytics workloads for the first time. I haven't run it in production yet; it's too new to trust on critical systems, but it's on my staging list. Worth watching if you've ever disabled JIT because the overhead wasn't worth it.
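While waiting, the stock JIT knobs are still worth knowing. A sketch (the cost threshold shown is the default; adjust to taste):

-- Is JIT on, and how expensive must a plan be before it kicks in?
SHOW jit;
SHOW jit_above_cost;   -- 100000 by default

-- Per-session escape hatch when LLVM startup cost hurts OLTP queries
SET jit = off;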
14 years · 6TB · 60TB production systems