Thursday, June 4, 2026
All posts
Lv.3 IntermediatePostgreSQL
20 min readLv.3 Intermediate
SeriesPostgreSQL Vacuum Complete Guide · Part 1View series hub

PostgreSQL Vacuum Part 1 — Why VACUUM Exists: MVCC and Dead Tuples

PostgreSQL Vacuum Part 1 — Why VACUUM Exists: MVCC and Dead Tuples

When DELETE leaves a PostgreSQL table the same size, it is not a bug — it is a direct consequence of MVCC design. The xmin/xmax/Snapshot visibility system prevents read-write conflicts but produces Dead Tuples as a byproduct, which accumulate into Table Bloat and degrade query performance. This part explains why VACUUM is a non-optional operational task: how Dead Tuples form, what standard VACUUM and VACUUM FULL each do, how Visibility Map accelerates both VACUUM and Index-Only Scans, and how Freeze prevents XID Wraparound data loss.

Series overview

  • Part 1 — Why VACUUM Exists: MVCC and Dead Tuples (current)
  • Part 2 — Autovacuum Tuning and Transaction ID Wraparound (upcoming)
  • Part 3 — Production Monitoring, Bloat Removal, and Operations (upcoming)

Table of Contents

  1. Why Is VACUUM Necessary?
  2. How MVCC Works
  3. How Dead Tuples Are Born
  4. PostgreSQL vs MySQL — Two MVCC Philosophies
  5. What VACUUM Does — Standard VACUUM vs VACUUM FULL
  6. Visibility Map and Freeze
  7. Part 1 Summary and What's Next

1. Why Is VACUUM Necessary?

Engineers new to PostgreSQL operations often run into a puzzling situation.

-- Deleted one million rows, and yet...
DELETE FROM events WHERE created_at < now() - interval '1 year';

-- The table size hasn't changed at all!
SELECT pg_size_pretty(pg_total_relation_size('events'));

Millions of rows were deleted, but disk usage barely moves. Query performance feels worse, not better. The cause lies in PostgreSQL's concurrency model — MVCC — and the solution is VACUUM.

The PostgreSQL documentation summarizes three reasons VACUUM is necessary:

  • Reclaim disk space occupied by updated or deleted rows
  • Update statistics used by the query planner
  • Prevent Transaction ID Wraparound (covered in depth in Part 2)

2. How MVCC Works

MVCC (Multi-Version Concurrency Control) implements the philosophy that "reads never block writes, and writes never block reads."

Transaction Snapshots and Visibility

When a transaction begins, PostgreSQL creates a Snapshot — essentially a photograph recording which transactions have committed at that moment. Every query reads data relative to its snapshot, so it receives consistent results without blocking even when another transaction is mid-write.

The Key System Columns: xmin and xmax

Every row (tuple) in PostgreSQL carries hidden system columns invisible to users.

ColumnMeaning
xminTransaction ID that created this row version
xmaxTransaction ID that deleted or updated this row version
-- Query the hidden system columns
SELECT xmin, xmax, id, name FROM my_table LIMIT 5;
  • On INSERT → new row created with xmin = current transaction ID, xmax = 0
  • On UPDATE → existing row's xmax set to current transaction ID; new row created with xmin = current transaction ID
  • On DELETE → existing row's xmax set to current transaction ID; row is not physically removed

Here is a concrete example of the flow.

-- Transaction 100: insert a row
BEGIN; -- XID = 100
INSERT INTO products (id, name) VALUES (1, 'Widget');
COMMIT;
-- Result: xmin=100, xmax=0

-- Transaction 200: update the row
BEGIN; -- XID = 200
UPDATE products SET name = 'Super Widget' WHERE id = 1;
COMMIT;
-- Old version: xmin=100, xmax=200  <- Dead Tuple born here!
-- New version: xmin=200, xmax=0    <- Live Tuple

3. How Dead Tuples Are Born

After the UPDATE above, two row versions coexist in the table. Once transaction 200 commits, the old version with xmax=200 is no longer visible to any transaction. That is a Dead Tuple.

[Heap Page structure]

+------------------+------------------+
| Tuple 1          | Tuple 2          |
| xmin=100         | xmin=200         |
| xmax=200 (DEAD)  | xmax=0   (LIVE)  |
| name='Widget'    | name='Super ...' |
+------------------+------------------+

Dead Tuples occupy space while serving no purpose. As they accumulate, two problems emerge.

① Table Bloat

As deletes and updates repeat, the ratio of useful data per page falls. Delete one million rows from a ten-million-row table and n_dead_tup = 1,000,000. A sequential scan must now read through all those useless pages.

② Query performance degradation

Index entries pointing to Dead Tuples remain in place. Every index scan that hits such an entry must visit the heap page unnecessarily, adding latency.

-- Monitor Dead Tuple status
SELECT
    relname,
    n_live_tup,
    n_dead_tup,
    round(100.0 * n_dead_tup / nullif(n_live_tup + n_dead_tup, 0), 2) AS dead_pct,
    last_autovacuum
FROM pg_stat_user_tables
WHERE n_dead_tup > 1000
ORDER BY n_dead_tup DESC;

Operational tip: When dead_pct climbs past 20%, it is time to review Autovacuum tuning.


4. PostgreSQL vs MySQL — Two MVCC Philosophies

Both databases implement MVCC, but their architectural philosophies are clearly different.

AspectPostgreSQLMySQL (InnoDB)
Where old versions liveInside the table heap (Heap-based)Separate Undo Log
Read performanceFast (no reconstruction needed)Undo Log chain traversal cost
Space managementDead Tuples accumulate → VACUUM requiredUndo Log self-managed
Bloat riskHigh (if management is neglected)Low (table always holds only current versions)
Design priorityRead-optimizedSpace-efficient

PostgreSQL's approach eliminates the overhead of reconstructing rows at read time. The trade-off is that Dead Tuples accumulate inside the table and must be cleaned up by VACUUM.


5. What VACUUM Does — Standard VACUUM vs VACUUM FULL

Standard VACUUM

VACUUM my_table;
VACUUM VERBOSE ANALYZE my_table; -- verbose output + statistics update
  • Lock level: ShareUpdateExclusive Lock — runs concurrently with normal reads and writes
  • Removes Dead Tuples and marks their space as reusable
  • Does not return disk space to the OS (except when trailing pages become fully empty)
  • Updates table statistics (can run alongside ANALYZE)

VACUUM FULL

VACUUM FULL my_table; -- use with care in production
  • Lock level: AccessExclusive Lock — locks the table completely
  • Rewrites the table and all its indexes from scratch → physically returns disk space to the OS
  • Requires extra disk space during execution (original file + new copy coexist)
  • Prefer pg_repack or pg_squeeze in production to avoid the full lock
[Standard VACUUM behavior]

Before:  [LIVE][DEAD][DEAD][LIVE][DEAD][LIVE]
After:   [LIVE][    ][    ][LIVE][    ][LIVE]
         ^ space is reusable but not returned to the OS

[VACUUM FULL behavior]

Before:  [LIVE][DEAD][DEAD][LIVE][DEAD][LIVE]
After:   [LIVE][LIVE][LIVE]
         ^ full rewrite -> actual file size shrinks

6. Visibility Map and Freeze

Visibility Map (VM)

The Visibility Map dramatically improves VACUUM performance. Each heap file has a companion file (with a _vm suffix) that stores one bit per page.

  • Bit = 1 → "all tuples on this page are visible to every active transaction" → VACUUM can skip this page entirely
  • Index-Only Scan optimization → Only when the VM bit is 1 can PostgreSQL skip the heap page lookup and return results from the index alone. If the bit is 0, the heap page must still be visited, so this optimization requires VACUUM to run regularly.

This mechanism allows VACUUM to process only pages that have actually changed, making it far more efficient on large tables.

Tuple Freezing (Preventing XID Wraparound)

PostgreSQL transaction IDs (XIDs) are 32-bit integers. After roughly 4.2 billion (2³²) transactions, the counter wraps around to zero. When it does, old rows can be mistakenly interpreted as rows created by "future transactions," potentially causing data loss in the worst case.

VACUUM prevents this by replacing the xmin of sufficiently old tuples with FrozenXID — permanently marking them as visible to everyone — in a process called Freeze.

-- Monitor XID exhaustion risk
SELECT
    datname,
    age(datfrozenxid) AS xid_age,
    2000000000 - age(datfrozenxid) AS xid_remaining
FROM pg_database
ORDER BY xid_age DESC;

Monitoring thresholds: The default autovacuum_freeze_max_age is 200 million. Begin active monitoring when xid_age exceeds 200 million. Take immediate action when it exceeds 800 million. In PostgreSQL 14+, once vacuum_failsafe_age (default 1.6 billion) is reached, an emergency VACUUM fires and can block all queries.


7. Part 1 Summary and What's Next

Part 1 Key Concepts

ConceptSummary
MVCCRows are never immediately removed on update or delete — old versions remain to preserve concurrency
Dead TupleA stale row version left by UPDATE/DELETE — wastes space and slows queries
xmin / xmaxTransaction IDs recording each row's creation and expiry — the engine of visibility decisions
Standard VACUUMNon-blocking removal of Dead Tuples; marks space as reusable
VACUUM FULLFull table rewrite to reclaim physical space; requires AccessExclusive Lock
Visibility MapSpeeds up VACUUM by skipping clean pages; enables Index-Only Scan optimization
FreezeMarks old tuples as permanently visible to prevent XID Wraparound data loss

Part 2 Preview: Autovacuum Tuning and XID Wraparound

The next part covers the most operationally critical topics.

  • Autovacuum internals: launcher, worker, cost-based throttling
  • Autovacuum trigger conditions: autovacuum_vacuum_threshold and scale_factor calculations
  • Per-table parameter overrides for large tables
  • VACUUM improvements across PostgreSQL 13–17
  • XID Wraparound disaster scenarios and emergency response procedures

References

  • PostgreSQL 17 Documentation — Routine Vacuuming
  • Google Cloud Blog — Deep dive into PostgreSQL VACUUM
  • Postgres Professional — MVCC in PostgreSQL: Vacuum

Share This Article

Series Navigation

PostgreSQL Vacuum Complete Guide

Current part 1 · 3 published

Explore this topic·Start with featured series

한국어

Follow new posts via RSS

Use RSS to get new posts and series updates directly.

Open RSS Guide