Wednesday, May 13, 2026
All posts
Lv.2 BeginnerPostgreSQL
20 min readLv.2 Beginner
SeriesPostgreSQL Backup & Recovery Complete Guide · Part 3/6View series hub

PostgreSQL Backup & Recovery Part 3 — pg_basebackup and WAL Archiving

PostgreSQL Backup & Recovery Part 3 — pg_basebackup and WAL Archiving

Pg_dump does not support PITR. The foundation of production disaster recovery is the combination of a Base Backup and WAL archiving. A Base Backup is a snapshot of a specific point in time; WAL archiving makes every moment after that snapshot recoverable. pg_basebackup option choices determine recoverability, and the native incremental backup introduced in PostgreSQL 17+ (--incremental) is an important development for reducing backup costs on large databases.

Series overview

  • Part 1 — Backup Fundamentals and Strategy
  • Part 2 — Logical Backup — pg_dump & pg_dumpall in Practice
  • Part 3 — Physical Backup — pg_basebackup and WAL Archiving (you are here)
  • Part 4 — PITR (Point-in-Time Recovery) Implementation Guide (coming soon)
  • Part 5 — Enterprise Backup Tools Compared — pgBackRest vs Barman vs WAL-G (coming soon)
  • Part 6 — Backup Automation & Monitoring, and Recovery Drills (coming soon)

Table of Contents

  1. Introduction — Why physical backup?
  2. WAL — The core concept behind physical backup
  3. Overall architecture — Base Backup + WAL archiving
  4. WAL archiving configuration
  5. pg_basebackup — Physical base backup
  6. Incremental backup — PostgreSQL 17+
  7. Parameter tuning guide
  8. WAL archiving monitoring
  9. Physical backup automation script
  10. Logical vs physical — When to use which
  11. Closing

1. Introduction — Why physical backup?

Part 2 covered the strengths and limits of pg_dump. Logical backup excels at portability and selective restore, but it does not directly support PITR (Point-in-Time Recovery), which is the core requirement for production disaster recovery.

Consider the scenario that makes physical backup indispensable.

"Someone dropped a critical table at 2 PM today. The last pg_dump backup is from 2 AM yesterday. Twelve hours of data is gone."

If you had physical backup and WAL archiving in place, you could recover to exactly 1:59 PM. That is PITR, and that is why physical backup exists.


2. WAL — The core concept behind physical backup

WAL (Write-Ahead Log) is PostgreSQL's transaction log. Before applying any change to data files, PostgreSQL records every change in the WAL first. WAL files are stored in the pg_wal/ subdirectory of the cluster data directory, organized into 16 MB segment files by default.

WAL serves two roles.

Crash Safety: When the system shuts down abnormally, PostgreSQL replays the WAL to restore the database to a consistent state.

Foundation for PITR: By replaying WAL on top of a Base Backup in sequence, you can recover the database to any point in time after the backup.


3. Overall architecture — Base Backup + WAL archiving

A production physical backup strategy is the combination of two components.

A Base Backup is a full snapshot of the data directory at a specific point in time. Alone, it cannot recover changes made after the backup.

WAL archiving preserves every change that occurred after the Base Backup. Only when the two are combined does PITR become possible.


4. WAL archiving configuration

4.1 Required postgresql.conf parameters

WAL archiving is disabled by default. Enabling it requires a server restart, so configure it before the service goes live.

# postgresql.conf

# WAL level: must be replica or higher for archiving and PITR
# minimal does not allow archive_mode to be enabled
wal_level = replica

# Enable archiving (requires server restart)
archive_mode = on

# Command to copy WAL files to the archive directory
# %p: full path of the WAL file to archive
# %f: WAL filename only
archive_command = 'test ! -f /var/lib/postgresql/archive/%f && cp %p /var/lib/postgresql/archive/%f'

# Force archiving of incomplete WAL segments after this many seconds
# Lower values archive recent WAL faster, shortening RPO
archive_timeout = 300

# WAL size settings
max_wal_size = 4GB
min_wal_size = 1GB
wal_compression = on

archive_mode requires a server restart. archive_command, however, can be updated without a restart using pg_reload_conf() or SIGHUP. When you need to temporarily pause archiving, modifying archive_command is operationally more flexible than toggling archive_mode.

4.2 archive_mode: on vs always

ModeBehavior
onArchives from the primary only
alwaysArchives from standby servers as well, including WAL received via streaming replication

always is useful when the standby server is closer to the archive storage, or when WAL archiving must continue even if the primary fails.

4.3 The archive_command exit code rule

PostgreSQL treats a WAL file as successfully archived only when archive_command returns exit code 0. A non-zero return triggers a retry loop.

This behavior has direct operational consequences.

  • Silently ignoring all failures (always returning exit 0) risks deleting WAL files that were never actually archived.
  • If the archive destination becomes unreachable and retries accumulate, the pg_wal/ directory fills up and causes a disk-full failure.

Error handling in archiving scripts must be designed carefully.

4.4 Preparing the archive directory

# Create the archive directory and set permissions
mkdir -p /var/lib/postgresql/archive
chown postgres:postgres /var/lib/postgresql/archive
chmod 700 /var/lib/postgresql/archive

# Restart after changing postgresql.conf
systemctl restart postgresql

# Verify archiving status
psql -U postgres -c "SELECT * FROM pg_stat_archiver;"

4.5 Remote archiving

Using local disk only means the archive is lost if the server itself fails. Remote storage archiving is mandatory.

# Remote archiving with rsync
archive_command = 'rsync -a %p backup_user@archive-server:/pg_archive/%f'

# AWS S3
archive_command = 'aws s3 cp %p s3://my-pg-archive/wal/%f'

# Custom dual-storage script
archive_command = '/usr/local/bin/archive_wal.sh %p %f'

A dual-storage script that writes to both local and S3:

#!/bin/bash
# /usr/local/bin/archive_wal.sh
# Safe WAL archiving: local + S3 dual storage

WAL_PATH=$1
WAL_FILE=$2
LOCAL_ARCHIVE="/var/lib/postgresql/archive"
S3_BUCKET="s3://my-pg-archive/wal"

# Fail immediately on local copy failure (exit 1 → PostgreSQL retries)
cp "${WAL_PATH}" "${LOCAL_ARCHIVE}/${WAL_FILE}" || exit 1

# S3 failure occurs after local success — allow archiving to proceed,
# but log and alert for follow-up. Change to exit 1 if S3 is mandatory.
aws s3 cp "${WAL_PATH}" "${S3_BUCKET}/${WAL_FILE}" \
  --only-show-errors || \
  echo "[WARN] S3 upload failed for ${WAL_FILE}" >> /var/log/pg_archive.log

exit 0

Returning exit 0 on S3 failure is a deliberate design choice: prioritize alerting over blocking archiving. Teams that require S3 as mandatory should return exit 1 there instead. Either way, document the intent explicitly in the script or operational runbook.


5. pg_basebackup — Physical base backup

5.1 Prerequisites

# postgresql.conf
wal_level = replica       # required
archive_mode = on         # required for PITR
max_wal_senders = 3       # pg_basebackup uses the replication protocol
-- Add replication access to pg_hba.conf
-- host replication backup_user 192.168.1.0/24 scram-sha-256

-- Dedicated backup user (no superuser needed — REPLICATION privilege only)
CREATE USER backup_user WITH REPLICATION PASSWORD 'securepass';

5.2 Key options

-X (WAL inclusion)

OptionDescriptionRecommendation
-X noneNo WAL includedNot recommended — recovery may be impossible
-X fetchWAL collected after backup completesConditional — WAL preservation not guaranteed
-X streamWAL streamed in real time during backupRecommended

-X stream is recommended because it eliminates the risk of WAL segments being recycled before the backup completes.

-F (output format)

OptionDescription
-Fp (plain)Files copied as-is. Ready to use directly for recovery
-Ft (tar)Per-tablespace tar archives. Better for transfer and storage

--checkpoint

# Force a fast checkpoint (shorter startup, brief I/O spike)
pg_basebackup ... --checkpoint=fast

# Spread checkpoint (default, distributes I/O load)
pg_basebackup ... --checkpoint=spread

5.3 Backup recipes

# Recommended: plain format + WAL streaming + fast checkpoint
pg_basebackup \
  -h localhost \
  -U backup_user \
  -D /backups/basebackup/$(date +%Y%m%d_%H%M%S) \
  -Fp \
  -Xs \
  -P \
  --checkpoint=fast

# Tar format + compression (lower transfer and storage cost)
pg_basebackup \
  -h localhost \
  -U backup_user \
  -D /backups/basebackup/$(date +%Y%m%d) \
  -Ft \
  -z \
  -Xs \
  -P \
  --checkpoint=fast

# Integrity verification after backup (PostgreSQL 13+)
pg_verifybackup /backups/basebackup/20260414

5.4 pg_verifybackup — Backup integrity check

pg_verifybackup, introduced in PostgreSQL 13, verifies backup integrity using the backup_manifest file.

# Full verification
pg_verifybackup /backups/basebackup/20260414

# Skip WAL verification (data files only, faster)
pg_verifybackup --no-parse-wal /backups/basebackup/20260414

Not usable for backups created with PostgreSQL 12 or earlier (no backup_manifest).


6. Incremental backup — PostgreSQL 17+

6.1 Overview

Through PostgreSQL 16, pg_basebackup only supported full backups. Incremental backup required third-party tools like pgBackRest or Barman.

PostgreSQL 17 introduced native incremental backup (--incremental). Only blocks changed since the previous backup are stored, significantly reducing backup time and storage.

6.2 Enabling requirements

The summarize_wal parameter must be enabled. Introduced in PostgreSQL 17, it starts the WAL Summarizer process.

# postgresql.conf (PostgreSQL 17+)
summarize_wal = on   # default: off, requires server restart

6.3 Incremental backup workflow

# Step 1: Full backup (baseline)
pg_basebackup \
  -h localhost \
  -U backup_user \
  -D /backups/full_20260414 \
  -Ft -Xs -P --checkpoint=fast

# backup_manifest is created at /backups/full_20260414/backup_manifest

# Step 2: First incremental (based on full backup)
pg_basebackup \
  --incremental=/backups/full_20260414/backup_manifest \
  -h localhost \
  -U backup_user \
  -D /backups/incr_20260415 \
  -Ft -Xs -P --checkpoint=fast

# Step 3: Second incremental (based on previous incremental)
pg_basebackup \
  --incremental=/backups/incr_20260415/backup_manifest \
  -h localhost \
  -U backup_user \
  -D /backups/incr_20260416 \
  -Ft -Xs -P --checkpoint=fast

6.4 Restore: pg_combinebackup

Incremental backups cannot be restored on their own. pg_combinebackup synthesizes the full backup and its incremental chain into a restorable data directory.

# Combine full + incr1 + incr2 into a restorable directory
pg_combinebackup \
  /backups/full_20260414 \
  /backups/incr_20260415 \
  /backups/incr_20260416 \
  -o /restore/combined_datadir

# Start PostgreSQL from the combined directory
pg_ctl -D /restore/combined_datadir start

Backups must be passed to pg_combinebackup in chronological order — providing them out of order causes an error. The first entry in the chain must always be a Full Backup.

PostgreSQL 17's native incremental backup may have fewer features than pgBackRest or Barman. Review the official documentation against your requirements before adopting it in production.


7. Parameter tuning guide

# Key parameters for WAL archiving and physical backup

# WAL level (replica or higher required)
wal_level = replica

# Archiving
archive_mode = on
archive_command = 'test ! -f /archive/%f && cp %p /archive/%f'
archive_timeout = 300          # force-archive incomplete WAL every 5 minutes

# WAL size control
max_wal_size = 4GB             # max WAL size between checkpoints
min_wal_size = 1GB             # minimum WAL reserved for recycling
wal_compression = on           # WAL compression (CPU overhead vs storage savings)
wal_buffers = -1               # auto-set (1/32 of shared_buffers, max 16MB)

# Checkpoint
checkpoint_timeout = 15min
checkpoint_completion_target = 0.9

# Replication connections (for pg_basebackup)
max_wal_senders = 5
wal_keep_size = 1GB            # give standbys room to catch up

# Incremental backup (PostgreSQL 17+)
summarize_wal = on

8. WAL archiving monitoring

WAL archiving failures are silent. The pg_wal/ directory accumulates segments until a disk-full failure occurs. Monitoring is mandatory.

-- Full archiving status check
SELECT
  archived_count,
  last_archived_wal,
  last_archived_time,
  failed_count,
  last_failed_wal,
  last_failed_time,
  now() - last_archived_time AS lag
FROM pg_stat_archiver;
ColumnMeaning
failed_countCumulative archive failures. Any non-zero value needs immediate investigation
last_failed_timeTimestamp of the most recent failure
lagTime elapsed since the last successful archive
# Monitor pg_wal directory size (alert if growing abnormally)
du -sh $(psql -U postgres -tAc "SHOW data_directory")/pg_wal

Archiving failure alert script:

#!/bin/bash
# WAL archiving health check cron (recommended: every 5 minutes)

FAILED=$(psql -U postgres -tAc \
  "SELECT failed_count FROM pg_stat_archiver")

if [ "$FAILED" -gt 0 ]; then
  LAST_FAIL=$(psql -U postgres -tAc \
    "SELECT last_failed_wal || ' at ' || last_failed_time FROM pg_stat_archiver")
  echo "WAL archiving failure detected: ${LAST_FAIL}" | \
    mail -s "[ALERT] PostgreSQL WAL Archiving Failure" ops@example.com
fi

9. Physical backup automation script

#!/bin/bash
# /usr/local/bin/pg_physical_backup.sh

set -euo pipefail

DB_HOST="localhost"
DB_USER="backup_user"
BACKUP_ROOT="/backups/basebackup"
TIMESTAMP=$(date +%Y%m%d_%H%M%S)
BACKUP_DIR="${BACKUP_ROOT}/${TIMESTAMP}"
RETENTION_DAYS=7
LOG_FILE="/var/log/pg_physical_backup.log"

log() {
  echo "[$(date '+%Y-%m-%d %H:%M:%S')] $1" | tee -a "${LOG_FILE}"
}

log "Physical backup started"

mkdir -p "${BACKUP_DIR}"
pg_basebackup \
  -h "${DB_HOST}" \
  -U "${DB_USER}" \
  -D "${BACKUP_DIR}" \
  -Ft \
  -z \
  -Xs \
  --checkpoint=fast \
  -P \
  2>>"${LOG_FILE}"

# Integrity verification (PostgreSQL 13+)
if pg_verifybackup "${BACKUP_DIR}" 2>>"${LOG_FILE}"; then
  BACKUP_SIZE=$(du -sh "${BACKUP_DIR}" | cut -f1)
  log "Backup complete and verified: ${BACKUP_DIR} (${BACKUP_SIZE})"
else
  log "ERROR: Backup integrity verification failed! ${BACKUP_DIR}"
  # Verify BACKUP_DIR is under BACKUP_ROOT before deleting
  if [[ "${BACKUP_DIR}" == "${BACKUP_ROOT}/"* ]]; then
    rm -rf "${BACKUP_DIR}"
  fi
  exit 1
fi

# Remove expired backups with per-entry logging
find "${BACKUP_ROOT}" -maxdepth 1 -mindepth 1 -type d \
  -mtime +${RETENTION_DAYS} -print | while read -r OLD_DIR; do
  log "Removing expired backup: ${OLD_DIR}"
  rm -rf "${OLD_DIR}"
done
# crontab -e
# Run physical backup daily at 1 AM
0 1 * * * /usr/local/bin/pg_physical_backup.sh

10. Logical vs physical — When to use which

CriterionLogical (pg_dump)Physical (pg_basebackup)
Backup speedSlow (large DBs)Fast
Restore speedSlowFast
PITR supportNot supportedSupported (with WAL archiving)
Selective restoreSupported (table/schema level)Not supported (full cluster)
Cross-version portabilitySupported (lower → higher version)Not supported (same major version)
Standby initializationNot supportedSupported
Incremental backupNot supportedSupported (PostgreSQL 17+)
Recommended useMigration, selective restore, dev environmentsDisaster recovery, PITR, HA setup

Recommended combination in practice:

# Daily: physical backup + continuous WAL archiving → PITR foundation
pg_basebackup + archive_command

# Weekly: logical backup → portability and selective restore
pg_dump

11. Closing — WAL completes the backup chain

pg_basebackup creates a complete snapshot of a specific point in time. But alone, it is just a photograph taken yesterday. When WAL archiving is added, every moment between that snapshot and the present becomes recoverable.

"Physical backup lays the foundation. WAL archiving builds time on top of it."

The native incremental backup (--incremental) introduced in PostgreSQL 17 is a significant advance for reducing backup costs in large database environments. It may still have fewer features than third-party tools — compare the official documentation against your team's requirements before adopting it.

The next part covers implementing PITR step by step using the physical backup and WAL archiving covered here.


References

Share This Article

Series Navigation

PostgreSQL Backup & Recovery Complete Guide

3 / 6 · 4

Explore this topic·Start with featured series

한국어

Follow new posts via RSS

Until the newsletter opens, RSS is the fastest way to get updates.

Open RSS Guide