PostgreSQL Backup & Recovery Part 2 — pg_dump and pg_dumpall in Practice
pg_dump is the most familiar backup tool, but it complements production disaster recovery rather than replacing it. It excels at selective restore, version migration, and dev environment cloning. Output format choice (plain/custom/directory) determines recovery flexibility and performance, and pg_restore's parallel and selective restore capabilities are the key to cutting real-world incident response time. pg_dumpall captures cluster-wide objects (roles, tablespaces) but is plain-SQL only, making a hybrid strategy essential for large-scale recovery.
Series overview
- Part 1 — Backup Fundamentals and Strategy
- Part 2 — Logical Backup — pg_dump & pg_dumpall in Practice (you are here)
- Part 3 — Physical Backup — pg_basebackup and WAL Archiving (coming soon)
- Part 4 — PITR (Point-in-Time Recovery) Implementation Guide (coming soon)
- Part 5 — Enterprise Backup Tools Compared — pgBackRest vs Barman vs WAL-G (coming soon)
- Part 6 — Backup Automation & Monitoring, and Recovery Drills (coming soon)
Table of Contents
- Introduction — Where logical backup fits
- pg_dump vs pg_dumpall — What's the difference?
- Output formats explained
- pg_dump recipes
- pg_restore recipes
- pg_dumpall recipes
- Password security — Using .pgpass
- Backup automation scripts
- Performance optimization for large databases
- Limits of logical backup and when physical backup is required
- Closing
1. Introduction — Where logical backup fits
Part 1 established PostgreSQL's three backup approaches: logical backup, physical backup, and WAL archiving. Each plays a distinct role, and combining them is what makes a production-grade strategy.
Logical backup is usually the first method you encounter. pg_dump is PostgreSQL's standard bundled tool, exporting a database as SQL or an archive format. Its key advantage: it produces a consistent snapshot without blocking other users' reads or writes.
The official PostgreSQL documentation is explicit about its scope.
"Except for simple situations, pg_dump is not suitable as the regular backup tool for a production database."
— postgresql.org/docs/current/app-pgdump.html
pg_dump is powerful for selective restore, major-version migration, and dev/test environment cloning. For production disaster recovery that requires PITR, physical backup is mandatory. Logical backup is meant to complement physical backup, not replace it.
2. pg_dump vs pg_dumpall — What's the difference?
The two tools differ in scope from the start.
| Item | pg_dump | pg_dumpall |
|---|---|---|
| Scope | Single database | Entire cluster |
| Roles included | No | Yes |
| Tablespaces included | No | Yes |
| Output formats | plain, custom, directory, tar | plain (SQL) only |
| Parallel dump | Directory format only | Not supported |
| Selective restore | Supported | Not supported |
| Recovery tool | psql or pg_restore | psql |
Internally, pg_dumpall calls pg_dump for each database and additionally exports cluster-wide objects (roles, tablespaces) as SQL. Note that each database is dumped with its own snapshot taken at a different moment, so the dumps are not mutually consistent across databases.
In practice, a hybrid strategy is common: use pg_dumpall --globals-only for global objects and pg_dump with parallel options for individual databases.
3. Output formats explained
Format choice has a direct impact on recovery flexibility and performance. pg_dump supports four formats.
| Format | Flag | Compression | Parallel dump | Parallel restore | Selective restore | Best for |
|---|---|---|---|---|---|---|
| Plain (SQL) | -Fp | None | No | No | No | Small DBs, human-readable |
| Custom | -Fc | Built-in | No | Yes | Yes | Most production workloads |
| Directory | -Fd | Built-in | Yes | Yes | Yes | Large DBs, speed-critical |
| Tar | -Ft | None | No | No | Yes | Compatibility |
Plain
A human-readable SQL file restored with psql. No selective restore or parallel processing. File size grows rapidly with large databases.
Custom
The default choice for most production environments. Built-in compression, pg_restore-based recovery with table/schema-level selection, and parallel restore (pg_restore -j) in a single file.
Directory
The only format that supports parallel dump (-j). Each table gets its own file, which increases file count and complicates transfer and management. Use when dump speed is the priority on large databases.
Tar
Essentially the directory format's contents packed into a single tar archive, with no compression and no parallel restore. No compelling reason to prefer it over Custom unless compatibility demands it.
4. pg_dump recipes
4.1 Basic backup
# Basic custom-format backup
pg_dump -Fc -U postgres -h localhost -d mydb \
-f /backups/mydb_$(date +%Y%m%d_%H%M%S).dump
# Remote server backup
pg_dump -Fc -U postgres -h db.example.com -p 5432 -d mydb \
-f /backups/mydb_remote.dump
4.2 Compression tuning
# No compression — prioritize restore speed
pg_dump -Fc -Z 0 -U postgres -d mydb -f mydb_nocompress.dump
# Maximum compression — minimize storage cost
pg_dump -Fc -Z 9 -U postgres -d mydb -f mydb_maxcompress.dump
# External compression — combine plain format with pigz
pg_dump -Fp -U postgres -d mydb | pigz -p 4 > mydb.sql.gz
Using -Z 0 to disable compression can speed up the overall dump when CPU is not the bottleneck. The effect varies by storage I/O profile — measure before committing.
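One way to measure is to time a throwaway dump at each setting. The commands below are a sketch: the database name and levels are placeholders, and -f /dev/null discards the archive so only dump cost is compared. Run against a copy, not production.
# Rough comparison of compression settings (sketch)
time pg_dump -Fc -Z 0 -U postgres -d mydb -f /dev/null
time pg_dump -Fc -Z 6 -U postgres -d mydb -f /dev/null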
4.3 Selective backup — tables and schemas
# Specific tables only
pg_dump -Fc -U postgres -d mydb \
-t users -t orders \
-f critical_tables.dump
# Wildcard pattern
pg_dump -Fc -U postgres -d mydb \
-t 'public.user_*' \
-f user_tables.dump
# Entire schema
pg_dump -Fc -U postgres -d mydb \
-n analytics \
-f analytics_schema.dump
# Exclude specific tables
pg_dump -Fc -U postgres -d mydb \
-T audit_logs -T temp_sessions \
-f mydb_no_logs.dump
When dumping specific tables, the foreign-key targets of those tables are not included, so the restore may fail with dependency errors. Include the related tables in the dump, or fall back to a data-only restore with --disable-triggers.
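If you do need to load such a dump anyway, one option is a data-only restore with trigger checks disabled. This sketch assumes the table definitions already exist in the target; it requires superuser rights and can leave references that point at missing parent rows.
# Data-only restore with FK/trigger checks disabled during the load (superuser required)
pg_restore -U postgres -d mydb -a --disable-triggers /backups/critical_tables.dump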
4.4 Schema-only, data-only, and section dumps
# Schema (DDL) only — migration, schema version control
pg_dump -Fp -s -U postgres -d mydb -f schema_only.sql
# Data only — load into an environment where schema already exists
pg_dump -Fc -a -U postgres -d mydb -f data_only.dump
# Section-based splitting
pg_dump -Fc --section=pre-data -U postgres -d mydb -f pre_data.dump
pg_dump -Fc --section=data -U postgres -d mydb -f data.dump
pg_dump -Fc --section=post-data -U postgres -d mydb -f post_data.dump
Section splitting shines in large-scale restores: load the schema (pre-data) and data first, and build indexes and constraints (post-data) last, which significantly improves restore speed.
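A sketch of the corresponding restore order, reusing the three section dumps created above (the worker count is illustrative):
# Restore schema first, then data, then indexes and constraints; the last two can run in parallel
pg_restore -U postgres -d mydb pre_data.dump
pg_restore -U postgres -d mydb -j 4 data.dump
pg_restore -U postgres -d mydb -j 4 post_data.dump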
4.5 Handling very large files
# Work around file size limits with split
pg_dump -Fp -U postgres -d mydb | gzip | split -b 2G - /backups/mydb_part_
# Restore from split files
cat /backups/mydb_part_* | gunzip | psql -U postgres -d mydb
# Backup over SSH tunnel — stream directly to backup server
pg_dump -Fc -U postgres -d mydb | ssh user@backup-server "cat > /backups/mydb.dump"
4.6 PostgreSQL 18 new options
PostgreSQL 18 added finer dump control: --sequence-data includes sequence data that was previously omitted by default; --statistics-only, --no-statistics, --no-data, and --no-schema let you control exactly what is exported; --no-policies excludes Row Level Security policies, useful when migrating to environments with different policy sets.
# Include sequence data (PostgreSQL 18+)
pg_dump -Fc --sequence-data -U postgres -d mydb -f mydb_with_seq.dump
# Statistics only (PostgreSQL 18+)
pg_dump --statistics-only -U postgres -d mydb -f stats_only.dump
# Exclude RLS policies (PostgreSQL 18+)
pg_dump -Fc --no-policies -U postgres -d mydb -f mydb_no_rls.dump
Always verify compatibility with your running PostgreSQL version against the official documentation before using these options.
5. pg_restore recipes
pg_restore restores custom, directory, and tar format archives. Plain SQL format is restored with psql.
5.1 Basic restore
# Full DB restore (database must already exist)
pg_restore -U postgres -d mydb /backups/mydb.dump
# Create database automatically during restore
pg_restore -U postgres -C -d postgres /backups/mydb.dump
# Stop immediately on error
pg_restore -U postgres -d mydb --exit-on-error /backups/mydb.dump
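When restoring over an existing database, another common pattern is to let pg_restore drop objects before recreating them. A sketch; this is destructive, so double-check the target first.
# Drop existing objects before recreating them (destructive; verify the target DB first)
pg_restore -U postgres -d mydb --clean --if-exists /backups/mydb.dump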
5.2 Parallel restore
Parallel restore works with both custom and directory format archives, regardless of whether parallel dump was used to create them.
# Restore with 4 parallel workers
pg_restore -U postgres -d mydb -j 4 /backups/mydb.dump
# Directory format parallel restore
pg_restore -U postgres -d mydb -j 4 -Fd /backups/mydb_dir/
Temporarily adjusting server parameters before restore can further improve speed.
-- Temporary settings before restore (must revert after completion)
-- Note: shared_buffers only changes after a server restart; pg_reload_conf() applies the rest
ALTER SYSTEM SET shared_buffers = '4GB';
ALTER SYSTEM SET work_mem = '32MB';
ALTER SYSTEM SET max_wal_size = '32GB';
ALTER SYSTEM SET checkpoint_timeout = '60min';
ALTER SYSTEM SET maintenance_work_mem = '2GB';
ALTER SYSTEM SET autovacuum = 'off';
SELECT pg_reload_conf();
Leaving autovacuum off risks table bloat and, eventually, transaction ID wraparound, so re-enable it immediately after the restore completes. Tune the values above to match your server specifications.
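A sketch of reverting those overrides once the restore finishes; shared_buffers only takes effect again after a restart:
-- Revert the temporary overrides after the restore completes
ALTER SYSTEM RESET shared_buffers;
ALTER SYSTEM RESET work_mem;
ALTER SYSTEM RESET max_wal_size;
ALTER SYSTEM RESET checkpoint_timeout;
ALTER SYSTEM RESET maintenance_work_mem;
ALTER SYSTEM RESET autovacuum;
SELECT pg_reload_conf();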
5.3 Selective restore
# Restore a specific table
pg_restore -U postgres -d mydb -t users /backups/mydb.dump
# Restore a specific schema
pg_restore -U postgres -d mydb -n analytics /backups/mydb.dump
# List archive contents without restoring
pg_restore --list /backups/mydb.dump
# Restore a specific section — rebuild indexes and constraints only
pg_restore -U postgres -d mydb --section=post-data /backups/mydb.dump
5.4 Plain SQL restore
# Basic restore
psql -U postgres -d mydb < backup.sql
# Stop immediately on error (recommended)
psql -U postgres -d mydb \
-v ON_ERROR_STOP=1 \
-X \
< backup.sql
# Restore from compressed file
gunzip -c backup.sql.gz | psql -U postgres -d mydb
ON_ERROR_STOP=1 makes psql halt at the first error instead of silently continuing; anything loaded before the failure still remains unless you also use -1 (--single-transaction) for an all-or-nothing load. Always set it for production restores.
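A sketch of an all-or-nothing plain-SQL load combining both flags (not usable if the script contains statements that cannot run inside a transaction block):
# Wrap the whole load in one transaction and stop at the first error
psql -U postgres -d mydb -1 -v ON_ERROR_STOP=1 -X -f backup.sql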
6. pg_dumpall recipes
pg_dumpall backs up the entire cluster, including roles and tablespaces. It is essential for server migrations or DR environments where pg_dump alone cannot recover global objects.
6.1 Basic usage
# Full cluster backup
pg_dumpall -U postgres -f /backups/cluster_$(date +%Y%m%d).sql
# Global objects only (roles + tablespaces, no DB content)
pg_dumpall -U postgres --globals-only -f /backups/globals_$(date +%Y%m%d).sql
# Roles only
pg_dumpall -U postgres --roles-only -f /backups/roles.sql
# Tablespaces only
pg_dumpall -U postgres --tablespaces-only -f /backups/tablespaces.sql
6.2 Restore
# Full cluster restore (requires superuser)
psql -U postgres -f /backups/cluster_20260414.sql
# Recommended hybrid: restore global objects first, then individual DBs with pg_restore
psql -U postgres -f /backups/globals_20260414.sql
pg_restore -U postgres -C -d postgres /backups/mydb.dump
pg_dumpall outputs plain SQL only, so parallel restore is not possible. For large clusters, the recommended approach is to save global objects separately with pg_dumpall --globals-only and dump individual databases in parallel with pg_dump -Fd -j N.
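A sketch of that hybrid flow for a cluster with several databases; the database names and worker count below are placeholders:
# Globals once, then each database as a parallel directory-format dump
pg_dumpall -U postgres --globals-only -f /backups/globals_$(date +%Y%m%d).sql
for db in sales analytics reporting; do
    pg_dump -Fd -j 4 -U postgres -d "$db" -f "/backups/${db}_$(date +%Y%m%d)_dir/"
done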
7. Password security — Using .pgpass
Hard-coding passwords in backup scripts is a security risk. The .pgpass file keeps credentials out of source code.
# ~/.pgpass format: hostname:port:database:username:password
echo "localhost:5432:mydb:backup_user:securepassword" >> ~/.pgpass
chmod 600 ~/.pgpass
# Now runs without a password prompt
pg_dump -U backup_user -d mydb -Fc -f mydb.dump
Environment variables are also an option — but always unset after use.
export PGPASSWORD="securepassword"
pg_dump -U backup_user -d mydb -Fc -f mydb.dump
unset PGPASSWORD
On some operating systems other users can read a process's environment, so PGPASSWORD is unsuitable as the default for long-running automation scripts. Prefer .pgpass or a connection service file (pg_service.conf).
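A minimal connection service file sketch, since pg_service.conf is mentioned above; the service name and values are placeholders, and the password can still live in .pgpass:
# ~/.pg_service.conf — named connection parameters
[mydb_backup]
host=localhost
port=5432
dbname=mydb
user=backup_user
# Pass the service as a connection string instead of spelling out connection flags
pg_dump "service=mydb_backup" -Fc -f mydb.dump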
8. Backup automation scripts
8.1 Daily backup script
#!/bin/bash
set -euo pipefail
DB_HOST="localhost"
DB_USER="backup_user"
DB_NAME="mydb"
BACKUP_DIR="/backups/postgresql"
RETENTION_DAYS=30
TIMESTAMP=$(date +%Y%m%d_%H%M%S)
BACKUP_FILE="${BACKUP_DIR}/${DB_NAME}_${TIMESTAMP}.dump"
LOG_FILE="/var/log/pg_backup.log"
mkdir -p "${BACKUP_DIR}"
log() {
echo "[$(date '+%Y-%m-%d %H:%M:%S')] $1" | tee -a "${LOG_FILE}"
}
log "Backup started: ${DB_NAME}"
if pg_dump -Fc \
-h "${DB_HOST}" \
-U "${DB_USER}" \
-d "${DB_NAME}" \
-f "${BACKUP_FILE}"; then
FILE_SIZE=$(du -sh "${BACKUP_FILE}" | cut -f1)
log "Backup complete: ${BACKUP_FILE} (${FILE_SIZE})"
else
log "ERROR: Backup failed!"
exit 1
fi
DELETED=$(find "${BACKUP_DIR}" -name "${DB_NAME}_*.dump" \
-mtime +${RETENTION_DAYS} -delete -print | wc -l)
log "Removed ${DELETED} expired backups (retention: ${RETENTION_DAYS} days)"
8.2 Cron registration
# Daily at 2 AM
0 2 * * * /usr/local/bin/pg_backup_daily.sh >> /var/log/pg_backup.log 2>&1
# Global objects weekly (Sunday at 1 AM)
0 1 * * 0 pg_dumpall -U postgres --globals-only \
-f /backups/globals_$(date +\%Y\%m\%d).sql
Automation scripts must include success/failure logging, a retention policy, and a recovery test plan. Verifying that a backup file was created is not enough.
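As a first-line check, the daily script above could at least confirm that the archive's table of contents is readable; a sketch to append after the dump step (a real drill still means restoring into a scratch database):
# Sanity check: a corrupt or truncated archive fails to list its contents
if pg_restore --list "${BACKUP_FILE}" > /dev/null; then
    log "Archive verification passed: ${BACKUP_FILE}"
else
    log "ERROR: Archive verification failed!"
    exit 1
fi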
9. Performance optimization for large databases
Each -j N parallel worker opens its own database connection (N + 1 total including the coordinator). Confirm that max_connections has sufficient headroom and monitor CPU, memory, and IOPS as you tune the worker count.
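A quick way to check that headroom before picking a worker count (a sketch; superuser or pg_monitor sees all sessions):
-- Connections still available relative to max_connections
SELECT current_setting('max_connections')::int - count(*) AS available_connections
FROM pg_stat_activity;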
# Optimized large-DB dump on an 8-core server
pg_dump \
-Fd \
-j 6 \
-Z 0 \
-U postgres \
-d large_db \
-f /backups/large_db_dir/
# Parallel restore
pg_restore \
-Fd \
-j 6 \
-U postgres \
-d large_db \
/backups/large_db_dir/
More parallelism does not always mean faster. When a database has few tables and one giant table dominates, parallel gains are limited. Compression tuning (-Z 0 vs default) depends on whether the bottleneck is CPU or I/O — measure in your environment before committing.
10. Limits of logical backup and when physical backup is required
Knowing when logical backup is the right tool — and when it is not — is as important as knowing how to use it.
Where logical backup excels
- Selective restore of specific tables or schemas
- PostgreSQL major-version upgrades (lower to higher version)
- Moving data to a different OS or hardware architecture
- Cloning data into dev or test environments
- Long-term portable snapshots
Where physical backup is required
- PITR (Point-in-Time Recovery)
- Fast full restore of large databases (disaster recovery, RTO targets)
- Initializing a Streaming Replication standby
- Environments with sub-minute RPO requirements
Relying on pg_dump alone for production backup coverage is risky. Treat logical backup as a complement to physical backup — not a substitute.
11. Closing — A backup that has never been tested is not a backup
pg_dump and pg_restore are easy to learn, but each option makes a meaningful difference when a real failure hits. Decide in advance on format, parallelism, and selective restore scope — then run actual recovery tests on a regular schedule.
"A backup you have never tested is not a backup."
The next part covers physical backup — pg_basebackup and WAL archiving — the foundation for production PITR.