Lv.2 BeginnerPostgreSQL

2026.05.1015 min readLv.2 Beginner

SeriesPostgreSQL Backup & Recovery Complete Guide · Part 1View series hub

PostgreSQL Backup & Recovery Part 1 — Backup Fundamentals and Strategy

A cron job running and files piling up does not mean your backup works — it means you haven't tested recovery yet. PostgreSQL offers three fundamentally different backup approaches: logical backup (pg_dump, pg_dumpall) for portability and selective restore, physical backup (pg_basebackup) for full-cluster disaster recovery, and WAL archiving to push RPO down to seconds. The right choice is not about picking a tool — it is about working backwards from your RPO and RTO. Part 1 maps the full strategy: the 3-2-1 principle, monitoring, and retention policy included.

Series overview

Part 1 — Backup Fundamentals and Strategy (you are here)

Part 2 — Logical Backup — pg_dump & pg_dumpall in Practice

Part 3 — Physical Backup — pg_basebackup and WAL Archiving

Part 4 — PITR (Point-in-Time Recovery) Implementation Guide

Part 5 — Enterprise Backup Tools Compared — pgBackRest vs Barman vs WAL-G

Part 6 — Backup Automation & Monitoring, and Recovery Drills

Introduction — The "we have a backup" illusion
The three PostgreSQL backup approaches
RPO & RTO — Where backup strategy begins
The 3-2-1 backup rule
The PostgreSQL backup ecosystem in 2026
The five most common backup mistakes
Choosing the right backup approach
Closing — Recoverability is the backup

1. Introduction — The "we have a backup" illusion

Many teams share a familiar experience.

"The cron job is running, files are showing up, the dashboard is green. We have a backup."

Yet when a real failure strikes, the difference between teams that recover successfully and teams that do not is not whether a backup file exists — it is whether the recovery procedure has been tested.

The PostgreSQL documentation makes this explicit.

Backups should be performed on a regular basis. Although the procedure is essentially simple, it is important to have a clear understanding of the underlying techniques and assumptions. — postgresql.org/docs/current/backup

In 2026, PostgreSQL underpins SaaS platforms, analytics pipelines, and high-availability systems at scale. The cost of a backup failure has grown accordingly. Part 1 reframes backup not as file creation but as verified recoverability, and lays out a complete strategic map.

2. The three PostgreSQL backup approaches

PostgreSQL offers three fundamentally different backup methods. Each solves a different problem, and no single method covers every scenario.

2.1 Logical Backup

Logical backup extracts data as SQL statements. The primary tools are pg_dump and pg_dumpall.

Key characteristics

Selective backup and restore at the table or schema level
Excellent portability across PostgreSQL versions (lower → higher version migration)
Slow on large databases
Not suitable for full-cluster disaster recovery (roles, tablespaces not included in pg_dump)

# Single database backup (custom format — supports parallel restore)
pg_dump -U postgres -d mydb -Fc -f mydb.dump

# Full cluster (includes roles and tablespaces)
pg_dumpall -U postgres > full_cluster.sql

2.2 Physical Backup

Physical backup copies the entire PostgreSQL data directory at the file-system level. pg_basebackup is the built-in tool.

Key characteristics

Fast even for large databases
Optimal for full-cluster recovery (disaster recovery scenarios)
The required foundation for PITR (Point-in-Time Recovery)
Cannot restore individual tables or objects
Only compatible within the same PostgreSQL major version

# Physical backup with WAL streaming and recovery config auto-generated
pg_basebackup -h localhost -U replicator \
  -D /backups/basebackup \
  -Fp -Xs -P -R

2.3 Continuous Archiving (WAL Archiving)

Continuous archiving copies WAL (Write-Ahead Log) files as they are generated, enabling recovery to any arbitrary point in time. It is the core mechanism behind PITR.

Key characteristics

Recovery to any point in time regardless of the backup schedule
Can bring RPO down to seconds
Must be used in combination with a physical base backup
High operational complexity

# postgresql.conf — enable WAL archiving
archive_mode = on
archive_command = 'cp %p /mnt/wal-archive/%f'
# For cloud environments:
# archive_command = 'wal-g wal-push %p'

Approach comparison

Approach	Primary tool	Strengths	Weaknesses
Logical backup	pg_dump, pg_dumpall	Portability, selective restore	Slow on large DBs, not for full-cluster recovery
Physical backup	pg_basebackup	Fast full-cluster recovery, PITR foundation	Same major version only, no object-level restore
WAL archiving	archive_command	RPO in seconds, core PITR mechanism	High operational complexity

3. RPO & RTO — Where backup strategy begins

Before choosing a tool, two concepts must be defined.

Concept	Definition	Example
RPO (Recovery Point Objective)	Maximum tolerable data loss, expressed as time	RPO = 1 hour → backups must run at least hourly
RTO (Recovery Time Objective)	Maximum tolerable time to restore service after a failure	RTO = 4 hours → service must be back within 4 hours

These two targets determine the backup method, frequency, and tooling. The correct sequence is to derive the approach from RPO/RTO first, not to pick a tool and work backwards.

RPO ≥ 24 hours?  → daily pg_dump is sufficient
RPO ≤ 1 hour?    → incremental backups or WAL archiving required
RPO = seconds?   → continuous WAL archiving (PITR) is mandatory

Replicas are not backups. A replica improves availability, but if data is accidentally deleted on the primary, that deletion replicates to the replica immediately. From an RPO perspective, a replica cannot substitute for a backup.

4. The 3-2-1 backup rule

The 3-2-1 rule is the industry-standard backup retention framework for production PostgreSQL environments.

Number	Meaning
3	Maintain 3 copies of data (including the original)
2	Store on 2 different media or storage types
1	Keep 1 copy off-site

Real-world application

Production database (original)
Local NAS or a separate server (physical separation)
AWS S3 / GCS / Azure Blob (geographic separation)

This principle covers hardware failure, accidental deletion, and physical disasters such as fire or flooding. A backup configuration that does not meet all three conditions leaves specific failure scenarios uncovered.

5. The PostgreSQL backup ecosystem in 2026

As of PostgreSQL 18, with version 17 still widely used in production estates, the backup tooling ecosystem is mature.

Built-in tools

Tool	Type	Primary use
pg_dump	Logical	Single-database selective backup
pg_dumpall	Logical	Full cluster including roles
pg_basebackup	Physical	Full-cluster physical backup, PITR foundation
pg_restore	Recovery	Restore custom/directory format dumps

Key open-source third-party tools

Tool	Notable features	Recommended environment
pgBackRest	Parallel processing, incremental backup, encryption, cloud integration	Large-scale production, on-premises or cloud
Barman	Centralized multi-server management, Python-based	Enterprise, managing many PostgreSQL instances
WAL-G	Cloud-native, direct archiving to S3/GCS/Azure	Cloud-first environments
Databasus	Web UI, Docker, scheduling automation	DevOps teams including non-technical operators

There is no universally "best" tool. The right choice depends on data scale, RPO/RTO targets, cloud vs. on-premises environment, and available operational expertise.

6. The five most common backup mistakes

❌ Mistake 1: Recovery has never been tested

A backup file that fails during restore is equivalent to having no backup at all. Periodic recovery testing must be built into the backup process itself — not deferred to "when something breaks." Schedule a full recovery drill at least once per quarter.

❌ Mistake 2: Treating a replica as a backup

A replica improves availability but does not replace a backup. If data is accidentally deleted on the primary, that deletion propagates to the replica immediately. Replicas and backups serve different roles.

❌ Mistake 3: Assuming pg_dump is enough

pg_dump captures only a logical snapshot of a single database. Cluster-level configuration such as roles and tablespaces is not included. Disaster recovery scenarios that require full-cluster restore need pg_dumpall or a physical backup.

❌ Mistake 4: No backup monitoring

Without alerts, you cannot know whether a cron job failed or whether WAL archiving is falling behind. Every backup process must include success/failure monitoring. The most common pattern is discovering that backups have been silently failing for days — at the exact moment of a failure.

❌ Mistake 5: No retention policy

Retaining old backups indefinitely causes storage costs to spiral; too short a retention window means a logical error discovered late cannot be recovered. Define explicit retention periods (e.g., 30 days daily, 3 months weekly, 1 year monthly) as policy.

7. Choosing the right backup approach

The selection flow below is driven by RPO/RTO targets and deployment environment.

In practice, combining logical and physical backup is the recommended approach rather than relying on a single tool.

Physical backup → fast full-cluster recovery (disaster recovery)
Logical backup → portability and selective restore (table-level, cross-version migration)

The two methods solve different problems, so running both widens coverage.

8. Closing — Recoverability is the backup

"The purpose of a backup is not to create a copy. It is to return your data to the state you need, at the moment you need it, within a predictable window of time."

A backup strategy begins with defining recovery scenarios, not selecting tools. Define RPO and RTO first, design a 3-2-1 retention structure, sustain it with monitoring and a retention policy, and validate the whole chain with regular recovery drills. Only when all four are in place can you genuinely say "we have a backup."

Part 2 covers pg_dump and pg_dumpall in depth — from format selection to automation.

PostgreSQL Backup & Recovery Part 1 — Backup Fundamentals and Strategy

Table of Contents

1. Introduction — The "we have a backup" illusion

2. The three PostgreSQL backup approaches

2.1 Logical Backup

2.2 Physical Backup

2.3 Continuous Archiving (WAL Archiving)

3. RPO & RTO — Where backup strategy begins

4. The 3-2-1 backup rule

5. The PostgreSQL backup ecosystem in 2026

Built-in tools

Key open-source third-party tools

6. The five most common backup mistakes

❌ Mistake 1: Recovery has never been tested

❌ Mistake 2: Treating a replica as a backup

❌ Mistake 3: Assuming pg_dump is enough

❌ Mistake 4: No backup monitoring

❌ Mistake 5: No retention policy

7. Choosing the right backup approach

8. Closing — Recoverability is the backup

References

Share This Article

PostgreSQL Backup & Recovery Complete Guide

Follow new posts via RSS