Lv.3 IntermediatePostgreSQL

2026.05.1822 min readLv.3 Intermediate

SeriesPostgreSQL Backup & Recovery Complete Guide · Part 4View series hub

PostgreSQL Backup & Recovery Part 4 — PITR (Point-in-Time Recovery) Implementation Guide

PITR is not a setting to flip — it is an operational procedure that only succeeds when a base backup, a continuous WAL archive, a recovery target, and a verification step all align. Recovery target selection (from recovery_target_time to named restore points) determines recovery quality, and data verification in the pause state must come before service switchover. PITR reliability is determined by routine WAL archiving monitoring and regular recovery drills.

Series overview

Part 1 — Backup Fundamentals and Strategy

Part 2 — Logical Backup — pg_dump & pg_dumpall in Practice

Part 3 — Physical Backup — pg_basebackup and WAL Archiving

Part 4 — PITR (Point-in-Time Recovery) Implementation Guide (you are here)

Part 5 — Enterprise Backup Tools Compared — pgBackRest vs Barman vs WAL-G

Part 6 — Backup Automation & Monitoring, and Recovery Drills

Introduction — Why PITR is a time machine
PITR prerequisites checklist
Recovery target parameters explained
PITR step-by-step guide
Recovery target comparison
Timeline — The key concept behind PITR
Scenario-based recovery guides
Common PITR errors and fixes
Recovery drills — You have to actually do it
Closing

1. Introduction — Why PITR is a time machine

Part 3 covered how to configure pg_basebackup and WAL archiving. This part is about the moment all that preparation actually saves your data.

Imagine this scenario.

Situation: At 2:35 PM today, a developer accidentally ran
           DELETE on a critical table with no WHERE clause.

Last full backup:      Yesterday at 1 AM
Last WAL archived:     2 minutes ago (2:33 PM)

Question: Can we recover to the state at 2:34:59 PM?
Answer:   If WAL archiving is configured — yes.

That is PITR (Point-in-Time Recovery): replaying a base backup and the WAL accumulated after it, in order, to roll the database back to any point in the past.

2. PITR prerequisites checklist

All of the following must be in place before a PITR attempt.

□ wal_level = replica (or logical)
□ archive_mode = on
□ archive_command configured and verified working
□ Base backup created with pg_basebackup
□ Continuous WAL archive from base backup time to recovery target time
□ backup_manifest file present (PostgreSQL 13+, for pg_verifybackup)

The most common cause of PITR failure is a gap in the WAL archive. If archiving failures go undetected, the WAL chain will have holes exactly when you need it most, making it impossible to replay to the target point. That is why pg_stat_archiver monitoring, covered in Part 3, matters.

3. Recovery target parameters explained

PostgreSQL supports several ways to specify how far to replay WAL. Only one recovery target can be active — if multiple are set, the last one wins.

3.1 Time-based — recovery_target_time (most common)

# Recover to a specific timestamp (include timezone)
recovery_target_time = '2026-04-14 14:34:59 Asia/Seoul'

# UTC
recovery_target_time = '2026-04-14 05:34:59 UTC'

The most intuitive option and the one used most often in practice. You can specify the recovery point in human terms — "one minute before the incident."

3.2 Transaction ID-based — recovery_target_xid

# Recover up to just before a specific transaction ID
recovery_target_xid = '1234567'

Useful when you know the exact transaction ID that caused the incident. Replaying up to just before that transaction precisely excludes the corrupted change.

-- Find transaction ID in PostgreSQL logs after an incident
-- LOG: transaction 1234567 at ...
SELECT xmin, xmax, * FROM pg_class LIMIT 10;

3.3 LSN-based — recovery_target_lsn

# Recover to a specific WAL physical position
recovery_target_lsn = '0/15D68C50'

Finer-grained than a timestamp, but requires knowing the LSN value in advance.

-- Check current LSN
SELECT pg_current_wal_lsn();

3.4 Named restore point — recovery_target_name

-- Create a restore point before any risky operation
SELECT pg_create_restore_point('before_schema_migration_v2');
SELECT pg_create_restore_point('before_bulk_delete');

# Recover to that named restore point
recovery_target_name = 'before_schema_migration_v2'

Named restore points are far more readable than timestamps or XIDs. Make it a habit to create one before any large schema change, batch delete, or migration. pg_create_restore_point() has no meaningful overhead.

3.5 Immediate — recovery_target = 'immediate'

# Recover only to the point where the base backup became consistent
recovery_target = 'immediate'

Stops as soon as the base backup reaches a consistent state — no additional WAL is replayed.

3.6 recovery_target_action — behavior after recovery

One of the most important parameters: what the server does once it reaches the recovery target.

Value	Behavior	When to use
`pause` (default)	Pauses recovery, allows read connections	When you want to verify data before deciding
`promote`	Completes recovery, switches to operational mode immediately	When you want to resume service without manual verification
`shutdown`	Shuts down after recovery	When you need to preserve a snapshot post-recovery

# Recommended: pause first, verify data, then decide whether to promote
recovery_target_action = 'pause'

From the paused state:

-- Data looks correct → promote to operational mode
SELECT pg_wal_replay_resume();

-- Recovery point is wrong → stop, adjust target time, retry

4. PITR step-by-step guide

Scenario: a table was dropped at 2:35 PM on 2026-04-14. The goal is to recover to the state at 2:34:59 PM.

Step 1. Assess the situation and prepare for recovery

# Stop the service immediately (prevent further changes)
sudo systemctl stop postgresql

# Preserve the damaged data directory (do not delete — may be needed later)
mv /var/lib/postgresql/17/main \
   /var/lib/postgresql/17/main_DAMAGED_$(date +%Y%m%d_%H%M%S)

# Create a fresh data directory for recovery
mkdir /var/lib/postgresql/17/main
chown postgres:postgres /var/lib/postgresql/17/main

Preserving the damaged directory rather than deleting it immediately is important. You may need to reference it during recovery or for forensic investigation.

Step 2. Restore the base backup

# For tar-format backups
BACKUP_DIR="/backups/basebackup/20260414_010000"

# Extract base.tar.gz into the data directory
sudo -u postgres tar -xzf "${BACKUP_DIR}/base.tar.gz" \
  -C /var/lib/postgresql/17/main

# Create pg_wal directory if not included in the tar
mkdir -p /var/lib/postgresql/17/main/pg_wal

# For plain-format backups
# rsync -a "${BACKUP_DIR}/" /var/lib/postgresql/17/main/

Step 3. Verify WAL archive continuity

Confirm the WAL archive is continuous from the base backup through the recovery target. A gap in the chain will stop recovery.

# Find the first required WAL segment from backup_label
cat /var/lib/postgresql/17/main/backup_label
# START WAL LOCATION: 0/3000028 (file 000000010000000000000003)
# START TIME: 2026-04-14 01:00:05 KST

# Confirm that segment exists in the archive
ls /var/lib/postgresql/archive/ | grep "^000000010000000000000003"

# Check the most recent WAL files in the archive
ls -lht /var/lib/postgresql/archive/ | head -10

Step 4. Create recovery.signal and set recovery parameters

Since PostgreSQL 12, recovery.conf no longer exists. Create a recovery.signal file in the data directory and write recovery parameters into postgresql.conf or postgresql.auto.conf.

# Create recovery.signal (content can be empty)
sudo -u postgres touch /var/lib/postgresql/17/main/recovery.signal

# Append recovery parameters to postgresql.auto.conf
sudo -u postgres tee -a /var/lib/postgresql/17/main/postgresql.auto.conf << 'EOF'

# PITR recovery settings
# Command to fetch WAL files from the archive
restore_command = 'cp /var/lib/postgresql/archive/%f %p'

# Recovery target (just before the incident)
recovery_target_time = '2026-04-14 14:34:59 Asia/Seoul'

# Pause after reaching the target to allow data verification
recovery_target_action = 'pause'
EOF

On PostgreSQL 12 and later, recovery.conf is not recognized. Use recovery.signal and postgresql.auto.conf instead.

Step 5. Start recovery and monitor progress

# Start PostgreSQL — it enters recovery mode automatically
sudo systemctl start postgresql

# Monitor recovery progress in real time
sudo tail -f /var/log/postgresql/postgresql-17-main.log

Normal progress messages in the log:

LOG:  starting point-in-time recovery to 2026-04-14 14:34:59+09
LOG:  restored log file "000000010000000000000003" from archive
LOG:  redo starts at 0/3000028
LOG:  consistent recovery state reached at 0/3000100
LOG:  restored log file "000000010000000000000047" from archive
LOG:  recovery stopping before commit of transaction 1234568,
      time 2026-04-14 14:35:02.341+09
LOG:  pausing at the end of recovery
HINT: Execute pg_wal_replay_resume() to promote.

Check status during recovery:

-- Confirm the server is in recovery mode
SELECT pg_is_in_recovery();
-- t (true = recovery mode)

-- Check current replay position
SELECT pg_last_wal_replay_lsn(), pg_last_xact_replay_timestamp();

Step 6. Verify data — the most critical step

With recovery paused at recovery_target_action = 'pause', verify that the data was restored correctly. If the recovery point is wrong, this is where you catch it.

-- Confirm the dropped table exists again
\dt public.*

-- Confirm data is present
SELECT COUNT(*) FROM orders;
SELECT MAX(created_at) FROM orders;

-- Check that the latest records are as expected
SELECT * FROM orders ORDER BY created_at DESC LIMIT 5;

Decision tree based on results:

✅ Data looks correct
   → Proceed to Step 7

❌ Recovery point too early (data missing)
   → Stop server, increase recovery_target_time, retry

❌ Recovery point too late (incident data included)
   → Stop server, decrease recovery_target_time, retry

When you need to adjust the target time:

# Stop the server
sudo systemctl stop postgresql

# Edit recovery_target_time in postgresql.auto.conf
nano /var/lib/postgresql/17/main/postgresql.auto.conf

# Re-create recovery.signal (auto-deleted when recovery completes)
sudo -u postgres touch /var/lib/postgresql/17/main/recovery.signal

# Restore the base backup again and retry

Step 7. Complete recovery — switch to operational mode

Once data verification passes, finalize recovery and switch to operational mode.

-- Promote to operational mode
SELECT pg_wal_replay_resume();

# Confirm recovery.signal has been removed
ls /var/lib/postgresql/17/main/recovery.signal
# ls: cannot access ...: No such file or directory

# Confirm operational mode
psql -U postgres -c "SELECT pg_is_in_recovery();"
# f (false = normal operational mode)

Step 8. Required post-recovery tasks

-- Reset sequences (recovery may have rolled them back to past values)
SELECT setval('orders_id_seq', (SELECT MAX(id) FROM orders));

-- Refresh statistics
VACUUM ANALYZE;

# Take a new base backup immediately
# A new timeline was created after recovery — a new backup is mandatory
pg_basebackup -h localhost -U backup_user \
  -D /backups/basebackup/$(date +%Y%m%d_%H%M%S)_post_recovery \
  -Ft -Xs -P --checkpoint=fast

After a PITR recovery, PostgreSQL starts a new timeline. WAL from the previous timeline cannot track changes on the new one, so the recovery point must become the new backup baseline.

5. Recovery target comparison

Situation	Recommended target	Example
"Recover to just before a specific time"	`recovery_target_time`	`'2026-04-14 14:34:59 Asia/Seoul'`
"Recover before a specific transaction"	`recovery_target_xid`	`'1234567'`
"Roll back a failed migration"	`recovery_target_name`	`'before_migration_v3'`
"Recover to just after the backup"	`recovery_target = 'immediate'`	`recovery_target = 'immediate'`
"Recover to the latest possible state"	(no target specified)	No parameter — replay all available WAL

6. Timeline — The key concept behind PITR

Understanding timelines is essential for PITR, especially in complex recovery scenarios.

When you resume operations after a PITR recovery, PostgreSQL starts a new timeline. This means multiple recovery attempts from the same base backup do not interfere with each other.

# Check timeline history files in the WAL archive
ls /var/lib/postgresql/archive/*.history
# Example: 00000002.history ← Timeline 2 was created

# Inspect the history file
cat /var/lib/postgresql/archive/00000002.history
# 1    0/3F000000    no recovery target specified
# → Replayed through Timeline 1 up to LSN 0/3F000000, then started Timeline 2

7. Scenario-based recovery guides

Scenario A: Accidental table DROP

restore_command = 'cp /var/lib/postgresql/archive/%f %p'
recovery_target_time = '2026-04-14 14:34:59 Asia/Seoul'
recovery_target_action = 'pause'

Scenario B: Rolling back a failed schema migration

-- Run this before every risky migration
SELECT pg_create_restore_point('before_v3_migration');

restore_command = 'cp /var/lib/postgresql/archive/%f %p'
recovery_target_name = 'before_v3_migration'
recovery_target_action = 'pause'

Scenario C: Recover to the latest available state (replay all WAL)

# No recovery target — replay until WAL is exhausted
restore_command = 'cp /var/lib/postgresql/archive/%f %p'
recovery_target_action = 'promote'

Scenario D: Create a verification copy without production downtime

Restore a copy to a separate server, verify data there, extract only what you need, and apply it to the production database. This is the safest approach — production stays online.

# 1. Restore base backup on a separate server
mkdir /restore/verify_db
tar -xzf /backups/basebackup/20260414_010000/base.tar.gz \
  -C /restore/verify_db

# 2. Configure recovery
touch /restore/verify_db/recovery.signal
cat >> /restore/verify_db/postgresql.auto.conf << 'EOF'
restore_command = 'cp /var/lib/postgresql/archive/%f %p'
recovery_target_time = '2026-04-14 14:34:59 Asia/Seoul'
recovery_target_action = 'pause'
port = 5433
EOF

# 3. Start the recovery instance
pg_ctl -D /restore/verify_db start

# 4. Extract the recovered table
pg_dump -h localhost -p 5433 -U postgres -t orders \
  -Fc -f /tmp/orders_recovered.dump

# 5. Apply the extracted data to the production DB selectively
pg_restore -h localhost -p 5432 -U postgres \
  -d production_db -t orders \
  --data-only /tmp/orders_recovered.dump

8. Common PITR errors and fixes

Error 1: WAL file not found in archive

FATAL: could not find file "000000010000000000000047" in archive

Cause: the WAL file is missing from the archive, or restore_command is incorrect.

# Test restore_command manually
cp /var/lib/postgresql/archive/000000010000000000000047 /tmp/test_wal
echo $?   # must be 0

# Check if the file exists in the archive
ls -la /var/lib/postgresql/archive/ | grep "000000010000000000000047"

Error 2: Recovery target time is beyond the archive range

FATAL: recovery ended before configured recovery target was reached

Cause: recovery_target_time is set to a point after the last WAL in the archive.

# Inspect the last WAL file in the archive
pg_waldump -p /var/lib/postgresql/archive \
  $(ls /var/lib/postgresql/archive | grep -v history | tail -1) \
  | tail -5

Error 3: recovery.conf not recognized (PostgreSQL 12+)

WARNING: ignoring recovery.conf; use postgresql.conf or
         postgresql.auto.conf for recovery parameters

Fix: remove recovery.conf, create recovery.signal, and write parameters into postgresql.auto.conf.

Error 4: Duplicate key errors after recovery

ERROR: duplicate key value violates unique constraint "orders_pkey"

Cause: sequences rolled back to past values after recovery, causing conflicts on new INSERTs.

-- Reset each sequence to the current MAX value
SELECT setval(pg_get_serial_sequence('orders', 'id'),
              (SELECT MAX(id) FROM orders));

9. Recovery drills — You have to actually do it

Running a PITR for the first time during a real incident guarantees mistakes. A restore drill at least once per quarter is strongly recommended.

Drill checklist:

□ Base backup restored successfully in an isolated environment
□ PITR completed to the target time successfully
□ Data integrity verified after recovery (row counts, latest record timestamps)
□ recovery.signal file confirmed deleted after completion
□ Total recovery time measured (RTO verification)
□ New base backup created immediately after recovery
□ Recovery runbook updated

10. Closing — Only a prepared recovery is a real recovery

PITR is not a nice-to-have feature — it is the mandatory safety net for production PostgreSQL operations.

Three principles of readiness:

Monitor WAL archiving continuously — silent failures are the most dangerous
Create a restore point before every risky operation — pg_create_restore_point() costs nothing
Practice recovery regularly — the procedure must become muscle memory

"Disasters arrive without warning. Recovery succeeds only for those who prepared."

The next part compares the three enterprise backup tools that go beyond what the built-in utilities offer — pgBackRest, Barman, and WAL-G.