pg_probackup — manage backup and recovery of Postgres Pro database clusters
pg_probackup init -B backupdir
pg_probackup add-instance -B backupdir -D datadir --instance instance_name
pg_probackup del-instance -B backupdir --instance instance_name
pg_probackup set-config -B backupdir --instance instance_name [option...]
pg_probackup show-config -B backupdir --instance instance_name
pg_probackup backup -B backupdir --instance instance_name -b backup_mode [option...]
pg_probackup restore -B backupdir --instance instance_name [option...]
pg_probackup validate -B backupdir [option...]
pg_probackup show -B backupdir [option...]
pg_probackup delete -B backupdir --instance instance_name { -i backup_id | --wal | --expired }
pg_probackup archive-push -B backupdir --instance instance_name --wal-file-path %p --wal-file-name %f
pg_probackup archive-get -B backupdir --instance instance_name --wal-file-path %p --wal-file-name %f
pg_probackup version
pg_probackup help [command]
pg_probackup is a utility to manage backup and recovery of Postgres Pro database clusters. It is designed to perform periodic backups of the Postgres Pro instance that enable you to restore the server in case of a failure. pg_probackup supports Postgres Pro 9.5 or higher.
As compared to other backup solutions, pg_probackup offers the following benefits that can help you implement different backup strategies and deal with large amounts of data:
Choosing between full and page-level incremental backups to speed up backup and recovery
Implementing a single backup strategy for multi-server Postgres Pro clusters
Automatic data consistency checks and on-demand backup validation without actual data recovery
Managing backups in accordance with retention policy
Running backup, restore, and validation processes on multiple parallel threads
Storing backup data in a compressed state to save disk space
Taking backups from a standby server to avoid extra load on the master server
Extended logging settings
Custom commands to simplify WAL archiving
To manage backup data, pg_probackup creates a backup catalog. This directory stores all backup files with additional meta information, as well as WAL archives required for point-in-time recovery. You can store backups for different instances in separate subdirectories of a single backup catalog.
Using pg_probackup, you can take full or incremental backups:
Full backups contain all the data files required to restore the database cluster from scratch.
Incremental backups only store the data that has changed since the previous backup. This reduces the backup size and speeds up backup operations. pg_probackup supports the following modes of incremental backups:
PAGE backup. In this mode, pg_probackup
scans all WAL files in the archive from the moment the previous
full or incremental backup was taken. Newly created
backups contain only the pages that were mentioned in
WAL records. This requires all the WAL
files since the previous backup to be present in the WAL
archive. If the size of these files is comparable
to the total size of the database cluster files, speedup is
smaller, but the backup still takes less space.
DELTA backup. In this mode,
pg_probackup reads all data files
in the data directory and copies only those pages that have changed
since the previous backup. Continuous archiving is not necessary
for this mode to operate. Note that this mode can impose read-only
I/O pressure equal to a full backup.
PTRACK backup. In this mode,
Postgres Pro tracks page changes
on the fly. Continuous archiving is not necessary for it to
operate. Each time a relation page is updated, this page is
marked in a special PTRACK bitmap for this relation. As one
page requires just one bit in the PTRACK fork, such bitmaps
are quite small. Tracking implies some minor overhead on the database
server operation, but speeds up incremental backups significantly.
Regardless of the chosen backup type, all backups taken with pg_probackup support the following archiving strategies:
Autonomous backups include all the files required to restore the cluster to a consistent state at the time the backup was taken. Even if continuous archiving is not set up, the required WAL segments are included in the backup.
Archive backups rely on continuous archiving. Such backups enable cluster recovery to an arbitrary point after the backup was taken (point-in-time recovery).
pg_probackup currently has the following limitations:
Creating backups from a remote server is currently not supported.
The server from which the backup was taken and the restored server must have compatible block_size and wal_block_size values and the same major release number.
The Microsoft Windows operating system is not supported.
Configuration files outside of the Postgres Pro data directory are not included in the backup and must be backed up separately.
The pg_probackup package is provided as part of the Postgres Pro distribution. Once you have pg_probackup installed, complete the following setup:
pg_probackup stores all WAL and backup files in the corresponding subdirectories of the backup catalog.
To initialize the backup catalog, run the following command:
pg_probackup init -B backupdir
where backupdir is the backup catalog. If the
backupdir already exists, it must be
empty. Otherwise, pg_probackup returns an
error.
pg_probackup creates the backupdir backup
catalog, with the following subdirectories:
wal/ — directory for WAL
files.
backups/ — directory for backup files.
Once the backup catalog is initialized, you can add a new backup instance.
pg_probackup can store backups for multiple database clusters in a single backup catalog. To set up the required subdirectories, you must add a backup instance to the backup catalog for each database cluster you are going to back up.
To add a new backup instance, run the following command:
pg_probackup add-instance -B backupdir -D datadir --instance instance_name
where:
datadir is the data directory of the
cluster you are going to back up. To set up and use
pg_probackup, write access to this
directory is required.
instance_name is the name of the subdirectories
that will store WAL and backup files for this cluster.
pg_probackup creates the instance_name subdirectories under the
backups/ and wal/ directories of the backup catalog.
The backups/instance_name directory
contains the pg_probackup.conf configuration file that controls backup and restore
settings for this backup instance.
For details on how to fine-tune pg_probackup configuration,
see the section called “Configuring pg_probackup”.
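The layout described above can be sketched as a small shell helper that prints the paths expected for one backup instance. The function name and the sample catalog path and instance name are hypothetical and used only for illustration; the helper does not call pg_probackup.

```shell
# Print the paths that add-instance is described as creating for one instance.
# $1: backup catalog path, $2: instance name (both hypothetical here).
instance_paths() {
    backupdir=$1
    instance=$2
    printf '%s\n' \
        "$backupdir/backups/$instance" \
        "$backupdir/wal/$instance" \
        "$backupdir/backups/$instance/pg_probackup.conf"
}

instance_paths /mnt/backups main
```

Comparing this list with the actual contents of the catalog is a quick way to check that add-instance completed as expected.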
The backup catalog must belong to the file
system of the database server. The user launching
pg_probackup must have full access to the
contents of the backup catalog. If you specify the path to the
backup catalog in the BACKUP_PATH environment variable,
you can omit the corresponding option when running pg_probackup
commands.
Since pg_probackup uses a regular PostgreSQL
connection and the replication protocol,
pg_probackup commands require
connection options. To
avoid specifying these options each time on the command line, you can set
them in the pg_probackup.conf configuration
file using the set-config command.
For details, see the section called “Configuring pg_probackup”.
Although pg_probackup can be used by a superuser,
it is recommended to create a separate user or role with the minimum
permissions required for the chosen backup strategy. In these configuration instructions, the
backup role is used as an example.
To enable backups, the following rights are required:
CREATE ROLE backup WITH LOGIN;
GRANT USAGE ON SCHEMA pg_catalog TO backup;
GRANT EXECUTE ON FUNCTION current_setting(text) TO backup;
GRANT EXECUTE ON FUNCTION pg_is_in_recovery() TO backup;
GRANT EXECUTE ON FUNCTION pg_start_backup(text, boolean, boolean) TO backup;
GRANT EXECUTE ON FUNCTION pg_stop_backup() TO backup;
GRANT EXECUTE ON FUNCTION pg_stop_backup(boolean) TO backup;
GRANT EXECUTE ON FUNCTION pg_create_restore_point(text) TO backup;
GRANT EXECUTE ON FUNCTION pg_switch_xlog() TO backup;
GRANT EXECUTE ON FUNCTION txid_current() TO backup;
GRANT EXECUTE ON FUNCTION txid_current_snapshot() TO backup;
GRANT EXECUTE ON FUNCTION txid_snapshot_xmax(txid_snapshot) TO backup;
Depending on whether you are going to use autonomous or archive backup
strategies, Postgres Pro cluster configuration
will differ, as specified in the sections below. To back up the database cluster from a standby
server or create PTRACK backups,
additional setup is required. For details, see the section called “PTRACK Backup”
and the section called “Backup from Standby”.
To set up the cluster for autonomous backups, complete the following steps:
Grant the REPLICATION privilege
to the backup role:
ALTER ROLE backup WITH REPLICATION;
In the pg_hba.conf file, allow
replication on behalf of the backup role.
Modify the postgresql.conf configuration
file of the Postgres Pro server, as follows:
Make sure the max_wal_senders
parameter is set high enough to leave at least one
session available for the backup process.
Set the wal_level parameter to be
replica or higher.
To set up the cluster for archive backups, complete the following steps:
Configure the following parameters in
postgresql.conf to enable continuous
archiving on the Postgres Pro server:
Make sure the wal_level parameter is set to
replica or higher.
Set archive_mode to
on.
Set the archive_command variable, as follows:
archive_command = 'pg_probackup archive-push -B backupdir --instance instance_name --wal-file-path %p --wal-file-name %f'
where backupdir and instance_name refer to the already initialized
backup catalog instance for this database cluster.
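As a minimal sketch, the archive_command line can be generated from the catalog path and instance name, which avoids typing the %p and %f placeholders by hand. The function name and the sample values /mnt/backups and main are assumptions for illustration; the printed line still has to be placed in postgresql.conf manually.

```shell
# Print an archive_command setting for postgresql.conf.
# $1: backup catalog path, $2: instance name (sample values are hypothetical).
# %%p and %%f in the format string are emitted as the literal %p and %f
# placeholders that the server substitutes at archiving time.
archive_command_line() {
    printf "archive_command = 'pg_probackup archive-push -B %s --instance %s --wal-file-path %%p --wal-file-name %%f'\n" "$1" "$2"
}

archive_command_line /mnt/backups main
```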
For Postgres Pro 9.6 or higher, pg_probackup can take backups from a standby server. This requires the following additional setup:
On the standby server, allow replication connections:
Set the max_wal_senders and
hot_standby parameters in
postgresql.conf.
Configure host-based authentication in
pg_hba.conf.
On the master server, enable
full_page_writes in
postgresql.conf.
Archive backup from the standby server has the following limitations:
If the standby is promoted to the master during archive backup, the backup fails.
All WAL records required
for the backup must contain sufficient full-page writes. This
requires you to enable full_page_writes on
the master, and not to use a tool like
pg_compresslog as
archive_command to remove full-page writes
from WAL files.
PTRACK Backup
If you are going to use PTRACK backups, complete the following
additional steps:
In postgresql.conf, set
ptrack_enable to on.
Grant the rights to execute ptrack
functions to the backup role:
GRANT EXECUTE ON FUNCTION pg_ptrack_clear() TO backup;
GRANT EXECUTE ON FUNCTION pg_ptrack_get_and_clear(oid, oid) TO backup;
The backup role must have access to all the
databases of the cluster.
This section describes pg_probackup commands. Some commands require mandatory options and can take additional options. For detailed descriptions, see the section called “Options”.
init
Syntax:
pg_probackup init -B backupdir
Initializes the backupdir backup catalog that will store
backup copies, WAL archive, and meta information for the backed up
database clusters.
If the specified backupdir already exists, it must be empty. Otherwise,
pg_probackup displays a corresponding error message.
add-instance
Syntax:
pg_probackup add-instance -B backupdir -D datadir --instance instance_name
Initializes a new backup instance inside the backup catalog backupdir and generates the
pg_probackup.conf configuration file that
controls backup and restore settings for the cluster with the specified datadir data directory.
For details, see the section called “Adding a New Backup Instance”.
del-instance
Syntax:
pg_probackup del-instance -B backupdir --instance instance_name
Deletes all backup and WAL files associated with the specified instance.
set-config
Syntax:
pg_probackup set-config -B backupdir --instance instance_name [--log-level-console=log_level] [--log-level-file=log_level] [--log-filename=log_filename] [--error-log-filename=error_log_filename] [--log-directory=log_directory] [--log-rotation-size=log_rotation_size] [--log-rotation-age=log_rotation_age] [-d dbname] [-h host] [-p port] [-U username] [--master-db=dbname] [--master-host=host] [--master-port=port] [--master-user=username] [--retention-redundancy=redundancy] [--retention-window=window] [--replica-timeout=timeout]
Adds the specified connection,
retention, logging or
replica settings into the pg_probackup.conf
configuration file, or modifies the previously defined values.
show-config
Syntax:
pg_probackup show-config -B backupdir --instance instance_name
Displays the contents of the pg_probackup.conf
configuration file located in the backupdir/backups/instance_name directory.
To edit backupdir/backups/instance_name/pg_probackup.conf, use the set-config command.
Editing pg_probackup.conf directly is not allowed.
backup
Syntax:
pg_probackup backup -B backupdir -b backup_mode --instance instance_name [-C] [--stream [-S slot_name]] [--backup-pg-log] [--archive-timeout=timeout] [--delete-expired] [-d dbname] [-h host] [-p port] [-U username] [-w] [--master-db=dbname] [--master-host=host] [--master-port=port] [--master-user=username] [-j num_threads] [--progress] [-q] [-v]
Creates a backup copy of the Postgres Pro instance.
The backup_mode
option specifies the backup mode to use. For details, see the section called “Creating a Backup”.
restore
Syntax:
pg_probackup restore -B backupdir --instance instance_name [-D datadir] [-i backup_id | --immediate | [{--time=time | --xid=xid | --recovery-target-name=recovery_target_name} [--inclusive=boolean]]] [--timeline=timeline] [-T OLDDIR=NEWDIR] [--recovery-target-action=pause|promote|shutdown] [-R | --write-recovery-conf] [-j num_threads] [--progress] [-q] [-v]
Restores the Postgres Pro instance from a backup copy
located in the backupdir backup catalog.
If you specify a recovery target option,
pg_probackup restores the database cluster up to
the corresponding recovery target. Otherwise, the most recent backup is used.
validate
Syntax:
pg_probackup validate -B backupdir [--instance instance_name [-i backup_id | [{--time=time | --xid=xid | --recovery-target-name=recovery_target_name} [--inclusive=boolean]]]] [--timeline=timeline] [-j num_threads] [--progress] [-q] [-v]
Verifies that all the files required to restore the cluster are present and not corrupted.
If you specify the instance_name without any additional options,
pg_probackup validates the most recent backup available in this backup instance.
If you specify the instance_name with a
recovery target option or a backup_id,
pg_probackup checks whether it is possible to restore the cluster
using these options.
If instance_name is not specified, pg_probackup validates
all backups available in the backup catalog.
show
Syntax:
pg_probackup show -B backupdir [--instance instance_name [-i backup_id]]
Shows the contents of the backup catalog.
If instance_name and backup_id are specified,
shows detailed information about this backup.
delete
Syntax:
pg_probackup delete -B backupdir --instance instance_name {-i backup_id | --wal | --expired}
Deletes backup or WAL files of the specified backup instance
from the backupdir backup catalog:
The -i option removes the specified backup copy.
The --wal option removes the WAL files that are no longer required to restore the cluster from any of the existing backups.
The --expired option removes the backups that have expired according to the current retention policy.
archive-push
Syntax:
pg_probackup archive-push -B backupdir --instance instance_name --wal-file-path %p --wal-file-name %f
Stores WAL files in the corresponding subdirectory of the backup catalog.
Can be set as archive_command in postgresql.conf to perform
archive backups. In addition to copying files, this command also validates the instance by
instance_name, system-identifier and PGDATA.
If parameters of the backup instance and the cluster do not match, this command will fail with the following error message:
“Refuse to push WAL segment segment_name into archive. Instance parameters mismatch.”
For each WAL file moved to the backup catalog, you will see the following message in Postgres Pro logfile:
“pg_probackup archive-push completed successfully”.
archive-get
Syntax:
pg_probackup archive-get -B backupdir --instance instance_name --wal-file-path %p --wal-file-name %f
Moves WAL files from the corresponding subdirectory of the backup catalog to the cluster's write-ahead log location.
This command is automatically set by pg_probackup as restore_command in
recovery.conf when restoring archive backups. You do not need to set it manually.
version
Syntax:
pg_probackup version
Prints pg_probackup version.
help
Syntax:
pg_probackup help [command]
Displays the synopsis of pg_probackup commands. If one of the pg_probackup commands is specified, shows detailed information about the options that can be used with this command.
This section describes all command-line options for
pg_probackup commands.
If the option value can be derived from an environment variable,
this variable is specified below the command-line option, in uppercase.
Some values can be taken from the pg_probackup.conf
configuration file located in the backup catalog. For details, see the section called “Configuring pg_probackup”.
If an option is specified using more than one method, command-line input has the highest priority, while the pg_probackup.conf settings have the lowest priority.
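The precedence rule can be sketched as a shell function that picks the effective value from three candidates. This is an illustration only, not how pg_probackup is implemented: the function name is hypothetical, an empty string stands for "not set by this method", and placing the environment variable between the command line and pg_probackup.conf is inferred from the statement above.

```shell
# Pick the effective value of an option.
# $1: command-line value, $2: environment value, $3: pg_probackup.conf value.
# An empty string means the value was not provided by that method.
resolve_option() {
    if [ -n "$1" ]; then
        printf '%s\n' "$1"      # command line wins
    elif [ -n "$2" ]; then
        printf '%s\n' "$2"      # then the environment variable
    else
        printf '%s\n' "$3"      # pg_probackup.conf is the fallback
    fi
}

resolve_option "" "db1.example" "localhost"   # prints db1.example
```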
-B directory
--backup-path=directory
BACKUP_PATH
Specifies the absolute path to the backup catalog. The backup catalog is a directory where all
backup files and meta information are stored. Since this
option is required for most of the pg_probackup commands,
it is recommended to specify it once in the BACKUP_PATH environment variable.
In this case, you do not need to use this option each time on the command line.
-D directory
--pgdata=directory
PGDATA
Specifies the absolute path to the data directory of the database cluster.
This option is mandatory only for the add-instance command.
Other commands can take its value from the PGDATA
environment variable, or from the pg_probackup.conf configuration file.
-i backup_id
--backup-id=backup_id
Specifies the unique identifier of the backup.
-j num_threads
--threads=num_threads
Sets the number of parallel threads for backup, recovery, and backup validation processes.
--progress
Shows the progress of operations.
-q
--quiet
Enables the silent mode that does not display any messages about the current process.
-v
--verbose
Prints detailed information about the current process.
The following options can be used together with the backup command.
-b mode
--backup-mode=mode
Specifies the backup mode to use. Possible values are:
FULL — creates a full backup that contains all the data files
of the cluster to be restored.
DELTA — reads all data files in the
data directory and creates an incremental backup for pages
that have changed since the previous backup.
PAGE — creates an incremental PAGE backup
based on the WAL files that have changed since the previous
full or incremental backup was taken.
PTRACK — creates an incremental PTRACK backup
tracking page changes on the fly.
For details, see the section called “Creating a Backup”.
-C
--smooth-checkpoint
SMOOTH_CHECKPOINT
Spreads out the checkpoint over a period of time. By default, pg_probackup tries to complete the checkpoint as soon as possible.
--stream
Makes an autonomous backup that includes all the necessary WAL files by streaming them from the database server via the replication protocol.
-S slot_name
--slot=slot_name
Specifies the replication slot for WAL streaming.
This option can only be used together with the --stream option.
--backup-pg-log
Includes the pg_log directory into the backup.
This directory usually contains log messages. By default, pg_log directory is excluded.
--archive-timeout=wait_time
Sets the timeout for WAL segment archiving, in seconds. By default, pg_probackup waits 300 seconds.
--delete-expired
After a backup copy is successfully created, deletes backups that are
expired according to the current retention policy.
You can also clean up the expired backups by running the delete
command with the --expired option. For details, see
the section called “Configuring Backup Retention Policy”.
--immediate
Stops recovery as soon as a consistent state is reached.
Alternatively, you can specify the backup ID up to which to restore the data, or one of the recovery target options.
--recovery-target-action=pause|promote|shutdown
Specifies the action the server should take when the recovery
target is reached, similar to the recovery_target_action
option in the recovery.conf configuration file.
Default: pause
-R | --write-recovery-conf
Writes a minimal recovery.conf in the output directory
to facilitate setting up a standby server. The password is not included.
If the replication connection requires a password, you must specify
the password manually.
-T OLDDIR=NEWDIR
--tablespace-mapping=OLDDIR=NEWDIR
Relocates the tablespace from the OLDDIR to the NEWDIR directory
at the time of recovery. Both OLDDIR and NEWDIR must be absolute paths.
If the path contains the equals sign (=), escape it with a backslash.
This option can be specified multiple times for multiple tablespaces.
--timeline=timeline
Specifies a particular timeline to restore the cluster into. By default, the latest available timeline is used.
If the archive backup strategy is configured, you can use one of these options
together with restore or validate
commands to specify the moment up to which the database cluster must be restored.
--recovery-target-name=recovery_target_name
Specifies a named restore point up to which to restore the cluster data.
--time=time
Specifies the timestamp up to which recovery will proceed.
--xid=xid
Specifies the transaction ID up to which recovery will proceed.
--inclusive=boolean
Specifies whether to stop just after the specified recovery target (true),
or just before the recovery target (false).
This option can only be used together with the --recovery-target-name,
--time, or --xid options. The default value is taken
from the recovery_target_inclusive variable.
--wal
Deletes WAL files that are no longer required to restore the cluster from any of the existing backups.
--expired
Deletes backups that do not conform to the retention policy defined in the
pg_probackup.conf configuration file. For details, see
the section called “Configuring Backup Retention Policy”.
Retention options can only be used together with the set-config command. For details, see the section called “Configuring Backup Retention Policy”.
--retention-redundancy=redundancy
Specifies the number of full backup copies to keep in the backup catalog. Must be a positive integer.
--retention-window=window
Specifies the number of days of recoverability.
--log-level-console=log_level
Controls which message levels are sent to the console log.
Valid values are verbose, log,
info, notice, warning, error,
fatal, panic, and off.
Each level includes all the levels that follow it.
The later the level, the fewer messages are sent.
The off level disables console logging.
Default: info
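The filtering rule for log levels can be illustrated with a short shell sketch. It only models the ordering described above; the function names are hypothetical, and the actual filtering is done inside pg_probackup.

```shell
# Levels in the order listed above, from most to least verbose.
LEVELS="verbose log info notice warning error fatal panic"

# Print the 1-based position of level $1 in LEVELS; fail for unknown names.
level_rank() {
    rank=0
    for name in $LEVELS; do
        rank=$((rank + 1))
        if [ "$name" = "$1" ]; then
            echo "$rank"
            return 0
        fi
    done
    return 1
}

# Succeed when a message of level $2 passes the configured threshold $1.
is_logged() {
    [ "$1" = off ] && return 1
    threshold=$(level_rank "$1") || return 1
    message=$(level_rank "$2") || return 1
    [ "$message" -ge "$threshold" ]
}

if is_logged info warning; then echo "warning is shown at info"; fi
if ! is_logged error notice; then echo "notice is hidden at error"; fi
```

With the threshold set to info, messages of the info level and every later level pass, while verbose and log messages are dropped.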
--log-level-file=log_level
Controls which message levels are sent to a log file.
Valid values are verbose, log,
info, notice, warning, error,
fatal, panic, and off.
Each level includes all the levels that follow it.
The later the level, the fewer messages are sent.
The off level disables file logging.
Default: off
--log-filename=log_filename
Defines the file names of the created log files. The file names are treated as a
strftime pattern, so you can use %-escapes to specify
time-varying file names.
This option takes effect if file logging is enabled
by the --log-level-file option.
Default: pg_probackup.log
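To preview how a strftime pattern expands, the date utility can be used, since it accepts the same kind of %-escapes. This is only an approximation of what pg_probackup does internally, and the pattern below is a hypothetical --log-filename value.

```shell
# Expand a strftime-style pattern with the current time using date(1).
# The pattern is a hypothetical --log-filename value.
pattern='pg_probackup-%Y-%m-%d.log'
log_name=$(date +"$pattern")
echo "$log_name"
```

A pattern with a date component such as this one yields a different file name each day.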
--error-log-filename=error_log_filename
Defines the file names of log files for error
messages. The filenames are treated as a strftime pattern,
so you can use %-escapes to specify time-varying file names.
If --error-log-filename is not set, pg_probackup
writes all error messages to stderr.
Default: none
--log-directory=log_directory
Defines the directory in which log files will be created. You must specify the absolute path. This directory is created lazily, when the first log message is written.
Default: $BACKUP_PATH/log/
--log-rotation-size=log_rotation_size
Maximum size of an individual log file.
If this value is reached, the log file is rotated once
a pg_probackup command is launched,
except help and version commands.
The zero value disables size-based rotation.
Default: 0
--log-rotation-age=log_rotation_age
Maximum lifetime of an individual log file.
If this value is reached, the log file is rotated once
a pg_probackup command is launched,
except help and version commands.
The time of the last log file creation is stored in
$BACKUP_PATH/log/log_rotation.
The zero value disables time-based rotation.
Default: 0
-d dbname
--dbname=dbname
PGDATABASE
Specifies the name of the database to connect to. The connection is used
only for managing the backup process, so you can connect to any existing database.
If this option is not set on the command line, in the PGDATABASE
environment variable, or in the pg_probackup.conf configuration file,
pg_probackup takes this value from
the PGUSER environment variable, or from the current user name if PGUSER is not set.
-h host
--host=host
PGHOST
Specifies the host name of the system on which the server is running. If the value begins with a slash, it is used as a directory for the Unix domain socket.
-p port
--port=port
PGPORT
Specifies the TCP port or the local Unix domain socket file extension on which the server is listening for connections.
-U username
--username=username
PGUSER
User name to connect as.
-w--no-password
Disables a password prompt. If the server requires
password authentication and a password is not available by
other means such as a .pgpass file, the
connection attempt will fail. This option can be useful in
batch jobs and scripts where no user is present to enter a
password.
This section describes the options required to take a backup from a replica. Connection options are needed to create a restore point, which can only be done on the master. The restore point is used to determine the recovery time: the earliest moment for which you can restore a consistent state of the database cluster.
--master-db=dbname
Specifies the name of the database on the master server to connect to. The connection is used
only for managing the backup process, so you can connect to any existing database.
Can be set in the pg_probackup.conf using the set-config command.
Default: postgres, the default Postgres Pro dbname.
--master-host=host
Specifies the host name of the system on which the master server is running.
--master-port=port
Specifies the TCP port or the local Unix domain socket file
extension on which the master server is listening for connections.
Default: 5432, the Postgres Pro default port.
--master-user=username
User name to connect as.
Default: postgres, the Postgres Pro default user name.
--replica-timeout=timeout
Wait time for WAL segment streaming via replication, in seconds.
By default, pg_probackup waits 300 seconds.
You can also define this parameter in the pg_probackup.conf
configuration file using the set-config command.
--wal-file-path=wal_file_path %p
Provides the path to the WAL file in archive_command and restore_command
used by pg_probackup. The %p variable is required for correct processing.
--wal-file-name=wal_file_name %f
Provides the name of the WAL file in archive_command and restore_command
used by pg_probackup. The %f variable is required for correct processing.
To create a backup, run the following command:
pg_probackup backup -B backupdir --instance instance_name -b backup_mode
where backup_mode can take one of the following values:
FULL — creates a full backup that contains all the data files
of the cluster to be restored.
DELTA — reads all data files in the
data directory and creates an incremental backup for pages
that have changed since the previous backup.
PAGE — creates an incremental PAGE backup
based on the WAL files that have changed since the previous
full or incremental backup was taken.
PTRACK — creates an incremental PTRACK backup
tracking page changes on the fly.
When restoring a cluster from an incremental backup, pg_probackup relies on the previous full backup to restore all the data files first. Thus, you must create at least one full backup before taking incremental ones.
If you have configured PTRACK backups,
pg_probackup clears PTRACK
bitmap of the relation being processed each time a full or an incremental backup is taken.
Thus, the next incremental PTRACK backup
contains only the pages that have changed since the previous
backup. If a backup failed or was interrupted, some relations
can already have their PTRACK
forks cleared, so the next incremental backup
will be incomplete. The same is true if ptrack_enable was
turned off for some time. In this case, you must take a full
backup before the next incremental PTRACK backup.
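The rule above can be sketched as a helper that chooses the mode for the next backup in a PTRACK-based schedule. The function name and its two inputs are hypothetical: the caller is assumed to know whether the previous backup completed and whether ptrack_enable was ever switched off since then. The sketch prints the resulting command instead of running it.

```shell
# Choose the mode for the next backup in a PTRACK-based schedule.
# $1: outcome of the previous backup, "ok" or anything else (hypothetical labels)
# $2: "yes" if ptrack_enable was turned off at any time since that backup
next_backup_mode() {
    if [ "$1" != ok ] || [ "$2" = yes ]; then
        echo FULL       # PTRACK bitmaps cannot be trusted, start over
    else
        echo PTRACK
    fi
}

mode=$(next_backup_mode failed no)
echo "pg_probackup backup -B backupdir --instance instance_name -b $mode"
```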
To make a backup autonomous, add the --stream option to the above command. For example, to create a full autonomous backup, run:
pg_probackup backup -B backupdir --instance instance_name -b FULL --stream
Autonomous backups include all the WAL segments required to restore the cluster to a consistent state at the time the backup was taken. To restore a cluster from an incremental autonomous backup, pg_probackup still requires the full backup and all the incremental backups it depends on.
Even if you are using continuous archiving, autonomous backups can still be useful in the following cases:
Autonomous backups can be restored on the server that has no file access to WAL archive.
Autonomous backups enable you to restore the cluster state at the point in time for which WAL files are no longer available.
When checksums are enabled for the database cluster, pg_probackup uses this information to check correctness of data files. While reading each page, pg_probackup checks whether the calculated checksum coincides with the checksum stored in the page. This guarantees that the backup is free of corrupted pages. Note that pg_probackup reads database files from disk and under heavy write load during backup it can show false positive checksum failures because of partial writes.
Even if page checksums are disabled, pg_probackup calculates checksums for each file in a backup. Checksums are checked immediately after backup is taken and right before restore, to detect possible backup corruptions.
To ensure that all the required backup files are present and can be used to restore
the database cluster, you can run the validate command with the exact
recovery target options
you are going to use for recovery. If you omit all the parameters, all backups are
validated.
For example, to check that you can restore the
database cluster from a backup copy up to the specified
xid transaction ID, run this command:
pg_probackup validate -B backupdir --instance instance_name --xid=xid
If validation completes successfully, pg_probackup displays the corresponding message. If validation fails, you will receive an error message with the exact time and transaction ID up to which the recovery is possible.
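One way to use this in practice is to run validate with the intended recovery target and start the restore only if validation succeeds. The sketch below assumes that pg_probackup returns a non-zero exit status when validation fails, which is the usual convention but is not stated in this document. The function name is hypothetical.

```shell
# Validate a recovery target and restore only if validation succeeds.
# $1: backup catalog, $2: instance name, $3: transaction ID to recover to.
# Assumption: pg_probackup exits with a non-zero status on failure.
validate_then_restore() {
    pg_probackup validate -B "$1" --instance "$2" --xid="$3" &&
        pg_probackup restore -B "$1" --instance "$2" --xid="$3"
}
```

For example, validate_then_restore /mnt/backups main 687 would check and then restore up to transaction 687; the path, instance name, and transaction ID are placeholders.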
To restore the database cluster from a backup, use the restore command:
pg_probackup restore -B backupdir --instance instance_name -D datadir -i backup_id
where:
datadir specifies the location of the
restored data directory of the cluster. If you omit the
-D option, the datadir value is taken from
the pg_probackup.conf configuration file, so
the cluster will be restored in its original location.
backup_id specifies the backup to restore the
cluster from. If you omit this option,
pg_probackup uses the latest
backup available for the specified instance.
If you specify an incremental backup to restore, pg_probackup automatically restores the underlying full backup and then sequentially applies all the necessary increments.
If you have configured archive backups, you can restore the cluster
to its state at an arbitrary point in time (recovery
target) using
recovery target
options. pg_probackup automatically
chooses the backup that is the closest to the specified recovery
target and starts the recovery process. By default, the
recovery_target_inclusive parameter defines whether recovery
includes the recovery target itself. You can explicitly include or
exclude the recovery target using the
--inclusive=boolean option.
To restore the cluster state as of an exact point in time, specify the
--time option with a value in the timestamp
format. For example:
pg_probackup restore -B backupdir --instance instance_name --time='2017-05-18 14:18:11'
To restore the cluster state up to a specific transaction ID, use the --xid option:
pg_probackup restore -B backupdir --instance instance_name --xid=687
If the cluster to restore contains tablespaces,
pg_probackup restores them to their original
location by default. To restore tablespaces to a different location, use the
--tablespace-mapping option.
Otherwise, restoring the cluster on the same host will fail if tablespaces are in use, because the backup
would have to be written to the same directories.
When using the --tablespace-mapping option, you must
provide absolute paths to the old and new tablespace directories.
If a path happens to contain an equals sign (=), escape it with a
backslash. This option can be specified multiple times for multiple
tablespaces. For example:
pg_probackup restore -B backupdir --instance instance_name -D datadir -j 4 -i backup_id -T tablespace1_dir=tablespace1_newdir -T tablespace2_dir=tablespace2_newdir
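The two path rules above (absolute paths only, equals signs escaped with a backslash) can be applied by a small helper when mappings are generated by a script. This is an illustrative sketch only: the tablespace_mapping function and the sample paths are hypothetical.

```shell
# Hypothetical helper: build the value for one -T/--tablespace-mapping option.
# Rejects relative paths and escapes every "=" inside a path with a backslash.
# Arguments: old tablespace directory, new tablespace directory.
tablespace_mapping() {
    for path in "$1" "$2"; do
        case $path in
            /*) ;;                                           # absolute path, accepted
            *)  echo "not an absolute path: $path" >&2
                return 1 ;;
        esac
    done

    old=$(printf '%s' "$1" | sed 's/=/\\=/g')
    new=$(printf '%s' "$2" | sed 's/=/\\=/g')
    printf '%s=%s\n' "$old" "$new"
}

# Possible usage (paths are made up for illustration):
# pg_probackup restore -B backupdir --instance instance_name -D datadir \
#     -T "$(tablespace_mapping /mnt/ts=old /mnt/ts_new)"
```

Only the equals signs inside each path are escaped; the separator between the old and new directory stays unescaped.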
Once the restore command is complete, start the
database service. Postgres Pro will restore a self-consistent
state by replaying WAL files and will be ready to accept
connections.
Backup, recovery, and validation processes can be executed on several parallel threads. This can significantly speed up pg_probackup operation given enough resources (CPU cores, disk, and network throughput).
Parallel execution is controlled by the -j/--threads
command line option. For example, to create a backup using four parallel threads, run:
pg_probackup backup -B backupdir --instance instance_name -b FULL -j 4
Parallel recovery applies only to copying data from the backup catalog to the data directory of the cluster. When the Postgres Pro server is started, WAL records still need to be replayed, and this cannot be done in parallel.
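When choosing a value for -j, one simple heuristic is to cap the requested thread count at the number of online CPU cores. This is an assumption made for illustration, not a pg_probackup recommendation, since disk or network throughput may become the bottleneck first. The pick_threads function below is hypothetical.

```shell
# Hypothetical helper: print the thread count to pass to -j.
# Arguments: requested thread count, optional core count
# (defaults to the number of online cores reported by getconf).
pick_threads() {
    requested=$1
    cores=${2:-$(getconf _NPROCESSORS_ONLN)}

    if [ "$requested" -gt "$cores" ]; then
        echo "$cores"
    else
        echo "$requested"
    fi
}

# Possible usage:
# pg_probackup backup -B backupdir --instance instance_name -b FULL -j "$(pick_threads 8)"
```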
Once the backup catalog is initialized and a new backup instance is added,
you can use the pg_probackup.conf configuration file located
in the backups/instance_name directory
to fine-tune pg_probackup configuration.
Initially, pg_probackup.conf contains the following settings:
PGDATA — the path to the data directory of
the cluster to back up.
system-identifier — the unique
identifier of the
Postgres Pro instance.
Additionally, you can define
connection,
retention,
logging, and
replica settings using the
set-config command:
pg_probackup set-config -B backupdir --instance instance_name [connection_options] [retention_options] [logging_options] [replica_options]
To view the current settings, run the following command:
pg_probackup show-config -B backupdir --instance instance_name
If you define connection settings in the pg_probackup.conf configuration file, you can omit connection options in all the subsequent pg_probackup commands. However, if the corresponding environment variables are set, they get higher priority. The options provided on the command line overwrite both environment variables and configuration file settings.
If no connection settings are provided anywhere, the default values are used: pg_probackup
tries to use a local connection and takes both the database name and the user name from the PGUSER environment variable or, if it is not set, from the current OS user name.
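The lookup order described above (command line first, then environment variable, then pg_probackup.conf, then the default) can be illustrated as follows. This sketch only models the precedence; it is not pg_probackup's actual code, and the resolve_setting function and the sample values are hypothetical.

```shell
# Hypothetical illustration of the precedence of connection settings.
# Arguments: command-line value, environment value, config-file value, default.
# An empty string means "not set at this level".
resolve_setting() {
    cli=$1
    envval=$2
    conf=$3
    default=$4

    if [ -n "$cli" ]; then
        echo "$cli"        # command-line option overrides everything
    elif [ -n "$envval" ]; then
        echo "$envval"     # environment variable overrides the config file
    elif [ -n "$conf" ]; then
        echo "$conf"       # value stored in pg_probackup.conf
    else
        echo "$default"    # built-in default, e.g. the current OS user name
    fi
}

# Possible usage: no command-line user, so PGUSER wins if it is set,
# then the (made-up) config value, then the OS user name.
# resolve_setting "" "$PGUSER" "backup_user" "$(id -un)"
```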
By default, all backup copies created with pg_probackup are stored in the specified backup catalog. To save disk space, you can configure retention policy and periodically clean up redundant backup copies accordingly.
To configure retention policy, set one or more of the following
variables in the pg_probackup.conf file:
retention-redundancy — specifies the number of full backup
copies to keep in the backup catalog.
retention-window — defines the earliest point in time for which pg_probackup can complete the recovery. This option is set in the number of days from the current moment. For example, if retention-window=7, pg_probackup must keep at least one full backup copy that is older than seven days, with all the corresponding WAL files.
If both retention-redundancy and retention-window
options are set, pg_probackup keeps backup copies
that satisfy both conditions. For example, if you set retention-redundancy=2 and retention-window=7, pg_probackup cleans up the backup directory to keep only two full backup copies if at least one of them is older than seven days.
To clean up the backup catalog in accordance with retention policy, run:
pg_probackup delete -B backupdir --instance instance_name --expired
pg_probackup deletes all backup copies that do not conform to the defined retention policy.
Alternatively, you can configure backup retention policy
and use the --delete-expired
option together with the backup command to
remove the outdated backup copies once the new backup is created.
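As a sketch of how the two retention steps fit together, the helper below stores a retention policy with set-config and then purges expired backups. The configure_retention function is hypothetical, and the sketch assumes that the set-config retention options are spelled like the configuration variables, that is, --retention-redundancy and --retention-window.

```shell
# Hypothetical helper: save a retention policy for an instance, then remove
# the backups that no longer conform to it.
# Arguments: backup catalog path, instance name, number of full backups
# to keep, retention window in days.
configure_retention() {
    backupdir=$1
    instance=$2
    redundancy=$3
    window=$4

    # Assumption: option names mirror the pg_probackup.conf variable names.
    pg_probackup set-config -B "$backupdir" --instance "$instance" \
        --retention-redundancy="$redundancy" \
        --retention-window="$window" || return 1

    pg_probackup delete -B "$backupdir" --instance "$instance" --expired
}

# Possible usage (catalog path and instance name are made up):
# configure_retention /mnt/backups node 2 7
```

The cleanup step could instead be folded into the regular backup run by adding --delete-expired to the backup command, as described above.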
With pg_probackup, you can manage backups from the command line:
To view the list of existing backups, run the command:
pg_probackup show -B backupdir
pg_probackup displays the list of all the available backups. You will see the following output:
BACKUP INSTANCE 'node'
============================================================================================================================================
 Instance  Version  ID      Recovery time           Mode    WAL     Current/Parent TLI  Time  Data  Start LSN   Stop LSN    Status
============================================================================================================================================
 node      10       P7XDQV  2018-04-29 05:32:59+03  DELTA   STREAM  1 / 0               11s   19MB  0/15000060  0/15000198  OK
 node      10       P7XDJA  2018-04-29 05:28:36+03  PTRACK  STREAM  1 / 0               21s   32MB  0/13000028  0/13000198  OK
 node      10       P7XDHU  2018-04-29 05:27:59+03  PTRACK  STREAM  1 / 0               31s   33MB  0/11000028  0/110001D0  OK
 node      10       P7XDHB  2018-04-29 05:27:15+03  FULL    STREAM  1 / 0               11s   39MB  0/F000028   0/F000198   OK
 node      10       P7XDFT  2018-04-29 05:26:25+03  PTRACK  STREAM  1 / 0               11s   40MB  0/D000028   0/D000198   OK
For each backup, the following information is provided:
Instance — the instance name.
Version — Postgres Pro version.
ID — the backup identifier.
Recovery time — the earliest moment for which you can restore the state of the database cluster.
Mode — the method used to take this backup.
Possible values: FULL, PAGE,
PTRACK, DELTA.
WAL — the way of WAL log handling.
Possible values: STREAM for autonomous backups
and ARCHIVE for archive backups.
Current/Parent TLI — current and parent timelines of the database cluster.
Time — the time it took to perform the backup.
Data — the size of the data files in this backup. This value does not include the size of WAL files.
Start LSN — WAL log sequence number corresponding to the start of the backup process.
Stop LSN — WAL log sequence number corresponding to the end of the backup process.
Status — backup status. Possible values:
OK — the backup is complete and valid.
CORRUPT — some of the backup files are corrupted.
DONE — the backup is complete, but was not validated.
ERROR — the backup was aborted because of an unexpected error.
RUNNING — the backup is in progress.
DELETING — the backup files are being deleted.
You can restore the cluster from the backup only if the backup status is OK.
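Because only backups with the OK status can be restored, it can be useful to list the ones that cannot. The sketch below filters the show output with awk. The list_unusable_backups function is hypothetical, and the sketch assumes the column layout of the sample output above, with the backup ID in the third column and the status in the last one.

```shell
# Hypothetical helper: read `pg_probackup show` output on standard input and
# print "ID STATUS" for every backup whose status is not OK.
# Header and separator lines are skipped because their last field is not
# one of the known status values.
list_unusable_backups() {
    awk '$NF ~ /^(CORRUPT|DONE|ERROR|RUNNING|DELETING)$/ { print $3, $NF }'
}

# Possible usage:
# pg_probackup show -B backupdir --instance instance_name | list_unusable_backups
```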
To get more detailed information about the backup, run the show command with the backup ID:
pg_probackup show -B backupdir --instance instance_name -i backup_id
The sample output is as follows:
#Configuration
backup-mode = FULL
stream = false
#Compatibility
block-size = 8192
xlog-block-size = 8192
checksum-version = 0
#Result backup info
timelineid = 1
start-lsn = 0/04000028
stop-lsn = 0/040000f8
start-time = '2017-05-16 12:57:29'
end-time = '2017-05-16 12:57:31'
recovery-xid = 597
recovery-time = '2017-05-16 12:57:31'
data-bytes = 22288792
status = OK
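Since the detailed output consists of key = value pairs, a single field can be extracted in a script, for example to check the status before a restore. The backup_field function below is a hypothetical sketch that assumes one pair per line, as in the sample output.

```shell
# Hypothetical helper: read the detailed `pg_probackup show` output on
# standard input and print the value of one field.
# Argument: field name, e.g. status or recovery-xid. The name is used inside
# a regular expression, so it should not contain special characters.
# Quoted values, such as timestamps, are printed with their quotes.
backup_field() {
    sed -n "s/^$1 = //p"
}

# Possible usage:
# status=$(pg_probackup show -B backupdir --instance instance_name -i backup_id | backup_field status)
# [ "$status" = OK ] || echo "backup cannot be used for restore" >&2
```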
To delete a backup that is no longer required, run the following command:
pg_probackup delete -B backupdir --instance instance_name -i backup_id
This command will delete the backup with the specified
backup_id, together with all the incremental backups
that followed, if any. This way, you can delete
some recent incremental backups, retaining the underlying full
backup and some of the incremental backups that follow it.
In this case, the next PTRACK backup will be incomplete as some
changes since the last retained backup will be lost. Either a full
backup or an incremental PAGE backup
(if all the necessary WAL
files are still present in the archive) must be taken then.
To delete obsolete WAL files that are not necessary to restore any
of the remaining backups, use the --wal option:
pg_probackup delete -B backupdir --instance instance_name --wal
To delete backups that are expired according to the current
retention policy, use the
--expired option:
pg_probackup delete -B backupdir --instance instance_name --expired
Postgres Professional, Moscow, Russia.
The pg_probackup utility is based on pg_arman, which was originally written by NTT and then developed and maintained by Michael Paquier.