shardmanctl — Shardman auxiliary command-line client and deployment tool.
shardmanctl [common_options] backup --datadir directory [--maxtasks number_of_tasks]
shardmanctl [common_options] cleanup [-p | --processrepgroups] --after-node-operation | --after-rebalance
shardmanctl [common_options] forall --sql query [--twophase]
shardmanctl [common_options] getconnstr [--direct-masters]
shardmanctl [common_options] init [-f | --spec-file spec_file_name] | spec_text
shardmanctl [common_options] nodes add -n | --nodes node_names [--no-rebalance]
shardmanctl [common_options] nodes rm -n | --nodes node_names
shardmanctl [common_options] probackup [ init | archive-command | backup | restore | show | validate ] [subcommand options]
shardmanctl [common_options] rebalance
shardmanctl [common_options] recover [--info file] [--dumpfile file] [--metadata-only] [--timeout seconds]
shardmanctl [common_options] status [ -f | --format text | json ]
shardmanctl [common_options] update [[-f | --file stolon_spec_file] | spec_text] [-p | --patch] [-w | --wait]
Here common_options are:
[--cluster-name cluster_name] [--log-level error | warn | info | debug] [--retries retries_number] [--session-timeout seconds] [--store-endpoints store_endpoints] [--store-ca-file store_ca_file] [--store-cert-file store_cert_file] [--store-key client_private_key] [--store-timeout duration] [--version] [-h | --help]
shardmanctl is a utility for managing a Shardman cluster.
The backup command is used to back up a
Shardman cluster.
A backup consists of a directory with base backups of all replication groups
and WAL files needed for recovery. etcd metadata is saved
to the etcd_dump file. The
backup_info file is created during a backup and contains the backup
description.
For details of the backup command logic, see Cluster backup with pg_basebackup.
For usage details of the command, see
the section called “Backing up a Shardman Cluster”.
The cleanup command is used for cleanup after
failure of the nodes add command or of the shardmanctl
rebalance command.
Final changes to the etcd store are done
at the end of the command execution. This simplifies the cleanup process.
During cleanup, incomplete clover definitions and definitions of the corresponding replication
groups are removed from the etcd metadata. Definitions of the corresponding
foreign servers are removed from the DBMS metadata of the remaining replication
groups. Since the cleanup process can be destructive, by default,
the tool operates in the report-only mode: it only shows actions to be done
during the actual cleanup.
For usage details of the command, see
the section called “Performing Cleanup”.
The init command is used to register a new
Shardman cluster in the etcd store.
In the init mode, shardmanctl reads
the cluster specification, processes it and saves to the etcd store
as parts of two JSON documents: ClusterSpec — as
part of shardman/cluster0/clusterdata and
LadleSpec — as part of
shardman/cluster0/ladledata (cluster0
is the default cluster name used by Shardman utilities).
Common options related to the etcd store, such as --store-endpoints,
are also saved to the etcd store and pushed down to all
Shardman services started by
shardmand. For the description of the
Shardman initialization file format, see
sdmspec.json. For usage details of the command, see
the section called “Registering a Shardman Cluster”.
The forall command is used to execute an SQL statement on all
replication groups in a Shardman cluster.
The getconnstr command is used to get the libpq connection
string for connecting to a cluster as administrator.
The nodes add command is used to add new
nodes to a Shardman cluster.
With the default clover placement policy,
nodes are added to a cluster by clovers.
Each node
in a clover runs the primary DBMS instance and perhaps several
replicas of other nodes in the clover. The number of replicas
is determined by the Repfactor configuration parameter.
So, each clover consists of Repfactor + 1 nodes
and can withstand the loss of Repfactor nodes.
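As a sketch of this arithmetic (Repfactor=2 is an assumed value for illustration, not a stated default):

```shell
# Hypothetical Repfactor value, for illustration only.
repfactor=2
# Each clover holds the primary plus enough nodes for the replicas.
nodes_per_clover=$((repfactor + 1))
echo "nodes per clover: $nodes_per_clover"
echo "tolerated node failures per clover: $repfactor"
```

With this value, adding nodes in multiples of 3 keeps whole clovers, which matches the rule that the number of nodes passed to nodes add must be a multiple of Repfactor + 1.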
shardmanctl performs the nodes add operation
in several steps. The command:
Takes a global metadata lock.
For each specified node, checks that shardmand is running on it and that it sees the current cluster configuration.
Calculates the services
to be present on each node and saves this information in etcd as part
of the shardman/cluster0/ladledata Layout
object.
Generates the configuration for new stolon clusters (also called replication groups) and initializes them.
Waits for shardmand to start all the necessary services, checks that new replication groups are accessible and have correct configuration.
For each new replication group in the cluster except the first one,
copies the schema from a random existing
replication group to the new one; ensures that
the Shardman extension is installed on the new
replication group and recalculates OIDs used in the extension configuration tables.
On each existing replication group, defines foreign servers referencing the new replication group and recreates definitions of foreign servers on the new replication group.
On the new replication group, recreates all partitions of sharded tables and all global tables as foreign tables referencing data in the old replication groups, and registers the changes in the etcd store.
Rebalances partitions of sharded tables.
The data for these partitions
is transferred from existing nodes using logical replication. When the data
is in place, the foreign table corresponding to the partition is replaced with
a regular table and all foreign tables referencing the data in the original
replication group are modified to reference the new one, the old partition being
also replaced by the foreign table.
You can skip this step using the --no-rebalance option.
Registers the added replication groups in
shardman/cluster0/ladledata.
For usage details of the command, see the section called “Adding Nodes to a Shardman Cluster”.
The nodes rm command is used to remove nodes from
a Shardman cluster.
This command removes clovers containing the specified nodes from the cluster.
The last clover in the cluster cannot be removed.
Any data (such as partitions of sharded relations) on removed
replication groups is migrated to the remaining replication groups using
logical replication, and all references to the removed replication groups
(including definitions of foreign servers) are removed from the metadata
of the remaining
replication groups. Finally, the metadata in etcd is updated.
For usage details of the command, see
the section called “Removing Nodes from a Shardman cluster”.
The probackup command is used to back up and restore a
Shardman cluster using the pg_probackup
backup utility. For details of the probackup command logic, see
Backup and Recovery of Shardman Backups Using pg_probackup.
For usage details of the command, see the section called “probackup”.
The rebalance command is used to evenly rebalance sharded tables in a cluster.
This can be useful, for example, if you did not perform rebalance when adding nodes to the cluster.
The cleanup command with flag --after-rebalance is used to perform cleanup after failure of a rebalance
command. On each node, it cleans up
subscriptions and publications left from the rebalance command and
drops tables that store data of partially-transferred partitions of sharded tables.
The recover command is used to restore
a Shardman cluster from a backup created by
the backup command.
For details of the recover command logic, see
Cluster recovery from a backup using pg_basebackup.
For usage details of the command, see
the section called “Restoring a Shardman Cluster”.
The status command is used to display health status
of Shardman cluster subsystems.
The command checks the availability of all etcd cluster nodes, consistency of
metadata stored in etcd, correctness of replication group definitions,
availability of shardmand daemons and of all
DBMS instances in the cluster. The issues are reported in plain-text or JSON format.
For usage details of the command, see
the section called “Getting the Health Status of Cluster Subsystems”.
The update command is used to update the stolon configuration.
The new configuration is applied to all replication groups and is saved in
clusterdata, so that new replication groups are
initialized with this configuration. Note that update
can cause a DBMS restart.
All the described shardmanctl commands take a global metadata lock.
This section describes shardmanctl
commands.
For Shardman common options used by the commands,
see the section called “Common Options”.
backup
Syntax:
shardmanctl [common_options] backup --datadir directory [--maxtasks number_of_tasks]
Backs up a Shardman cluster.
--datadir directory
Required.
Specifies the directory to write the output to. If the directory exists, it must be empty. If it does not exist, shardmanctl creates it (but not parent directories).
--maxtasks number_of_tasks
Specifies the maximum number of concurrent tasks (pg_receivewal or
pg_basebackup commands) to run.
Default: 0 (no restriction).
For more details, see the section called “Backing up a Shardman Cluster”.
cleanup
Syntax:
shardmanctl [common_options] cleanup [-p|--processrepgroups] --after-node-operation|--after-rebalance
Performs cleanup after a failure of the nodes add command or of the shardmanctl
rebalance command.
-p | --processrepgroups
Execute the actual cleanup. By default, the tool only shows actions to be done during the actual cleanup. For more details, see the section called “Performing Cleanup”.
--after-node-operation
Perform cleanup after a failure of a nodes add command.
--after-rebalance
Perform cleanup after a failure of a rebalance command.
forall
Syntax:
shardmanctl [common_options] forall --sql query [--twophase]
Executes an SQL statement on all replication groups in a Shardman cluster.
--sql query
Specifies the statement to be executed.
--twophase
Use the two-phase commit protocol to execute the statement.
getconnstr
Syntax:
shardmanctl [common_options] getconnstr
Gets the libpq connection string for connecting to a cluster as administrator.
init
Syntax:
shardmanctl [common_options] init [-f|--spec-file spec_file_name]|spec_text
Registers a Shardman cluster in the etcd store.
-f spec_file_name | --spec-file=spec_file_name
Specifies the file with the cluster specification string. The value of “-”
means the standard input.
By default, the string is passed in spec_text.
For usage details, see the section called “Registering a Shardman Cluster”.
nodes add
Syntax:
shardmanctl [common_options] nodes add -n|--nodes node_names [--no-rebalance]
Adds nodes to a Shardman cluster.
-n node_names | --nodes=node_names
Required. Specifies the comma-separated list of nodes to be added.
--no-rebalance
Skip the step of rebalancing partitions of sharded tables. For more details, see the section called “Adding Nodes to a Shardman Cluster”.
nodes rm
Syntax:
shardmanctl [common_options] nodes rm -n|--nodes node_names
Removes nodes from a Shardman cluster.
-n node_names | --nodes=node_names
Specifies the comma-separated list of nodes to be removed. For usage details, see the section called “Removing Nodes from a Shardman cluster”.
probackup
Syntax:
shardmanctl [common_options] probackup
[init|archive-command|backup|restore|show|validate]
[--help]
[subcommand options]
Creates a backup of a Shardman cluster or restores a Shardman cluster from a backup using pg_probackup.
Subcommands:
init
Initializes a new repository folder for the Shardman cluster backup.
archive-command
Adds and enables, or disables, archive_command for each replication group in the Shardman cluster.
backup
Creates a backup of the Shardman cluster.
restore
Restores the Shardman cluster from the selected backup.
show
Shows the list of backups of the Shardman cluster.
validate
Checks the selected Shardman cluster backup for integrity.
--help
Shows subcommand help.
init
Syntax:
shardmanctl probackup init
-B|--backup-path path
-E|--etcd-path path
[--remote-port port]
[--remote-user username]
[--ssh-key path]
[-t|--timeout seconds]
[-m|--maxtasks number_of_tasks]
Initializes a new repository folder for the Shardman cluster backup.
-B path | --backup-path path
Required. Specifies the path to the backup catalog where the Shardman cluster backups should be stored.
-E path | --etcd-path path
Required. Specifies the path to the catalog where the etcd dumps should be stored.
--remote-port port
Specifies the remote SSH port for the replication group instances. Default: 22.
--remote-user username
Specifies the remote SSH user for the replication group instances. Default: postgres.
archive-command
Syntax:
shardmanctl probackup archive-command [add|rm]
-B|--backup-path path
[--remote-port port]
[--remote-user username]
Adds or removes, and enables or disables, the archive command for every replication group in the
Shardman cluster to put WAL files into the initialized backup repository.
add
Adds and enables the archive command for every replication group in the Shardman cluster.
rm
Disables the archive command for every replication group in the Shardman cluster. No additional parameters are required.
-B path | --backup-path path
Required when adding archive_command. Specifies the path to the backup catalog where the Shardman cluster backups should be stored.
--remote-port port
Specifies the remote SSH port for the replication group instances. Default: 22.
--remote-user username
Specifies the remote SSH user for the replication group instances. Default: postgres.
backup
Syntax:
shardmanctl probackup backup -B|--backup-path path -E|--etcd-path path -b|--backup-mode MODE [--compress] [--compress-algorithm algorithm] [--compress-level level] [--remote-port port] [--remote-user username] [--ssh-key path] [-t|--timeout seconds] [-m|--maxtasks number_of_tasks]
Creates a backup of the Shardman cluster.
-B path | --backup-path path
Required. Specifies the path to the backup catalog where the Shardman cluster backups should be stored.
-E path | --etcd-path path
Required. Specifies the path to the catalog where the etcd dumps should be stored.
-b MODE | --backup-mode MODE
Required. Defines the backup mode: FULL, PAGE, or DELTA.
--compress
Enables backup compression. If this flag is not specified, compression is disabled. If the flag is specified, the compression parameters below should be specified.
--compress-algorithm algorithm
Defines the compression algorithm: zlib or pglz. Default: none.
--compress-level level
Defines the compression level, from 0 to 9. Default: 0.
--remote-port port
Specifies the remote SSH port for the replication group instances. Default: 22.
--remote-user username
Specifies the remote SSH user for the replication group instances. Default: postgres.
--ssh-key path
Specifies the SSH private key for remote SSH command execution. Default: $HOME/.ssh/id_rsa.
restore
Syntax:
shardmanctl probackup restore
-B|--backup-path path
-i|--backup-id id
[-t|--timeout seconds]
[-m|--maxtasks number_of_tasks]
Restores the Shardman cluster from the selected backup.
-B path | --backup-path path
Required. Specifies the path to the backup catalog where the Shardman cluster backups should be stored.
-i id | --backup-id id
Required. Specifies the ID of the backup to restore.
--metadata-only
Perform a metadata-only restore. By default, a full restore is performed.
show
Syntax:
shardmanctl probackup show
-B|--backup-path path
[--format format]
Shows the list of backups of the Shardman cluster.
-B path | --backup-path path
Required. Specifies the path to the backup catalog where the Shardman cluster backups should be stored.
-f format | --format format
Specifies the output format: table or json. Default: table.
validate
Syntax:
shardmanctl probackup validate
-B|--backup-path path
-i|--backup-id id
[-t|--timeout seconds]
[-m|--maxtasks number_of_tasks]
Checks the selected Shardman cluster backup for integrity.
-B path | --backup-path path
Required. Specifies the path to the backup catalog where the Shardman cluster backups should be stored.
-i id | --backup-id id
Required. Specifies the ID of the backup to validate.
rebalance
Syntax:
shardmanctl [common_options] rebalance
Rebalances sharded tables.
recover
Syntax:
shardmanctl [common_options] recover [--info file] [--dumpfile file] [--metadata-only] [--timeout seconds]
Restores a Shardman cluster from a backup created by
the backup command.
--dumpfile file
Required for metadata-only restore.
Specifies the file to load the etcd metadata dump from.
--info file
Required for full restore.
Specifies the file to load information about the backup from.
--metadata-only
Perform a metadata-only restore. By default, a full restore is performed.
--timeout seconds
Exit with an error after waiting the specified number of seconds for the cluster to become ready or the recovery to complete.
For more details, see the section called “Restoring a Shardman Cluster”.
status
Syntax:
shardmanctl [common_options] status [-f|--format text|json]
Reports on the health status of Shardman cluster subsystems.
-f text|json | --format=text|json
Specifies the report format: plain-text or JSON.
Default: text.
For more details, see the section called “Getting the Health Status of Cluster Subsystems”.
update
Syntax:
shardmanctl [common_options] update [[-f|--file stolon_spec_file]|spec_text] [-p|--patch] [-w|--wait]
Updates the stolon configuration.
-f stolon_spec_file | --file=stolon_spec_file
Specifies the file with the stolon configuration.
The value of “-” means the standard input.
By default, the configuration is passed in spec_text.
-w | --wait
Specifies that shardmanctl should wait for configuration changes to take effect. If the new configuration cannot be loaded by all replication groups, shardmanctl will wait forever.
-p | --patch
Merge the new configuration into the existing one. By default, the new configuration replaces the existing one.
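To make the difference concrete, here is a sketch with made-up parameter values: with --patch, the new document is merged into the stored one, so untouched parameters survive; without it, the new document replaces the stored one wholesale.

```json
{
  "existing":        {"pgParameters": {"max_connections": "100", "shared_buffers": "1GB"}},
  "new":             {"pgParameters": {"max_connections": "200"}},
  "with --patch":    {"pgParameters": {"max_connections": "200", "shared_buffers": "1GB"}},
  "without --patch": {"pgParameters": {"max_connections": "200"}}
}
```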
shardmanctl common options are optional parameters
that are not specific to a particular command. They specify
etcd connection settings, the cluster name, and a few more settings.
By default, shardmanctl tries to connect to
the etcd store at 127.0.0.1:2379 and uses the cluster0
cluster name. The default log level is info.
-h, --help
Show brief usage information.
--cluster-name cluster_name
Specifies the name for a cluster to operate on.
The default is cluster0.
--log-level level
Specifies the log verbosity. Possible values of
level are (from minimum to maximum):
error,
warn, info and
debug. The default is info.
--retries number
Specifies how many times shardmanctl retries a failing etcd request. If an etcd request fails, most likely, due to a connectivity issue, shardmanctl retries it the specified number of times before reporting an error. The default is 5.
--session-timeout seconds
Specifies the session timeout for shardmanctl locks. If there is no connectivity between shardmanctl and the etcd store for the specified number of seconds, the lock is released. The default is 30.
--store-endpoints string
Specifies the etcd address in the format:
http[s]://address[:port](,http[s]://address[:port])*.
The default is http://127.0.0.1:2379.
--store-ca-file string
Verify the certificate of the HTTPS-enabled etcd store server using this CA bundle.
--store-cert-file string
Specifies the certificate file for client identification by the etcd store.
--store-key string
Specifies the private key file for client identification by the etcd store.
--store-timeout duration
Specifies the timeout for an etcd request. The default is 5 seconds.
--version
Show shardman-utils version information.
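The --store-endpoints format above can be checked mechanically; a sketch with a made-up endpoint list (the regular expression is an illustration derived from the documented format, not part of shardmanctl):

```shell
# Made-up endpoint list following the documented format:
# http[s]://address[:port](,http[s]://address[:port])*
endpoints="http://n1:2379,http://n2:2379,https://n3:2379"
if echo "$endpoints" | grep -Eq '^https?://[^,:/]+(:[0-9]+)?(,https?://[^,:/]+(:[0-9]+)?)*$'; then
  echo "valid"
else
  echo "invalid"
fi
```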
SDM_CLUSTER_NAME
An alternative to setting the --cluster-name
option
SDM_FILE
An alternative to setting the --file
option for update
SDM_LOG_LEVEL
An alternative to setting the --log-level
option
SDM_NODES
An alternative to setting the --nodes
option for nodes add and nodes rm
SDM_RETRIES
An alternative to setting the --retries
option
SDM_SPEC_FILE
An alternative to setting the --spec-file
option for init
SDM_STORE_ENDPOINTS
An alternative to setting the --store-endpoints
option
SDM_STORE_CA_FILE
An alternative to setting the --store-ca-file
option
SDM_STORE_CERT_FILE
An alternative to setting the --store-cert-file
option
SDM_STORE_KEY
An alternative to setting the --store-key
option
SDM_STORE_TIMEOUT
An alternative to setting the --store-timeout
option
SDM_SESSION_TIMEOUT
An alternative to setting the --session-timeout
option
To add nodes to a Shardman cluster, run the following command:
shardmanctl [common_options] nodes add -n|--nodes node_names
You must specify the -n (--nodes) option to pass the
comma-separated list of nodes to be added.
Since all nodes are referred to by their hostnames, these hostnames must be
correctly resolved on all nodes.
If the nodes add command fails during
execution, use the cleanup --after-node-operation command
to fix possible
cluster configuration issues.
By default, cleanup operates in the report-only mode, that is,
the following command will only show actions to be done
during actual cleanup:
shardmanctl [common_options] cleanup --after-node-operation|--after-rebalance
To perform the actual cleanup, run the following command:
shardmanctl [common_options] cleanup -p|--processrepgroups --after-node-operation|--after-rebalance
To remove nodes from a Shardman cluster, run the following command:
shardmanctl [common_options] nodes rm -n|--nodes node_names
Specify the -n (--nodes) option to pass the
comma-separated list of nodes to be removed.
Do not use the cleanup command to fix possible cluster
configuration issues after a failure of nodes rm.
Redo the nodes rm command instead.
If you want to remove all nodes in a cluster and do not care about the data, just reinitialize the cluster. If a removed replication group contains local (non-sharded and non-global) tables, its data is silently lost after the replication group removal.
To get a report on the health status of Shardman cluster subsystems in plain-text format, run the following command:
shardmanctl [common_options] status
To get the report in JSON format, pass the value of json
through the -f (--format) option.
Each detected issue is reported as an unknown status, warning, error or fatal
error. The tool can also report an operational error, which means that there was
an issue during the cluster health check. When the command encounters a fatal or
operational error, it stops further diagnostics. An error is considered fatal
if it impacts higher-level subsystems. For example,
an inconsistency in etcd
metadata does not allow correct cluster operations and must be handled first,
so there is no point in further diagnostics.
To back up a Shardman cluster, you can run the following command:
shardmanctl [common_options] backup --datadir directory
You must pass the directory to write the output to through the --datadir
option. You can limit the number of running concurrent tasks (pg_receivewal or
pg_basebackup commands) by passing the limit through the --maxtasks option.
To register a Shardman cluster in the etcd store, run the following command:
shardmanctl [common_options] init [-f|--spec-file spec_file_name]|spec_text
You must provide the string with the cluster specification. You can do it as follows:
On the command line — do not specify the -f option and pass the string in spec_text.
On the standard input — specify the -f option and pass “-” in spec_file_name.
In a file — specify the -f option and pass the filename in spec_file_name.
shardmanctl can perform either full restore
or metadata-only restore of a Shardman cluster from a backup created by
the backup command.
To perform full restore, you can run the following command:
shardmanctl [common_options] recover --info file
Pass the file to load information about the backup from through the --info
option. In most cases, set this option to point to
the backup_info file in
the backup directory or to its modified copy.
If you encounter issues with an etcd instance, it makes sense to perform metadata-only restore. To do this, you can run the following command:
shardmanctl [common_options] recover --dumpfile file --metadata-only
You must pass the file to load the etcd metadata dump from
through the --dumpfile option.
For both kinds of restore, you can specify --timeout for the tool to exit with error
after waiting until the cluster is ready or the recovery
is complete for the specified number of seconds.
Before running the recover command, specify
DataRestoreCommand and RestoreCommand
in the backup_info file. DataRestoreCommand fetches the base backup
and restores it to the stolon data directory. RestoreCommand
fetches the WAL file and saves it to the stolon pg_wal directory.
These commands can use the following substitutions:
%p
Destination path on the server.
%s
SystemId of the restored database (the same in the backup and in the restored cluster).
%f
Name of the WAL file to restore.
stolon-keeper runs both commands on each node in the cluster. Therefore:
Make the backup accessible to these nodes (for example, by storing it in a shared filesystem or by using a remote copy protocol, such as SFTP).
Commands to fetch the backup are executed as the operating system user
under which stolon daemons work (usually postgres),
so set the permissions for the backup files appropriately.
These examples show how to specify
RestoreCommand and DataRestoreCommand:
If a backup is available through a passwordless SCP, you can use:
"DataRestoreCommand": "scp -r user@host:/var/backup/shardman/%s/backup/* %p",
"RestoreCommand": "scp user@host:/var/backup/shardman/%s/wal/%f %p"
If a backup is stored on NFS and available through
/var/backup/shardman path, you can use:
"DataRestoreCommand": "cp -r /var/backup/shardman/%s/backup/* %p",
"RestoreCommand": "cp /var/backup/shardman/%s/wal/%f %p"
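To see how the substitutions play out, here is a purely local simulation of the cp-based RestoreCommand (the SystemId, WAL file name, and directories are made up; in a real cluster, stolon-keeper performs the substitution, not this script):

```shell
# Stand-ins for the real directories; all names here are made up.
backup_root=$(mktemp -d)   # plays the role of /var/backup/shardman
pg_wal=$(mktemp -d)        # plays the role of the stolon pg_wal directory
sysid=7000000000000000001            # %s: SystemId of the restored database
walfile=000000010000000000000002     # %f: name of the WAL file to restore

mkdir -p "$backup_root/$sysid/wal"
echo "fake wal" > "$backup_root/$sysid/wal/$walfile"

# RestoreCommand template as it could appear in backup_info (cp/NFS variant)
template="cp $backup_root/%s/wal/%f %p"

# Expand %s, %f and %p the way stolon-keeper would, then run the result
cmd=$(printf '%s' "$template" | sed -e "s|%s|$sysid|g" \
                                    -e "s|%f|$walfile|g" \
                                    -e "s|%p|$pg_wal/$walfile|g")
eval "$cmd"
ls "$pg_wal"
```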
To back up a Shardman cluster, the following requirements must be met:
The Shardman cluster configuration parameter enable_csn_snapshot must be on; this parameter is necessary for the cluster backup to be consistent. If it is disabled, a consistent backup is not possible.
On the backup host, Shardman utilities must be installed into /opt/pgpro/sdm-14/bin.
On the backup host and on each cluster node, pg_probackup must be installed into /opt/pgpro/sdm-14/bin.
On the backup host, the postgres Linux user and group must be created.
Passwordless SSH between the backup host and each Shardman cluster node must be configured for the postgres Linux user.
The backup folder must be created.
Access to the backup folder must be granted to the postgres Linux user.
The shardmanctl utility must be run as the postgres Linux user.
The init subcommand for backup repository initialization must be successfully executed on the backup host.
The archive-command subcommand, which enables archive_command for each replication group to stream WAL files into the initialized repository, must be successfully executed on the backup host.
For example, on the backup host:
groupadd postgres
useradd -m -N -g postgres -r -d /var/lib/postgresql -s /bin/bash postgres
Then add ssh keys to provide passwordless ssh access between backup host and Shardman cluster hosts.
Then on the backup host:
apt-get install pg-probackup shardman-utils
mkdir -p directory
chown -R postgres:postgres directory
shardmanctl [common_options] probackup init --backup-path=directory --etcd-path=directory/etcd --remote-user=postgres --remote-port=22
shardmanctl [common_options] probackup archive-command add --backup-path=directory --remote-user=postgres --remote-port=22
If the above requirements are met, run the backup subcommand to back up the cluster:
shardmanctl [common_options] probackup backup --backup-path=directory --etcd-path=directory --backup-mode=MODE
You must pass the directories through the --backup-path and --etcd-path
options and the backup mode through --backup-mode. Full, page, and delta backups are available
via the FULL, PAGE, and DELTA values.
You can also specify backup compression options through the --compress,
--compress-algorithm, and --compress-level flags,
as well as the --remote-port and --remote-user flags. You can limit the number of
concurrent tasks during backup by passing the limit through the --maxtasks flag.
shardmanctl in probackup mode can perform either full restore
or metadata-only restore of a Shardman cluster from a backup created by
the probackup backup command.
To perform a full or metadata-only restore, first select the backup to restore from. To show the list of available backups, run the following command:
shardmanctl [common_options] probackup show --backup-path=path --format=format
The output is a list of backups with their IDs, in table or JSON format. Pick the needed backup ID and run the probackup restore command:
shardmanctl [common_options] probackup restore --backup-path=path --backup-id=id
Pass the path to the repository through the --backup-path
option and the backup ID through the --backup-id flag.
If you encounter issues with an etcd instance, it makes sense to perform metadata-only restore. To do this, you can run the following command:
shardmanctl [common_options] probackup restore --backup-path=path --backup-id=id --metadata-only
For both kinds of restore, you can specify --timeout for the tool to exit with error
after waiting until the cluster is ready or the recovery
is complete for the specified number of seconds.
To initialize a Shardman cluster that
has the cluster0 name and uses an etcd cluster consisting
of n1, n2, and n3
nodes listening on port 2379, ensure proper settings in the spec file
sdmspec.json and run:
$ shardmanctl --store-endpoints http://n1:2379,http://n2:2379,http://n3:2379 init -f sdmspec.json
To get the connection string for a Shardman cluster
that has the cluster0 name and uses an etcd cluster consisting
of n1, n2, and n3
nodes listening on port 2379, run:
$ shardmanctl --store-endpoints http://n1:2379,http://n2:2379,http://n3:2379 getconnstr
dbname=postgres host=n1,n4,n2,n1,n1,n2,n4,n3 password=yourpasswordhere port=5432,5433,5432,5433,5432,5433,5432,5433 user=postgres
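The returned string is a libpq keyword/value list; a quick way to break it into keys and values (using a shortened copy of the string above) is:

```shell
# Shortened copy of the connection string shown above
connstr='dbname=postgres host=n1,n4,n2,n1 port=5432,5433,5432,5433 user=postgres'
# Word-split on spaces, then split each word on the first '='
for kv in $connstr; do
  printf '%s -> %s\n' "${kv%%=*}" "${kv#*=}"
done
```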
Here is a sample status output from shardmanctl:
$ shardmanctl --store-endpoints http://n1:2379,http://n2:2379,http://n3:2379 status
=== Store status ===
STATUS  MESSAGE                                          REPLICATION GROUP  NODE
OK      Store is OK
=== Metadata status ===
STATUS  MESSAGE                                          REPLICATION GROUP  NODE
OK      Metadata is OK
=== shardmand status ===
STATUS  MESSAGE                                          REPLICATION GROUP  NODE
OK      shardmand on node n1 is OK                                          n1
OK      shardmand on node n2 is OK                                          n2
OK      shardmand on node n3 is OK                                          n3
OK      shardmand on node n4 is OK                                          n4
=== Replication Groups status ===
STATUS  MESSAGE                                          REPLICATION GROUP  NODE
OK      Replication group clover-1-n1 is OK              clover-1-n1
OK      Replication group clover-1-n2 is OK              clover-1-n2
OK      Replication group clover-2-n3 is OK              clover-2-n3
OK      Replication group clover-2-n4 is OK              clover-2-n4
=== Dictionary status ===
STATUS  MESSAGE                                          REPLICATION GROUP  NODE
OK      Replication group clover-1-n1 dictionary is OK   clover-1-n1
OK      Replication group clover-1-n2 dictionary is OK   clover-1-n2
OK      Replication group clover-2-n3 dictionary is OK   clover-2-n3
OK      Replication group clover-2-n4 dictionary is OK   clover-2-n4
To add n1, n2, n3, and n4 nodes
to the cluster, run:
$ shardmanctl --store-endpoints http://n1:2379,http://n2:2379,http://n3:2379 nodes add -n n1,n2,n3,n4
The number of nodes being added must be a multiple of Repfactor + 1.
To remove n1 and n2
nodes, along with clovers that contain them, from the cluster0
cluster, run:
$ shardmanctl --store-endpoints http://n1:2379,http://n2:2379,http://n3:2379 nodes rm -n n1,n2
To execute the “select version()” query on all replication groups, run:
$ shardmanctl --store-endpoints http://n1:2379,http://n2:2379,http://n3:2379 forall --sql 'select version()'
Node 1 says: [PostgreSQL 13.1 on x86_64-pc-linux-gnu, compiled by gcc (Ubuntu 9.3.0-17ubuntu1~20.04) 9.3.0, 64-bit]
Node 4 says: [PostgreSQL 13.1 on x86_64-pc-linux-gnu, compiled by gcc (Ubuntu 9.3.0-17ubuntu1~20.04) 9.3.0, 64-bit]
Node 3 says: [PostgreSQL 13.1 on x86_64-pc-linux-gnu, compiled by gcc (Ubuntu 9.3.0-17ubuntu1~20.04) 9.3.0, 64-bit]
Node 2 says: [PostgreSQL 13.1 on x86_64-pc-linux-gnu, compiled by gcc (Ubuntu 9.3.0-17ubuntu1~20.04) 9.3.0, 64-bit]
To rebalance sharded tables in the cluster0 cluster, run:
$shardmanctl --store-endpoints http://n1:2379,http://n2:2379,http://n3:2379 rebalance
To set the max_connections parameter to 200 in the cluster,
create the spec file (for instance, ~/stolon.json) with the following contents:
{
"pgParameters": {
"max_connections": "200"
}
}
Then run:
$shardmanctl --store-endpoints http://n1:2379,http://n2:2379,http://n3:2379 update -p -f ~/stolon.json
Since changing max_connections requires
a restart, DBMS instances are restarted by this command.
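Before applying such a patch, it can help to confirm that the spec file parses as valid JSON. A minimal sketch, writing the example spec to a temporary path instead of ~/stolon.json:

```shell
# Sketch: write the example spec and validate it as JSON before passing
# it to "shardmanctl update" (temporary path used instead of ~/stolon.json).
spec=/tmp/stolon.json
cat > "$spec" <<'EOF'
{
  "pgParameters": {
    "max_connections": "200"
  }
}
EOF
# python3 -m json.tool fails with a nonzero exit code on invalid JSON.
python3 -m json.tool "$spec" >/dev/null && echo "spec is valid JSON"
```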
To create a backup of the cluster0 cluster
using etcd at etcdserver listening on port
2379 and store it in the local directory
/var/backup/shardman, run:
$shardmanctl --store-endpoints http://etcdserver:2379 backup --datadir=/var/backup/shardman
Assume that you are performing a recovery from a backup to the
cluster0 cluster using etcd at etcdserver
listening on port 2379 and you take the backup description from
the /var/backup/shardman/backup_info file. Edit
the /var/backup/shardman/backup_info file, set
DataRestoreCommand and RestoreCommand as
necessary, and run:
$shardmanctl --store-endpoints http://etcdserver:2379 recover --info /var/backup/shardman/backup_info
For metadata-only restore, run:
$shardmanctl --store-endpoints http://etcdserver:2379 recover --metadata-only --dumpfile /var/backup/shardman/etcd_dump
To create a backup of the cluster0 cluster
using etcd at etcdserver listening on port
2379 and store it in the local directory
/var/backup/shardman, first initialize the backup repository with the init subcommand:
$shardmanctl --store-endpoints http://etcdserver:2379 probackup init --backup-path=/var/backup/shardman --etcd-path=/var/backup/etcd_dump
Then add and enable archive_command with the archive-command subcommand:
$shardmanctl --store-endpoints http://etcdserver:2379 probackup archive-command add --backup-path=/var/backup/shardman
If the repository was successfully initialized and archive_command was successfully added, create a FULL backup with the backup subcommand:
$shardmanctl --store-endpoints http://etcdserver:2379 probackup backup --backup-path=/var/backup/shardman --etcd-path=/var/backup/etcd_dump --backup-mode=FULL --compress --compress-algorithm=zlib --compress-level=5
To create a DELTA or PAGE backup, run the backup subcommand with the --backup-mode parameter set to DELTA or PAGE:
$shardmanctl --store-endpoints http://etcdserver:2379 probackup backup --backup-path=/var/backup/shardman --etcd-path=/var/backup/etcd_dump --backup-mode=DELTA --compress --compress-algorithm=zlib --compress-level=5
To show the IDs of the created backups, run the show subcommand:
$shardmanctl --store-endpoints http://etcdserver:2379 probackup show --backup-path=/var/backup/shardman --format=table

REPLICATION GROUP      HOSTNAME    BACKUP IDS    BACKUP MODE    LSNS          BACKUP TIMESTAMP
7125062069167771757    n1          RFP1YS        DELTA          0/250001B8    2022-08-27 19:14:06.259797832 +0000 UTC
7125062069167752779    n2                        DELTA          0/250001A8
7125062069166812789    n3                        DELTA          0/25000108
7125062069167512479    n4                        DELTA          0/250001E8
7125062069167771757    n1          RFP1FI        FULL           0/250000B8    2022-07-27 19:14:06.259797832 +0000 UTC
7125062069167752779    n2                        FULL           0/250000A8
7125062069166812789    n3                        FULL           0/25000008
7125062069167512479    n4                        FULL           0/250000E8
To validate the created backup, run the validate subcommand:
$shardmanctl --store-endpoints http://etcdserver:2379 probackup validate --backup-path=/var/backup/shardman --backup-id=RFP1FI
Assume that you are performing a recovery from a backup to the
cluster0 cluster using etcd at etcdserver
listening on port 2379 and you take the backup id from the show command:
$shardmanctl --store-endpoints http://etcdserver:2379 probackup restore --backup-path=/var/backup/shardman --backup-id=RFP1FI
Finally, re-enable archive_command:
$shardmanctl --store-endpoints http://etcdserver:2379 probackup archive-command add --backup-path=/var/backup/shardman
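The probackup steps above can be collected into one script. The sketch below is a dry run: the run helper only prints each command so the sequence can be reviewed; replacing run with a direct invocation would execute it against a real cluster (endpoints and paths as in the examples above).

```shell
# Dry-run sketch of the probackup workflow above: "run" echoes each
# command instead of executing it. Endpoints and paths follow the
# examples in this section and are assumptions for your environment.
run() { echo "$@"; }
STORE="--store-endpoints http://etcdserver:2379"
BPATH="--backup-path=/var/backup/shardman"
plan=$(
  run shardmanctl $STORE probackup init $BPATH --etcd-path=/var/backup/etcd_dump
  run shardmanctl $STORE probackup archive-command add $BPATH
  run shardmanctl $STORE probackup backup $BPATH --etcd-path=/var/backup/etcd_dump --backup-mode=FULL
  run shardmanctl $STORE probackup show $BPATH --format=table
)
echo "$plan"
```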
For metadata-only restore, run:
$shardmanctl --store-endpoints http://etcdserver:2379 probackup restore --metadata-only --backup-path=/var/backup/shardman --backup-id=RFP1FI