shardman-ladle — deployment tool for Shardman
shardman-ladle [common_options] init [-f|--spec-file spec_file_name] | spec_text
shardman-ladle [common_options] addnodes -n|--nodes nodes_names [--no-rebalance]
shardman-ladle [common_options] cleanup [-p|--processrepgroups]
shardman-ladle [common_options] rmnodes -n|--nodes nodes_names
shardman-ladle [common_options] status [-f|--format text|json]
shardman-ladle [common_options] backup --datadir directory [--maxtasks number_of_tasks]
shardman-ladle [common_options] recover [--info file] [--dumpfile file] [--metadata-only] [--timeout seconds]
Here common_options are:
[--cluster-name cluster_name] [--log-level error|warn|info|debug] [--retries retries_number] [--session-timeout seconds] [--store-endpoints store_endpoints] [--store-ca-file store_ca_file] [--store-cert-file store_cert_file] [--store-key client_private_key] [--store-timeout duration] [--version] [-h|--help]
shardman-ladle is a utility to initialize a Shardman cluster, add or remove nodes from the cluster, perform cleanup after unsuccessful operations or display the status of a cluster. Details of these operations are explained below.
To register a Shardman cluster in the etcd store, run the following command:
shardman-ladle [common_options] init [-f|--spec-file spec_file_name] | spec_text
You must provide the string with the cluster specification. You can do it as follows:
On the command line — do not specify the -f option and pass the string in spec_text.
On the standard input — specify the -f option and pass “-” in spec_file_name.
In a file — specify the -f option and pass the filename in spec_file_name.
In the init mode, shardman-ladle reads
the cluster specification, processes it, and saves it to the etcd store
as parts of two JSON documents: ClusterSpec — as
part of shardman/cluster0/clusterdata and
LadleSpec — as part of
shardman/cluster0/ladledata (cluster0
is the default cluster name used by Shardman utilities).
Common options related to the etcd store, such as --store-endpoints,
are also saved to the etcd store and pushed down to all
Shardman services started by
shardman-bowl. See
sdmspec.json for the description of the
Shardman initialization file format.
To add nodes to a Shardman cluster, run the following command:
shardman-ladle [common_options] addnodes -n|--nodes nodes_names [--no-rebalance]
You must specify the -n (--nodes) option to pass the
comma-separated list of nodes to be added.
Since all nodes are referred to by their hostnames, these hostnames must be
correctly resolved on all nodes.
With the default clover placement policy,
nodes are added to a cluster by clovers.
Each node
in a clover runs the primary DBMS instance and perhaps several
replicas of other nodes in the clover. The number of replicas
is determined by the Repfactor configuration parameter.
So, each clover consists of Repfactor + 1 nodes
and can withstand the loss of Repfactor nodes.
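The clover arithmetic above can be illustrated with a short sketch (the Repfactor value and node names here are just example values):

```python
repfactor = 2
clover_size = repfactor + 1      # each clover consists of Repfactor + 1 nodes
nodes = ["n1", "n2", "n3", "n4", "n5", "n6"]

# Nodes are added to the cluster by whole clovers, so the node count
# must be a multiple of the clover size.
assert len(nodes) % clover_size == 0
clovers = [nodes[i:i + clover_size] for i in range(0, len(nodes), clover_size)]
# clovers == [["n1", "n2", "n3"], ["n4", "n5", "n6"]]
```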
shardman-ladle performs the addnodes operation
in several steps. The command:
Takes a global metadata lock.
For each specified node, checks that shardman-bowl is running on it and that it sees the current cluster configuration.
Calculates the services
to be present on each node and saves this information in etcd as part
of the shardman/cluster0/ladledata Layout
object.
Generates the configuration for new Stolon clusters (also called replication groups) and initializes them.
Waits for shardman-bowl to start all the necessary services, checks that new replication groups are accessible and have correct configuration.
For each new replication group in the cluster except the first one,
copies the schema from a random existing
replication group to the new one; ensures that
the Shardman extension is installed on the new
replication group and recalculates OIDs used in the extension configuration tables.
On each existing replication group, defines foreign servers referencing the new replication group and recreates definitions of foreign servers on the new replication group.
Recreates all partitions of sharded tables and all global tables as foreign tables referencing data from old replication groups and registers the changes in the etcd storage.
Rebalances partitions of sharded tables.
The data for these partitions
is transferred from existing nodes using logical replication. When the data
is in place, the foreign table corresponding to the partition is replaced with
a regular table and all foreign tables referencing the data in the original
replication group are modified to reference the new one, the old partition being
also replaced by the foreign table.
Use the --no-rebalance option to skip this step.
Registers the added replication groups in
shardman/cluster0/ladledata.
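The rebalancing step can be pictured with a deliberately simplified model. The round-robin assignment below is only illustrative, not Shardman's actual placement algorithm, and the partition and replication group names are made up:

```python
def rebalance(partitions, groups):
    """Spread partitions evenly across replication groups (round-robin model).

    In the real cluster, moving a partition copies its data over logical
    replication, then swaps the local table on the target group with the
    foreign-table references on all other groups.
    """
    return {part: groups[i % len(groups)]
            for i, part in enumerate(sorted(partitions))}

placement = rebalance(["orders_0", "orders_1", "orders_2", "orders_3"],
                      ["clover-1-n1", "clover-2-n3"])
```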
If the addnodes command fails during
execution, use the cleanup command
to fix possible
cluster configuration issues.
To perform cleanup after a failure of the addnodes command
or the shardmanctl rebalance command,
run the following command:
shardman-ladle [common_options] cleanup [-p|--processrepgroups]
Final changes to the etcd store are done
at the end of the command execution. This simplifies the cleanup process.
During cleanup, incomplete clover definitions and definitions of the corresponding replication
groups are removed from the etcd metadata. Definitions of the corresponding foreign servers
are removed from the DBMS metadata of the remaining replication
groups. Since the cleanup process can be destructive, by default
it operates in the report-only mode: the tool only shows actions to be done
during the actual cleanup. To execute the actual cleanup, use the -p
(--processrepgroups) option.
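The report-only behavior follows a common dry-run pattern that can be sketched as follows (the function and the action strings are illustrative, not the tool's internals):

```python
def cleanup(planned_actions, process=False):
    """Dry-run pattern: report planned actions; apply them only on request."""
    executed = []
    for action in planned_actions:
        if process:                     # -p / --processrepgroups
            executed.append(action)     # actually apply the change here
        else:
            print("would: " + action)   # report-only mode (the default)
    return executed
```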
To remove nodes from a Shardman cluster, run the following command:
shardman-ladle [common_options] rmnodes -n|--nodes nodes_names
Use the -n (--nodes) option to pass the
comma-separated list of nodes to be removed.
This command removes clovers containing the specified nodes from the cluster.
The last clover in the cluster cannot be removed. A clover that contains
one of the main replication groups of any global table
cannot be removed either.
Any data (such as partitions of sharded relations) on removed
replication groups is migrated to the remaining replication groups using
logical replication, and all references to the removed replication groups
(including definitions of foreign servers) are removed from the metadata
of the remaining
replication groups. Finally, the metadata in etcd is updated.
Do not use the cleanup command to fix possible cluster
configuration issues after a failure of rmnodes.
Redo the rmnodes command instead.
To remove a replication group containing global table data,
first, disassemble the global table with the
shardman.make_table_local()
extension function.
To remove all nodes in a cluster when you do not need the
data, simply reinitialize the cluster. If a removed replication group contains
local (non-sharded and non-global) tables, the data is silently
lost after the replication group removal.
To display the health status of Shardman cluster subsystems, run the following command:
shardman-ladle [common_options] status [-f|--format text|json]
To get the report
in plain-text or JSON format, pass the value of text or json
through the -f (--format) option. Plain-text format
is used by default.
The command checks the availability of all etcd cluster nodes, consistency of
metadata stored in etcd, correctness of replication group definitions,
availability of shardman-bowl daemons and of all
DBMS instances in the cluster.
Each detected issue is reported as an unknown status, warning, error or fatal
error. The tool can also report an operational error, which means that there was
an issue during the cluster health check. When the command encounters a fatal or
operational error, it stops further diagnostics. An error is considered fatal
if it impacts higher-level subsystems. For example,
an inconsistency in etcd
metadata does not allow correct cluster operations and must be handled first,
so there is no point in further diagnostics.
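The stop-on-fatal diagnostic flow can be sketched like this (the subsystem names and status strings below are illustrative):

```python
def run_checks(checks):
    """Run subsystem checks in order; stop on a fatal or operational error."""
    results = []
    for name, check in checks:
        status = check()
        results.append((name, status))
        if status in ("fatal", "operational error"):
            break  # no point diagnosing higher-level subsystems
    return results

results = run_checks([
    ("store", lambda: "ok"),
    ("metadata", lambda: "fatal"),
    ("bowls", lambda: "ok"),   # never reached: the metadata check failed
])
```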
To back up a Shardman cluster, run the following command:
shardman-ladle [common_options] backup --datadir directory [--maxtasks number_of_tasks]
Use the following options:
--datadir directory
Specifies the directory to write the output to. If the directory exists, it must be empty. If it does not exist, shardman-ladle creates it (but not parent directories).
This option is required.
--maxtasks number_of_tasks
Specifies the maximum number of concurrent tasks (pg_receivewal or
pg_basebackup commands) to run.
The value of 0 (default) means no restriction.
A backup consists of a directory with base backups of all replication groups
and WAL files needed for recovery. etcd metadata is saved
to the etcd_dump file. The
backup_info file is created during a backup and contains the backup
description.
For details of the backup command logic, see Cluster Backup Process.
To restore a Shardman cluster from a backup created by
the backup command,
run the following command:
shardman-ladle [common_options] recover [--info file] [--dumpfile file] [--metadata-only] [--timeout seconds]
Use the following options:
--dumpfile file
Specifies the file to load the etcd metadata dump from.
This option is required for metadata-only recovery.
--info file
Specifies the file to load information about the backup from.
In most cases, set this option to point to
the backup_info file in
the backup directory or to its modified copy.
This option is required for full recovery.
--metadata-only
Perform metadata-only recovery. If not specified, full recovery is performed.
--timeout seconds
Exit with an error after waiting the specified number of seconds for the cluster to become ready or the recovery to complete.
Before running the recover command, specify
DataRestoreCommand and RestoreCommand
in the backup_info file. DataRestoreCommand fetches the base backup
and restores it to the Stolon data directory. RestoreCommand
fetches the WAL file and saves it to the Stolon pg_wal directory.
These commands can use the following substitutions:
%p
Destination path on the server.
%s
SystemId of the restored database (the same in the backup and in the restored cluster).
%f
Name of the WAL file to restore.
stolon-keeper runs both commands on each node in the cluster. Therefore:
Make the backup accessible to these nodes (for example, by storing it in a shared filesystem or by using a remote copy protocol, such as SFTP).
Commands to fetch the backup are executed as the operating system user
under which Stolon daemons work (usually postgres),
so set the permissions for the backup files appropriately.
These examples show how to specify
RestoreCommand and DataRestoreCommand:
If a backup is available through a passwordless SCP, you can use:
"DataRestoreCommand": "scp -r user@host:/var/backup/shardman/%s/backup/* %p",
"RestoreCommand": "scp user@host:/var/backup/shardman/%s/wal/%f %p"
If a backup is stored on NFS and available through
/var/backup/shardman path, you can use:
"DataRestoreCommand": "cp -r /var/backup/shardman/%s/backup/* %p",
"RestoreCommand": "cp /var/backup/shardman/%s/wal/%f %p"
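The substitution mechanics can be sketched in a few lines (a model of the expansion, which in reality is performed by stolon-keeper; the expand function and the SYSTEMID placeholder are illustrative):

```python
def expand(command, dest_path, system_id, wal_file=""):
    """Expand the %p/%s/%f placeholders in a restore command (simplified)."""
    return (command.replace("%f", wal_file)
                   .replace("%s", system_id)
                   .replace("%p", dest_path))
```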
For details of the recover command logic, see
Recovery from Shardman Backups.
shardman-ladle common options are optional parameters
that are not specific to the utility. They specify
etcd connection settings, cluster name and a few more settings.
By default, shardman-ladle tries to connect to
the etcd store at 127.0.0.1:2379 and uses the cluster0
cluster name. The default log level is info.
-h, --help
Show brief usage information
--cluster-name cluster_name
Specifies the name for a cluster to operate on.
The default is cluster0.
--log-level level
Specifies the log verbosity. Possible values of
level are (from minimum to maximum):
error,
warn, info and
debug. The default is info.
--retries number
Specifies how many times shardman-ladle retries a failing etcd request. If an etcd request fails, most likely, due to a connectivity issue, shardman-ladle retries it the specified number of times before reporting an error. The default is 5.
--session-timeout seconds
Specifies the session timeout for shardman-ladle locks. If there is no connectivity between shardman-ladle and the etcd store for the specified number of seconds, the lock is released. The default is 30.
--store-endpoints string
Specifies the etcd address in the format:
http[s]://address[:port](,http[s]://address[:port])*.
The default is http://127.0.0.1:2379.
--store-ca-file string
Verify the certificate of the HTTPS-enabled etcd store server using this CA bundle
--store-cert-file string
Specifies the certificate file for client identification by the etcd store
--store-key string
Specifies the private key file for client identification by the etcd store
--store-timeout duration
Specifies the timeout for an etcd request. The default is 5 seconds.
--version
Show shardman-utils version information
SDM_CLUSTER_NAME
An alternative to setting the --cluster-name
option
SDM_LOG_LEVEL
An alternative to setting the --log-level
option
SDM_NODES
An alternative to setting the --nodes
option for addnodes and rmnodes
SDM_RETRIES
An alternative to setting the --retries
option
SDM_SPEC_FILE
An alternative to setting the --spec-file
option for init
SDM_STORE_ENDPOINTS
An alternative to setting the --store-endpoints
option
SDM_STORE_CA_FILE
An alternative to setting the --store-ca-file
option
SDM_STORE_CERT_FILE
An alternative to setting the --store-cert-file
option
SDM_STORE_KEY
An alternative to setting the --store-key
option
SDM_STORE_TIMEOUT
An alternative to setting the --store-timeout
option
SDM_SESSION_TIMEOUT
An alternative to setting the --session-timeout
option
To initialize a Shardman cluster that
has the cluster0 name, uses an etcd cluster consisting
of n1, n2, and n3
nodes listening on port 2379, ensure proper settings in the spec file
sdmspec.json and run:
$ shardman-ladle --store-endpoints http://n1:2379,http://n2:2379,http://n3:2379 init -f sdmspec.json
To add n1, n2,
n3 and n4 nodes
to the cluster, run:
$ shardman-ladle --store-endpoints http://n1:2379,http://n2:2379,http://n3:2379 addnodes -n n1,n2,n3,n4
The number of nodes being added must be a multiple of Repfactor + 1.
To remove n1 and n2
nodes, along with clovers that contain them, from the cluster0
cluster, run:
$ shardman-ladle --store-endpoints http://n1:2379,http://n2:2379,http://n3:2379 rmnodes -n n1,n2
Here is a sample status output from shardman-ladle:
$ shardman-ladle --store-endpoints http://n1:2379,http://n2:2379,http://n3:2379 status
=== Store status ===
STATUS   MESSAGE                                          REPLICATION GROUP   NODE
OK       Store is OK
=== Metadata status ===
STATUS   MESSAGE                                          REPLICATION GROUP   NODE
OK       Metadata is OK
=== Bowls status ===
STATUS   MESSAGE                                          REPLICATION GROUP   NODE
OK       Bowl on node n1 is OK                                                n1
OK       Bowl on node n2 is OK                                                n2
OK       Bowl on node n3 is OK                                                n3
OK       Bowl on node n4 is OK                                                n4
=== Replication Groups status ===
STATUS   MESSAGE                                          REPLICATION GROUP   NODE
OK       Replication group clover-1-n1 is OK              clover-1-n1
OK       Replication group clover-1-n2 is OK              clover-1-n2
OK       Replication group clover-2-n3 is OK              clover-2-n3
OK       Replication group clover-2-n4 is OK              clover-2-n4
=== Dictionary status ===
STATUS   MESSAGE                                          REPLICATION GROUP   NODE
OK       Replication group clover-1-n1 dictionary is OK   clover-1-n1
OK       Replication group clover-1-n2 dictionary is OK   clover-1-n2
OK       Replication group clover-2-n3 dictionary is OK   clover-2-n3
OK       Replication group clover-2-n4 dictionary is OK   clover-2-n4
To create a backup of the cluster0 cluster
using etcd at etcdserver listening on port
2379 and store it in the local directory
/var/backup/shardman, run:
$ shardman-ladle --store-endpoints http://etcdserver:2379 backup --datadir=/var/backup/shardman
Assume that you are performing a recovery from a backup to the
cluster0 cluster using etcd at etcdserver
listening on port 2379 and you take the backup description from
the /var/backup/shardman/backup_info file. Edit
the /var/backup/shardman/backup_info file, set
DataRestoreCommand, RestoreCommand as
necessary and run:
$ shardman-ladle --store-endpoints http://etcdserver:2379 recover --info /var/backup/shardman/backup_info
For metadata-only recovery, run:
$ shardman-ladle --store-endpoints http://etcdserver:2379 recover --metadata-only --dumpfile /var/backup/shardman/etcd_dump