Assume that we have five nodes for deploying Postgres Pro Shardman. Let’s use the first three of them for the etcd three-node cluster and the other two nodes for the Postgres Pro Shardman two-node cluster.
Suppose that we have the following node names and IP addresses:
10.1.1.11 etcd1 - etcd node
10.1.1.12 etcd2 - etcd node
10.1.1.13 etcd3 - etcd node
10.1.1.20 sdm01 - Shardman node1
10.1.1.21 sdm02 - Shardman node2
Each node has 4 GB of RAM, a 20 GB HDD, 2 CPUs, and Ubuntu 22.04 installed.
Add the node names and IP addresses to /etc/hosts. This step must be performed on all nodes.
sudo /bin/sh -c 'cat << EOF >> /etc/hosts
10.1.1.11 etcd1
10.1.1.12 etcd2
10.1.1.13 etcd3
10.1.1.20 sdm01
10.1.1.21 sdm02
EOF'
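A quick sanity check (a sketch, not part of the official procedure): confirm that every node name now resolves. getent consults /etc/hosts, so this catches a typo in the entries above.

```shell
# Check that each cluster hostname resolves; getent reads /etc/hosts,
# so a typo in the entries above shows up as "NOT resolvable".
for h in etcd1 etcd2 etcd3 sdm01 sdm02; do
    getent hosts "$h" > /dev/null && echo "$h: ok" || echo "$h: NOT resolvable"
done
```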
This step must be performed on all nodes.
Deploy and start the chrony daemon on all hosts.
sudo apt update
sudo apt install -y chrony
sudo systemctl enable --now chrony
Check that chrony is working properly.
chronyc tracking
Expected output:
Reference ID : C0248F82 (Time100.Stupi.SE)
Stratum : 2
Ref time (UTC) : Tue Apr 18 11:50:44 2023
System time : 0.000019457 seconds slow of NTP time
Last offset : -0.000005579 seconds
RMS offset : 0.000089375 seconds
Frequency : 30.777 ppm fast
Residual freq : -0.000 ppm
Skew : 0.003 ppm
Root delay : 0.018349268 seconds
Root dispersion : 0.000334640 seconds
Update interval : 1039.1 seconds
Leap status : Normal
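If you want to automate this check, a small sketch like the following parses the "System time" line and flags drift above 100 ms. check_offset is a hypothetical helper name, and the sample line below is copied from the output above; in practice you would pipe `chronyc tracking` into it.

```shell
# Parse the "System time" line of `chronyc tracking` output and warn if
# the offset from NTP time exceeds 0.1 s (check_offset is illustrative).
check_offset() {
    awk -F': *' '/^System time/ {
        split($2, f, " ")
        if (f[1] + 0 > 0.1) print "WARN: offset " f[1] "s"
        else print "offset OK: " f[1] "s"
    }'
}
# In practice: chronyc tracking | check_offset
echo "System time     : 0.000019457 seconds slow of NTP time" | check_offset
```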
Install the following helper packages on each etcd node (etcd1, etcd2, etcd3):
sudo apt install -y vim wget curl iputils-ping
We use etcd API version 3. The Ubuntu 22.04 repository does contain etcd, but the version there is too old for Shardman. Get the latest etcd release from GitHub: download it, unpack it, and move the binaries to the /usr/local/bin folder.
export RELEASE=$(curl -s https://api.github.com/repos/etcd-io/etcd/releases/latest | grep tag_name | cut -d '"' -f 4)
wget https://github.com/etcd-io/etcd/releases/download/${RELEASE}/etcd-${RELEASE}-linux-amd64.tar.gz
tar xvf etcd-${RELEASE}-linux-amd64.tar.gz
cd etcd-${RELEASE}-linux-amd64
sudo mv etcd etcdctl etcdutl /usr/local/bin
sudo groupadd --system etcd
sudo useradd -s /sbin/nologin --system -g etcd etcd
sudo mkdir -p /var/lib/etcd/
sudo mkdir -p /etc/etcd
sudo chown -R etcd:etcd /var/lib/etcd /etc/etcd
Prepare a systemd unit file for etcd on each node.
Node etcd1:
sudo /bin/sh -c 'cat << EOF > /etc/systemd/system/etcd.service
[Unit]
Description=etcd key-value store
Documentation=https://github.com/etcd-io/etcd
After=network.target
[Service]
User=etcd
Type=notify
ExecStart=/usr/local/bin/etcd \
--name=etcd1 \
--data-dir=/var/lib/etcd/ \
--initial-advertise-peer-urls=http://10.1.1.11:2380 \
--listen-peer-urls=http://0.0.0.0:2380 \
--advertise-client-urls=http://10.1.1.11:2379 \
--listen-client-urls=http://0.0.0.0:2379 \
--initial-cluster=etcd1=http://10.1.1.11:2380,etcd2=http://10.1.1.12:2380,etcd3=http://10.1.1.13:2380 \
--initial-cluster-token=secret-token \
--initial-cluster-state=new \
--max-snapshots=5 \
--max-wals=5 \
--quota-backend-bytes=6442450944 \
--auto-compaction-mode=revision \
--auto-compaction-retention=5
Restart=always
RestartSec=10s
LimitNOFILE=40000
[Install]
WantedBy=multi-user.target
EOF'
Node etcd2:
sudo /bin/sh -c 'cat << EOF > /etc/systemd/system/etcd.service
[Unit]
Description=etcd key-value store
Documentation=https://github.com/etcd-io/etcd
After=network.target
[Service]
User=etcd
Type=notify
ExecStart=/usr/local/bin/etcd \
--name=etcd2 \
--data-dir=/var/lib/etcd/ \
--initial-advertise-peer-urls=http://10.1.1.12:2380 \
--listen-peer-urls=http://0.0.0.0:2380 \
--advertise-client-urls=http://10.1.1.12:2379 \
--listen-client-urls=http://0.0.0.0:2379 \
--initial-cluster=etcd1=http://10.1.1.11:2380,etcd2=http://10.1.1.12:2380,etcd3=http://10.1.1.13:2380 \
--initial-cluster-token=secret-token \
--initial-cluster-state=new \
--max-snapshots=5 \
--max-wals=5 \
--quota-backend-bytes=6442450944 \
--auto-compaction-mode=revision \
--auto-compaction-retention=5
Restart=always
RestartSec=10s
LimitNOFILE=40000
[Install]
WantedBy=multi-user.target
EOF'
Node etcd3:
sudo /bin/sh -c 'cat << EOF > /etc/systemd/system/etcd.service
[Unit]
Description=etcd key-value store
Documentation=https://github.com/etcd-io/etcd
After=network.target
[Service]
User=etcd
Type=notify
ExecStart=/usr/local/bin/etcd \
--name=etcd3 \
--data-dir=/var/lib/etcd/ \
--initial-advertise-peer-urls=http://10.1.1.13:2380 \
--listen-peer-urls=http://0.0.0.0:2380 \
--advertise-client-urls=http://10.1.1.13:2379 \
--listen-client-urls=http://0.0.0.0:2379 \
--initial-cluster=etcd1=http://10.1.1.11:2380,etcd2=http://10.1.1.12:2380,etcd3=http://10.1.1.13:2380 \
--initial-cluster-token=secret-token \
--initial-cluster-state=new \
--max-snapshots=5 \
--max-wals=5 \
--quota-backend-bytes=6442450944 \
--auto-compaction-mode=revision \
--auto-compaction-retention=5
Restart=always
RestartSec=10s
LimitNOFILE=40000
[Install]
WantedBy=multi-user.target
EOF'
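The three unit files above differ only in the member name and the advertised IP, so they can also be generated from two variables. The sketch below is an optional shortcut, not part of the official procedure; NODE_NAME, NODE_IP, and UNIT_PATH are illustrative placeholders. On a real node, set them per host, point UNIT_PATH at /etc/systemd/system/etcd.service, and run with sudo.

```shell
# Generate the etcd unit file from per-node variables instead of copying
# it three times. Set NODE_NAME/NODE_IP on each host before running.
NODE_NAME=etcd1
NODE_IP=10.1.1.11
UNIT_PATH=./etcd.service   # use /etc/systemd/system/etcd.service on a real node
cat << EOF > "$UNIT_PATH"
[Unit]
Description=etcd key-value store
Documentation=https://github.com/etcd-io/etcd
After=network.target
[Service]
User=etcd
Type=notify
ExecStart=/usr/local/bin/etcd \\
--name=${NODE_NAME} \\
--data-dir=/var/lib/etcd/ \\
--initial-advertise-peer-urls=http://${NODE_IP}:2380 \\
--listen-peer-urls=http://0.0.0.0:2380 \\
--advertise-client-urls=http://${NODE_IP}:2379 \\
--listen-client-urls=http://0.0.0.0:2379 \\
--initial-cluster=etcd1=http://10.1.1.11:2380,etcd2=http://10.1.1.12:2380,etcd3=http://10.1.1.13:2380 \\
--initial-cluster-token=secret-token \\
--initial-cluster-state=new \\
--max-snapshots=5 \\
--max-wals=5 \\
--quota-backend-bytes=6442450944 \\
--auto-compaction-mode=revision \\
--auto-compaction-retention=5
Restart=always
RestartSec=10s
LimitNOFILE=40000
[Install]
WantedBy=multi-user.target
EOF
```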
Start the etcd cluster
On each cluster node:
sudo systemctl daemon-reload
sudo systemctl enable --now etcd
sudo systemctl status etcd
Check that the etcd cluster is working properly.
Run the following commands on any node.
ss -tunelp | grep 2379
Expected output:
tcp LISTEN 0 4096 *:2379 *:* uid:998 ino:25597 sk:1003 cgroup:/system.slice/etcd.service v6only:0 <->
etcdctl --endpoints=http://10.1.1.11:2379,http://10.1.1.12:2379,http://10.1.1.13:2379 member list -w table
Expected output:
+------------------+---------+---------------+----------------------------+----------------------------+------------+
| ID | STATUS | NAME | PEER ADDRS | CLIENT ADDRS | IS LEARNER |
+------------------+---------+---------------+----------------------------+----------------------------+------------+
| 4ea1b7505b50a919 | started | etcd1 | http://10.1.1.11:2380 | http://10.1.1.11:2379 | false |
| 709c7bac14ddefcc | started | etcd2 | http://10.1.1.12:2380 | http://10.1.1.12:2379 | false |
| bdb72e6a663cdd2a | started | etcd3 | http://10.1.1.13:2380 | http://10.1.1.13:2379 | false |
+------------------+---------+---------------+----------------------------+----------------------------+------------+
etcdctl --endpoints=http://10.1.1.11:2379,http://10.1.1.12:2379,http://10.1.1.13:2379 endpoint health -w table
Expected output:
+----------------------------+--------+------------+-------+
| ENDPOINT | HEALTH | TOOK | ERROR |
+----------------------------+--------+------------+-------+
| http://10.1.1.11:2379 | true | 2.045478ms | |
| http://10.1.1.12:2379 | true | 2.260022ms | |
| http://10.1.1.13:2379 | true | 2.167308ms | |
+----------------------------+--------+------------+-------+
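This check can also be scripted. The sketch below counts the healthy endpoints in the table output; parse_health is a hypothetical helper, and the sample lines mimic the output above. In practice you would pipe the `etcdctl ... endpoint health -w table` output into it and expect the count 3.

```shell
# Count endpoints whose HEALTH column reads "true" in the table output
# (parse_health is an illustrative helper name).
parse_health() {
    grep -c '| *true *|'
}
# In practice: etcdctl --endpoints=... endpoint health -w table | parse_health
printf '%s\n' \
  '| http://10.1.1.11:2379 | true | 2.045478ms | |' \
  '| http://10.1.1.12:2379 | true | 2.260022ms | |' \
  '| http://10.1.1.13:2379 | true | 2.167308ms | |' | parse_health
```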
etcdctl --endpoints=http://10.1.1.11:2379,http://10.1.1.12:2379,http://10.1.1.13:2379 endpoint status -w table
Expected output:
+----------------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+
| ENDPOINT | ID | VERSION | DB SIZE | IS LEADER | IS LEARNER | RAFT TERM | RAFT INDEX | RAFT APPLIED INDEX | ERRORS |
+----------------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+
| http://10.1.1.11:2379 | 4ea1b7505b50a919 | 3.5.8 | 799 kB | true | false | 14 | 794083 | 794083 | |
| http://10.1.1.12:2379 | 709c7bac14ddefcc | 3.5.8 | 799 kB | false | false | 14 | 794083 | 794083 | |
| http://10.1.1.13:2379 | bdb72e6a663cdd2a | 3.5.8 | 795 kB | false | false | 14 | 794083 | 794083 | |
+----------------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+
etcdctl put key test
OK
etcdctl get key
key
test
etcdctl del key
1
The three-node etcd cluster is ready to serve requests.
Let's add the public Postgres Pro Shardman repository on each node:
sudo /bin/sh -c 'printf "deb [arch=amd64] http://repo.postgrespro.ru/pgprosm-14/ubuntu/ jammy main" > /etc/apt/sources.list.d/shardman.list'
sudo curl -fsSL http://repo.postgrespro.ru/pgprosm-14/keys/GPG-KEY-POSTGRESPRO | sudo apt-key add -
Next step is installation of packages (on each node):
sudo apt update
sudo apt install -y postgrespro-sdm-14-server postgrespro-sdm-14-client postgrespro-sdm-14-contrib postgrespro-sdm-14-libs pg-probackup-sdm-14 shardman-services shardman-tools
Shardman executables are installed into the /opt/pgpro/sdm-14/bin folder, and Shardman uses the etcd cluster to store its configuration.
Let's set some environment variables to simplify the cluster start process (on each node):
export PATH=$PATH:/opt/pgpro/sdm-14/bin
export SDM_STORE_ENDPOINTS=http://10.1.1.11:2379,http://10.1.1.12:2379,http://10.1.1.13:2379
export SDM_CLUSTER_NAME=cluster0
export SDM_LOG_LEVEL=debug
Setting these environment variables is equivalent to passing the corresponding shardmanctl options
(i.e., --log-level=debug, --cluster-name=cluster0, etc.). In the following examples we use environment variables.
Suppose we have chosen cluster0 as the cluster name. The next step is to specify the Shardman environment
variables in a file in the /etc/shardman folder (on each node):
sudo mkdir -p /etc/shardman
sudo sh -c 'cat << EOF > /etc/shardman/shardmand-cluster0.env
SDM_CLUSTER_NAME=cluster0
SDM_LOG_LEVEL=info
SDM_STORE_ENDPOINTS=http://10.1.1.11:2379,http://10.1.1.12:2379,http://10.1.1.13:2379
EOF'
Let’s generate a sample configuration with the Shardman utilities (on one node only).
shardmanctl config generate > spec.json
In this step, you can adjust the cluster specification (configuration), e.g., change the password or the PostgreSQL shared_buffers parameter.
Now for the final steps. First, let's initialize the cluster configuration in etcd (on any single node).
shardmanctl init -f spec.json
Expected output is:
2023-04-18T12:30:03.043Z DEBUG cmd/common.go:100 Waiting for metadata lock...
2023-04-18T12:30:03.048Z DEBUG cluster/cluster.go:365 DataDir is not specified, setting to default /var/lib/pgpro/sdm-14/data
Enable and start the shardmand service (on each node):
sudo systemctl enable --now shardmand@cluster0
sudo systemctl status shardmand@cluster0
● shardmand@cluster0.service - deployment daemon for shardman
Loaded: loaded (/lib/systemd/system/shardmand@.service; enabled; vendor preset: enabled)
Active: active (running) since Tue 2023-04-18 12:28:18 UTC; 2min 13s ago
Docs: https://github.com/postgrespro/shardman
Main PID: 618 (shardmand)
Tasks: 10 (limit: 4571)
Memory: 32.0M
CPU: 422ms
CGroup: /system.slice/system-shardmand.slice/shardmand@cluster0.service
└─618 /opt/pgpro/sdm-14/bin/shardmand --cluster-name cluster0 --system-bus --user postgres
This step assumes that all previous steps completed successfully:
the etcd cluster is working properly, clocks on all hosts are synchronized,
and the shardmand daemon is running on sdm01 and sdm02.
The final step should be executed as the postgres Linux user:
sudo su - postgres
/opt/pgpro/sdm-14/bin/shardmanctl nodes add -n sdm01,sdm02 \
--cluster-name cluster0 \
--log-level debug \
--store-endpoints=http://10.1.1.11:2379,http://10.1.1.12:2379,http://10.1.1.13:2379
Expected output should be:
2023-04-18T12:43:11.300Z DEBUG cmd/common.go:100 Waiting for metadata lock...
2023-04-18T12:43:11.306Z INFO cluster/store.go:277 Checking if shardmand on all nodes have applied current cluster configuration
✓ Waiting for shardmand on node sdm01 to apply current configuration: success 0.000s
✓ Waiting for shardmand on node sdm02 to apply current configuration: success 0.000s
2023-04-18T12:43:11.307Z INFO add/case.go:112 Initting Stolon instances...
2023-04-18T12:43:11.312Z INFO add/case.go:170 Waiting for Stolon daemons to start... make sure shardmand daemons are running on the nodes
✓ Waiting for Stolon daemons of rg clover-1-sdm01: success 31.012s
✓ Waiting for Stolon daemons of rg clover-1-sdm02: success 0.012s
2023-04-18T12:43:42.336Z INFO add/case.go:187 Adding repgroups...
✓ waiting rg 1 config apply: done 7.014s
2023-04-18T12:43:49.444Z DEBUG broadcaster/worker.go:33 start broadcaster worker for repgroup id=1
2023-04-18T12:43:49.453Z DEBUG broadcaster/worker.go:51 repgroup 1 connect established
2023-04-18T12:43:49.453Z DEBUG commands/addrepgroup.go:575 waiting for extension lock...
2023-04-18T12:43:49.453Z DEBUG commands/addrepgroup.go:137 Loading schema into replication group rg 1
...
2023-04-18T12:44:25.665Z DEBUG rebalance/service.go:528 wait all tasks finish
2023-04-18T12:44:25.666Z DEBUG broadcaster/worker.go:75 finish broadcaster worker for repgroup id=1
2023-04-18T12:44:25.666Z DEBUG broadcaster/worker.go:75 finish broadcaster worker for repgroup id=2
2023-04-18T12:44:25.666Z INFO add/case.go:221 Successfully added nodes sdm01, sdm02 to the cluster
The message “Successfully added nodes sdm01, sdm02 to the cluster” means that everything is fine
and that nodes sdm01 and sdm02 are working properly.
Let's check the status of the cluster nodes:
shardmanctl status
┌──────────────────────────────────────────────────────────────────────────────────────────────────┐
│ == STORE STATUS == │
├───────────┬──────────────────────────────────────────┬─────────────────────┬─────────────────────┤
│ STATUS │ MESSAGE │ REPLICATION GROUP │ NODE │
├───────────┼──────────────────────────────────────────┼─────────────────────┼─────────────────────┤
│ OK │ store is OK │ │ │
└───────────┴──────────────────────────────────────────┴─────────────────────┴─────────────────────┘
┌─────────────────────────────────────────────────────────────────────────────────────────────────┐
│ == TOPOLOGY STATUS == │
├───────────┬─────────────────────────────────────────┬─────────────────────┬─────────────────────┤
│ STATUS │ MESSAGE │ REPLICATION GROUP │ NODE │
├───────────┼─────────────────────────────────────────┼─────────────────────┼─────────────────────┤
│ CROSS │ Topology placement policy is CROSS │ │ │
└───────────┴─────────────────────────────────────────┴─────────────────────┴─────────────────────┘
┌─────────────────────────────────────────────────────────────────────────────────────────────────┐
│ == METADATA STATUS == │
├───────────┬─────────────────────────────────────────┬─────────────────────┬─────────────────────┤
│ STATUS │ MESSAGE │ REPLICATION GROUP │ NODE │
├───────────┼─────────────────────────────────────────┼─────────────────────┼─────────────────────┤
│ OK │ Metadata is OK │ │ │
└───────────┴─────────────────────────────────────────┴─────────────────────┴─────────────────────┘
┌─────────────────────────────────────────────────────────────────────────────────────────────────┐
│ == SHARDMAND STATUS == │
├───────────┬─────────────────────────────────────────┬─────────────────────┬─────────────────────┤
│ STATUS │ MESSAGE │ REPLICATION GROUP │ NODE │
├───────────┼─────────────────────────────────────────┼─────────────────────┼─────────────────────┤
│ OK │ shardmand on node sdm01 is OK │ │ sdm01 │
├───────────┼─────────────────────────────────────────┼─────────────────────┼─────────────────────┤
│ OK │ shardmand on node sdm02 is OK │ │ sdm02 │
└───────────┴─────────────────────────────────────────┴─────────────────────┴─────────────────────┘
┌─────────────────────────────────────────────────────────────────────────────────────────────────┐
│ == REPLICATION GROUP STATUS == │
├───────────┬─────────────────────────────────────────┬─────────────────────┬─────────────────────┤
│ STATUS │ MESSAGE │ REPLICATION GROUP │ NODE │
├───────────┼─────────────────────────────────────────┼─────────────────────┼─────────────────────┤
│ OK │ Replication group clover-1-sdm01 is OK │ clover-1-sdm01 │ │
├───────────┼─────────────────────────────────────────┼─────────────────────┼─────────────────────┤
│ OK │ Replication group clover-1-sdm02 is OK │ clover-1-sdm02 │ │
└───────────┴─────────────────────────────────────────┴─────────────────────┴─────────────────────┘
┌──────────────────────────────────────────────────────────────────────────────────────────────────┐
│ == MASTER STATUS == │
├───────────┬──────────────────────────────────────────┬─────────────────────┬─────────────────────┤
│ STATUS │ MESSAGE │ REPLICATION GROUP │ NODE │
├───────────┼──────────────────────────────────────────┼─────────────────────┼─────────────────────┤
│ OK │ Replication group clover-1-sdm01 master │ clover-1-sdm01 │ sdm01:5432 │
│ │ is running on sdm01:5432 │ │ │
├───────────┼──────────────────────────────────────────┼─────────────────────┼─────────────────────┤
│ OK │ Replication group clover-1-sdm02 master │ clover-1-sdm02 │ sdm02:5432 │
│ │ is running on sdm02:5432 │ │ │
└───────────┴──────────────────────────────────────────┴─────────────────────┴─────────────────────┘
┌──────────────────────────────────────────────────────────────────────────────────────────────────┐
│ == DICTIONARY STATUS == │
├───────────┬──────────────────────────────────────────┬─────────────────────┬─────────────────────┤
│ STATUS │ MESSAGE │ REPLICATION GROUP │ NODE │
├───────────┼──────────────────────────────────────────┼─────────────────────┼─────────────────────┤
│ OK │ Replication group clover-1-sdm01 │ clover-1-sdm01 │ │
│ │ dictionary is OK │ │ │
├───────────┼──────────────────────────────────────────┼─────────────────────┼─────────────────────┤
│ OK │ Replication group clover-1-sdm02 │ clover-1-sdm02 │ │
│ │ dictionary is OK │ │ │
└───────────┴──────────────────────────────────────────┴─────────────────────┴─────────────────────┘
To connect to the cluster, get the cluster connection string on any cluster node (sdm01 or sdm02):
shardmanctl getconnstr
dbname=postgres host=sdm01,sdm02 password=!!!CHANGE_ME!!! port=5432,5432 user=postgres
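For scripting, the host list can be pulled out of this connection string. This is an optional sketch assuming the string format shown above (space-separated key=value pairs); the connstr value is the sample output verbatim.

```shell
# Extract the host list from a libpq-style connection string by splitting
# on spaces and keeping the host= pair (sample string from above).
connstr='dbname=postgres host=sdm01,sdm02 password=!!!CHANGE_ME!!! port=5432,5432 user=postgres'
hosts=$(printf '%s\n' "$connstr" | tr ' ' '\n' | sed -n 's/^host=//p')
echo "$hosts"
```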
And then let’s try to connect:
psql -d 'dbname=postgres host=sdm01,sdm02 password=!!!CHANGE_ME!!! port=5432,5432 user=postgres'
psql (14.7)
Type "help" for help.
postgres=#
Let's try to create a distributed table and check if everything is working properly.
postgres=# create table x(id int primary key, t text) with (distributed_by='id',num_parts=2);
CREATE TABLE
postgres=# \d
List of relations
Schema | Name | Type | Owner
--------+---------+-------------------+----------
public | x | partitioned table | postgres
public | x_0 | table | postgres
public | x_1_fdw | foreign table | postgres
(3 rows)
postgres=# \d x_0
Table "public.x_0"
Column | Type | Collation | Nullable | Default
--------+---------+-----------+----------+---------
id | integer | | not null |
t | text | | |
Partition of: x FOR VALUES WITH (modulus 2, remainder 0)
Indexes:
"x_0_pkey" PRIMARY KEY, btree (id)
postgres=# \d x_1_fdw
Foreign table "public.x_1_fdw"
Column | Type | Collation | Nullable | Default | FDW options
--------+---------+-----------+----------+---------+-------------
id | integer | | not null | |
t | text | | | |
Partition of: x FOR VALUES WITH (modulus 2, remainder 1)
Server: shardman_rg_2
FDW options: (table_name 'x_1')
postgres=# insert into x values (1,'t'),(2,'t'),(3,'t');
INSERT 0 3
postgres=# select * from x_0;
id | t
----+---
1 | t
2 | t
(2 rows)
postgres=# select * from x_1_fdw;
id | t
----+---
3 | t
(1 row)
Everything works as expected.