The cluster considered consists of four shards. This is how the data partitions of the main sharded table are distributed across shards.
For the ticket_no sharding key:
tickets_0 — shard-1 (cluster node node1)
tickets_1 — shard-2 (cluster node node2)
tickets_2 — shard-3 (cluster node node3)
tickets_3 — shard-4 (cluster node node4)
For the book_ref sharding key:
bookings_0 — shard-1 (cluster node node1)
bookings_1 — shard-2 (cluster node node2)
bookings_2 — shard-3 (cluster node node3)
bookings_3 — shard-4 (cluster node node4)
The examples below are provided for the book_ref
sharding key, but the code in the subsections is suitable for the
ticket_no sharding key.
Do not treat this code as optimal or use it in a production environment. It only shows how to implement creation of a connection pull to work with a Postgres Pro Shardman cluster.
What is common for all the examples is the cluster connection string, which must contain node names, TCP port numbers, user name and password, database name for connection and a set of session parameters.
You can get this string using the shardmanctl utility. In the simplest case, the string looks like this:
$ shardmanctl getconnstr dbname=postgres host=node1,node2,node3,node4 port=5432,5432,5432,5432
You can get this string to connect to cluster nodes or to create the connection pool in applications.
A few convenient functions and views are implemented in Postgres Pro Shardman that add cluster observability by:
Listing global tables
Listing sharded tables
Listing global sequences
Finding the shard number from the value of the sharding key
Perfroming ANALYZE for all the global and sharded
tables in the cluster
To display all global tables in the cluster, use the shardman.global_tables
view:
postgres=# select
relname as table_name,
nspname as schema
from shardman.global_tables;
table_name | schema
------------+----------
aircrafts | bookings
seats | bookings
airports | bookings
flights | bookings
(4 rows)
To display information on all the sharded tables in the cluster, query
the shardman.sharded_tables view as follows:
postgres=# select
relname as table_name,
nparts as partitions,
colocated_with::oid::regclass::text as colocated_with,
nspname as schema
from shardman.sharded_tables;
table_name | partitions | colocated_with | schema
-----------------+------------+----------------+----------
bookings | 4 | | bookings
ticket_flights | 4 | bookings | bookings
tickets | 4 | bookings | bookings
boarding_passes | 4 | bookings | bookings
(4 rows)
To display all the global sequences in the cluster, use the
shardman.sequence view:
postgres=# select
seqns as schema,
seqname as sequence_name,
seqmin as min_value,
seqmax as max_value,
seqblk as bulk_size
from shardman.sequence;
schema | sequence_name | min_value | max_value | bulk_size
----------+-------------------------+-----------+---------------------+-----------
bookings | flights_flight_id_seq | 262145 | 9223372036854775807 | 65536
(1 rows)
To display the name of the partition that contains data and the replication
group name, call the shardman.get_partition_for_value()
function. For example, for book_ref = 0369E5:
postgres=# select * from shardman.get_partition_for_value(
'bookings'::regclass,
'0369E5'::character(6));
rgid | local_nspname | local_relname | remote_nspname | remote_relname
------+---------------+---------------+----------------+----------------
1 | bookings | bookings_0 | bookings | bookings_0
This output shows that the data is in the bookings_0
partition of the bookings table and is located
on the node where the query was executed.
Let's create a query to display the name of the server where the partition with data is located. If we connect to the server that contains the partition, the server name is displayed as “current server”. If the data is on a different server, the hostname of the shard master is displayed:
SELECT p.rgid,
local_relname AS partition_name,
CASE
WHEN r.srvid IS NULL THEN 'current server'
ELSE (SELECT (SELECT split_part(kv, '=', 2)
FROM (SELECT unnest(fs.srvoptions) as kv) x
WHERE split_part(kv, '=', 1) = 'host')
FROM shardman.repgroups rg
JOIN pg_catalog.pg_foreign_server AS fs ON fs.oid = rg.srvid
WHERE rg.id = p.rgid)
END AS server_name
FROM shardman.get_partition_for_value('bookings'::regclass, '0369E5'::character(6)) p
JOIN shardman.repgroups AS r ON
r.id = p.rgid;
rgid | partition_name | server_name
------+----------------+----------------
1 | bookings_0 | current server
(1 row)
Execution of this query with another value of the sharding key, 0369E6, produces the output:
rgid | partition_name | server_name
------+----------------+-------------
4 | bookings_3_fdw | node4
(1 row)
It is clear that the partition is on the node4 node.
Also note that the shardman.rgid parameter allows
you to find the number of the node with the connection session. To do
this, execute the query:
SELECT pg_catalog.current_setting('shardman.rgid');
You can use this value to determine the location of connection sessions for queries like discussed in this section.
The shardman.get_partition_for_value() is mainly
designed for administration purposes, to better understand the data
topology.
As a rule, do not use administration functions when writing SQL code for data access.
You can get the list of all sharded tables in the bookings
schema, together with the number of partitions and their distribution
across servers (shards) from Postgres Pro Shardman metadata
on any cluster node.
Consider the following query:
SELECT p.rel::regclass::text AS table_name,
p.pnum,
p.rgid,
r.srvid,
fs.srvname
FROM shardman.parts p
JOIN shardman.repgroups r
ON p.rgid = r.id
LEFT OUTER JOIN pg_foreign_server fs
ON r.srvid = fs.oid;
To learn how the data is distributed, let's combine this query with a subquery from Section 21.2.1.4:
SELECT p.rel::regclass AS table_name,
st.nparts AS total_parts,
p.pnum AS num_part,
CASE
WHEN r.srvid IS NULL THEN 'connected server'
ELSE
(SELECT split_part(kv, '=', 2)
FROM (SELECT unnest(fs.srvoptions) AS kv) x
WHERE split_part(kv, '=', 1) = 'host')
END AS server_name
FROM shardman.parts p
JOIN shardman.repgroups r
ON p.rgid = r.id
LEFT JOIN shardman.sharded_tables st
ON p.rel = st.rel
LEFT JOIN pg_foreign_server fs
ON r.srvid = fs.oid
WHERE st.nspname = 'bookings'
ORDER BY table_name, num_part, server_name;
The output format is the table name, number of table partitions, partition number and server name:
table_name | total_parts | num_part | server_name
--------------------------+-------------+----------+------------------
bookings.bookings | 4 | 0 | connected server
bookings.bookings | 4 | 1 | node2
bookings.bookings | 4 | 2 | node3
bookings.bookings | 4 | 3 | node4
bookings.ticket_flights | 4 | 0 | connected server
bookings.ticket_flights | 4 | 1 | node2
bookings.ticket_flights | 4 | 2 | node3
bookings.ticket_flights | 4 | 3 | node4
bookings.tickets | 4 | 0 | connected server
bookings.tickets | 4 | 1 | node2
bookings.tickets | 4 | 2 | node3
bookings.tickets | 4 | 3 | node4
bookings.boarding_passes | 4 | 0 | connected server
bookings.boarding_passes | 4 | 1 | node2
bookings.boarding_passes | 4 | 2 | node3
bookings.boarding_passes | 4 | 3 | node4
To collect statistics for sharded and global tables, call the
shardman.global_analyze() function. This function
first collects statistics for all local partitions of sharded tables on
each node and then broadcasts this statistics to other nodes. For a global
table, the function first collects statistics on a certain node and then
the statistics is broadcast to all the other nodes.