sdmspec.json — Shardman initialization file
sdmspec.json
shardman-ladle uses the
sdmspec.json configuration file during
Shardman cluster initialization.
A shardman-utils package provides a sample
configuration file.
sdmspec.json file consists of these major documents:
LadleSpec, ClusterSpec and
StolonSpec, embedded in ClusterSpec.
LadleSpec determines basic filesystem paths used by
Shardman and global settings of the cluster.
It is primarily used by shardman-ladle when
it deploys new nodes and by shardman-bowl
when it determines which services should be running on the current node.
ClusterSpec contains database-related settings, i. e.,
administrative and replication user logins and authentication method,
FDW parameters and stolon configuration
(StolonSpec).
ClusterSpec is used by
shardman-monitor service and
shardmanctl tool, which work with the DBMS and do not
care about how underlying nodes are organized.
LadleSpec ParametersStoreConnInfo
Object that has the following string properties:
Endpoints, CAFile,
CertFile, and Key.
It is used by services to figure out how to connect to etcd.
Usually, StoreConnInfo is not specified explicitly,
but is filled in automatically based on shardman-ladle
options related to etcd store that are used during
Shardman cluster initialization.
RepfactorInteger determining how many replicas shardman-ladle should configure for each DBMS.
PlacementPolicy
String determining the policy of placing DBMS instances. Currently,
clover placement policy is only supported. With
this placement policy, nodes are grouped in
clovers, where each node is running the master DBMS server
and replicas for all other nodes in the clover. The number of nodes in
a clover is determined by Repfactor and
equals Repfactor + 1.
PGsInitialPortPorts starting with this integer are assigned to PostgeSQL instances.
ProxiesInitialPort
Ports starting with this integer
are assigned to stolon proxies
if UseProxy is set to true.
MonitorsNumInteger number of shardman-monitor processes per cluster. Each monitor updates foreign server definitions according to etcd metadata and performs distributed deadlock detection.
ClusterSpec ParametersPgSuAuthMethod
Authentication method used by the administrative user to connect to
the DBMS. Can be any authentication method supported by PostgreSQL.
md5 is currently recommended. The default is
trust.
PgSuUsername
Name of the administrative DBMS user. Created on cluster
initialization. Defaults to the name of the effective user running
shardman-ladle init.
PgSuPasswordPassword for the administrative DBMS user.
PgReplAuthMethod
Authentication method used by the replication user to connect to
the DBMS. Can be any authentication method supported by PostgreSQL.
md5 is currently recommended. The default is
trust.
PgReplUsername
Name of the replication DBMS user. Created on cluster
initialization. The default is repluser.
PgReplPasswordPassword for the replication DBMS user.
UseProxyBoolean parameter that specifies whether proxy should be used. Using proxy makes switchovers from master to replicas more transparent, but has some performance impact. The default is false.
StolonSpec
stolon cluster specification.
For more details, see StolonSpec Parameters
FDWOptions
This object contains FDW settings that shardman-monitor
automatically applies to foreign servers
corresponding to Shardman replication groups.
Can define async_capable, batch_size,
fetch_size, fdw_startup_cost,
fdw_tuple_cost and postgres_fdw.
settings.
StolonSpec Parameters
The StolonSpec specification can include all usual
stolon options described in
Stolon Cluster Specification.
However, the following options should be carefully tuned for
a Shardman cluster.
pgHBA
JSON array of pg_hba.conf strings. The default one allows
PgSuUsername user's access from anywhere with
PgSuAuthMethod authentication method. If the value of
defaultSUReplAccessMode is strict,
pg_hba.conf strings must explicitly allow PgSuUsername and
PgReplUsername users' access from all
Shardman cluster nodes.
synchronousReplication
Determines whether replicas should use synchronous replication.
Should be true in
a Shardman cluster.
The default is true.
maxSynchronousStandbys
Maximum number of required synchronous standbys when synchronous
replication is enabled.
Should be ≥ Repfactor in
a Shardman cluster.
The default is Repfactor.
automaticPgRestart
Determines whether a DBMS instance should be automatically restarted after
a change of the pgParameters hash table that requires a restart.
Should be enabled in a Shardman cluster.
The default is true.
pgParametersHash table that determines PostgreSQL settings, including Shardman-specific settings.
The following settings in pgParameters are
Shardman-specific:
enable_csn_snapshot (boolean)
Enables/disables CSN-based tracking of the transaction visibility for a snapshot.
PostgreSQL uses the clock timestamp as a CSN, so enabling CSN-based snapshots can be useful for implementing global snapshots and global transaction visibility.
When this parameter is enabled, PostgreSQL creates the
pg_csn directory under PGDATA to keep
track of CSN and XID mappings.
This parameter has significant performance overhead.
The default is off.
csn_snapshot_defer_time
(integer)
Specifies the minimal age of records that are allowed to be vacuumed, in seconds.
All global transactions must start on all participant nodes within
csn_snapshot_defer_time seconds after start,
otherwise, they are aborted with a
“csn snapshot too old” error.
The default is 5.
postgres_fdw.optimize_cursors
(boolean)
Sets postgres_fdw to try fetching the first portion of cursor data immediately after declaration and delay the cursor closing.
This postgres_fdw parameter forces it to avoid closing cursors after the end of scan. Cursors are closed at the end of transaction.
The default is off.
postgres_fdw.use_twophase
(boolean)
Sets postgres_fdw to use the two-phase commit (2PC) protocol for distributed transactions.
This postgres_fdw parameter forces it to use a two-phase commit if the transaction touches several nodes.
Temporary tables cannot be used in 2PC transactions.
The default is off.
shardman.batch_size
(integer)
Specifies batch size used by shardman extension and tools.
When tuples are inserted to or deleted from global tables, Shardman extension will combine new rows (or conditions on rows to be deleted) in batches of this size.
For best performancce this parameter should match batch_size parameter,
specified in FDWOptions.
The default is 100.
Possible values are from 1 to 65535.
shardman.broadcast_ddl
(boolean)
Sets Shardman extension to broadcast DDL statements to all replication groups.
When this parameter is on, Shardman extension broadcasts supported DDL statements to all replication groups if it does make sense for those statements. You can enable/disable this behavior anytime.
The default is off.
shardman.num_parts
(integer)
Specifies the default number of sharded table partitions.
A sharded table has this default
number of partitions
unless num_parts
is specified in CREATE TABLE.
To allow scaling, shardman.num_parts should be larger
than the expected maximum
number of nodes in a Shardman cluster.
The default is 20.
Possible values are from 1 to 1000.
shardman.rgid
(integer)
Specifies the replication group ID of a Shardman node.
This parameter is set by Shardman utilities when the node is added to the cluster and should never be changed manually.
The default is -1.
shardman.sync_schema
(boolean)
Sets Shardman to propagate all DDL statements that touch sharded and global relations to all replication groups.
When this parameter is on, Shardman broadcasts all supported utility statements touching sharded and global relations to all replication groups. It is not recommended to turn this off.
The default is on.
This is the contents of an example sdmspec.json configuration file:
{
"LadleSpec":{
"PGsInitialPort": 5442,
"ProxiesInitialPort": 5432,
"Repfactor":1,
"MonitorsNum":2
},
"ClusterSpec":{
"PgSuAuthMethod": "md5",
"PgSuUsername":"postgres",
"PgSuPassword": "ChangeMe",
"PgReplUsername":"repluser",
"PgReplAuthMethod":"md5",
"PgReplPassword":"ChangeMe",
"UseProxy":true,
"FDWOptions" :{
"async_capable": "true",
"batch_size": "100",
"fetch_size": "50000",
"fdw_tuple_cost": "0.2"
},
"StolonSpec":{
"pgHBA":[
"host replication postgres 0.0.0.0/0 md5",
"host replication postgres ::0/0 md5",
"host all all 0.0.0.0/0 md5"
],
"pgParameters":{
"log_statement":"none",
"log_line_prefix":"%m [%r][%p]",
"log_min_messages":"INFO",
"wal_level":"logical",
"max_worker_processes":"16",
"max_logical_replication_workers": "9",
"shared_preload_libraries":"postgres_fdw, shardman",
"default_transaction_isolation":"repeatable read",
"max_prepared_transactions":"200",
"csn_snapshot_defer_time":"15",
"enable_csn_snapshot":"true",
"enable_mergejoin":"off",
"enable_nestloop":"off",
"fdw_estimate_as_hashjoin":"on",
"enable_partitionwise_aggregate":"on",
"enable_partitionwise_join":"on",
"postgres_fdw.optimize_cursors":"on",
"postgres_fdw.estimate_indexscan":"on",
"postgres_fdw.use_twophase":"on",
}
}
}
}
From LadleSpec, you can see that
a Shardman cluster initialized with this spec file
has Repfactor equal to 1 (one replica for each master) and 2
shardman-monitor processes, which work concurrently
(a large cluster can require
more monitors). ClusterSpec shows that two special
users are created in this cluster — superuser postgres
and replication user repluser with
ChangeMe passwords. They can be authenticated using the
md5 authorization method. stolon
proxies are used to forward application queries to the master server
in a replication group. One postgres_fdw fetch operation
will get up to 50000 rows from the remote server. The cost of fetching one row
is set to a reasonably high value to make PostgreSQL
planner consider conditions pushdown-attractive. pg_hba.conf settings
allow postgres user access from anywhere using a replication
protocol; all other users can access any database from anywhere. Since
defaultSUReplAccessMode is not set to
strict, stolon will automatically add
entries that allow PgSuUsername
user's (postgres) access to any database from anywhere and
PgReplUsername user's (repluser)
replication access from anywhere.
Several important Shardman-specific parameters
are set in the pgParameters hash table. These are:
wal_level
Should be set to logical for
Shardman to work correctly.
shared_preload_libraries
Should include postgres_fdw and shardman extensions in the specified order.
max_logical_replication_workers
Should be rather high since the rebalance process uses up to
max(
concurrent threads.
max_replication_slots,
max_logical_replication_workers, max_worker_processes,
max_wal_senders)/3
max_prepared_transactions
Should be rather high since Shardman utilities
use the 2PC protocol. If postgres_fdw.use_twophase
is true, postgres_fdw also uses 2PC.
enable_csn_snapshotShould be enabled to achieve a true repeatable read isolation level in a distributed system.
csn_snapshot_defer_time
All global transactions must start on all participant nodes within
csn_snapshot_defer_time seconds after start,
otherwise they will be aborted.
enable_partitionwise_aggregateenable_partitionwise_join
Set to on to enable optimizations for partitioned tables.