Shardman includes some additional features and modules, imported from Postgres Pro Enterprise, namely AQO (Adaptive query optimization), CFS (Compressed File System) and pgpro_stats module.
AQO (Adaptive query optimization) is a Shardman extension which uses query execution statistics for improving cardinality estimation.
To turn on AQO:
Add aqo to shared_preload_libraries
parameter in sdmspec.json.
Create extension aqo on all nodes.
SET shardman.broadcast_ddl TO ON;
CREATE EXTENSION aqo;
RESET shardman.broadcast_ddl;
Set aqo.mode to learn and run queries which you want to
optimize with EXPLAIN ANALYZE until plan stops changing.
BEGIN;
SET aqo.mode = 'learn';
EXPLAIN ANALYZE <query>
RESET aqo.mode;
COMMIT;
Note that aqo statistics is separate on all nodes in Shardman
cluster. So you'll have to repeat this process on each node in the cluster. Alternatively,
you can set aqo.mode to learn and run your application
for some time and later turn it back to default mode (controlled).
AQO will not be activated if you join less than aqo.join_threshold
relations (3 by default).
Complete AQO documentation can be found here.
CFS enables page level compression in Shardman. Compression can only be enabled for separate tablespaces. To compress a tablespace, you should enable the compression option when creating this tablespace. For example:
CREATE TABLESPACE data LOCATION '/mnt/data-{rgid}' WITH (global, compression='zlib');
Now you can create tables and indexes in this tablespace or move existing table or index into it.
CREATE TABLE pgbench_branches (
bid integer NOT NULL PRIMARY KEY USING INDEX TABLESPACE data,
bbalance integer,
filler character(88)
)
WITH (distributed_by = 'bid') TABLESPACE data;
Function cfs_compression_ratio() returns real compression ration for all segments of the compressed relation.
However, it returns NaN for partitioned and foreign tables, so it works only for local partitions of
a sharded table.
Complete CFS documentation can be found here.
The pgpro_stats extension provides a means for tracking planning and execution statistics of all SQL statements executed by a server. At this moment pgpro_stats tracks the statistics only for the statements originated from the current node and excludes the statistics for internal requests in case of distributed queries. This means that it shows information only for local part of a query execution.
Complete pgpro_stats documentation can be found here.