Shardman databases require periodic maintenance, known as vacuuming.
For many installations, it is sufficient to let vacuuming be performed
by the autovacuum daemon. As in PostgreSQL
installations, the autovacuum daemon will issue ANALYZE
commands automatically whenever the content of a table has changed
sufficiently. The ANALYZE command transfers statistics
for foreign partitions from remote nodes.
Rebalance process can move or copy data between cluster nodes. After this operation, all transferred objects are analyzed automatically. As usual, local statistics are gathered, and remote statistics are fetched from foreign servers.
When running for the entire database, the ANALYZE
command calculates statistics for local partitions of sharded tables
on the current node and for global tables. It also transfers statistics
for foreign partitions available on remote nodes. However, note that
remote nodes may lack the statistics.
Still, running ANALYZE on each node may not give
you full cluster-wide statistics, as they may have not been calculated
on some nodes. Also, global tables are analyzed multiple times (for
each node).
Instead, use the shardman.global_analyze()
function to perform cluster-wide analysis. First, this function executes
ANALYZE on each node for all local partitions of
sharded tables, then sends these statistics to other nodes. Next, it
selects one node per a global table and runs ANALYZE
of this table on the selected node. Gathered statistics are broadcast
to all other nodes in the cluster.
When ANALYZE is run on a global table, only statistics
for the corresponding local table on the current node are updated.
When ANALYZE is run on a sharded table, statistics
on its local partitions are updated, statistics for foreign partitions
are transferred from remote nodes, if remote nodes have it. If
statistics are missing on remote nodes, they are not calculated. If
you run ANALYZE for the local table, only its
statistics are updated.
When ANALYZE is run on a foreign table directly
and a remote node does not have any statistics for the corresponding
local table, this local table is analyzed remotely. Then statistics
are transferred from the remote node.
The VACUUM command running for the whole database
performs vacuuming on the current node.
You can broadcast this command to perform cluster-wide vacuuming on
all nodes. Set the shardman.broadcast_ddl
configuration parameter to on, run VACUUM,
and set shardman.broadcast_ddl to off.
When VACUUM is run on a sharded or global table,
the statement is broadcast automatically. On all nodes, the command
vacuums all table partitions of the sharded table and all local tables
of the global table.