During the execution of distributed queries, Shardman sends derived SQL queries
to the remote nodes that hold the data partitions involved in the query.
These derived queries are called query fragments. Shardman sends them using
the postgres_fdw extension. The node that queries the distributed table is called
the coordinator, while the nodes that accept query fragments are called
shards.
When the pgpro_stats extension is enabled on a Shardman cluster node, it collects statistics about local queries only. Information about distributed queries initiated by this node is incomplete because it lacks data about the remote query fragments. The statistics for query fragments received from other nodes are also ambiguous because there is no simple way for a user to determine which distributed query a fragment belongs to.
To address these issues, pgpro_stats for Shardman includes additional features.
When a node receives a query fragment, it saves the fragment's statistics into a separate
shared hash table. This information is not returned when querying the pgpro_stats_statements
view. Periodically and asynchronously, each node sends this information from the separate table to
the coordinator corresponding to the query. The coordinator aggregates the statistical data
obtained from query fragments with the statistics of its parent query, which is the query
initiated by the client.
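The flow described above can be sketched as a small in-memory model. This is only an illustration, not the pgpro_stats implementation: the class names `Shard`, `Coordinator`, and `FragmentStats`, the three counters, and the use of a parent query identifier as the key are all assumptions made for the example.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class FragmentStats:
    """Hypothetical counters; real pgpro_stats entries hold many more fields."""
    calls: int = 0
    total_time_ms: float = 0.0
    rows: int = 0

    def merge(self, other: "FragmentStats") -> None:
        # Accumulate another entry into this one.
        self.calls += other.calls
        self.total_time_ms += other.total_time_ms
        self.rows += other.rows


class Shard:
    """Keeps fragment statistics in a table separate from local statistics."""

    def __init__(self) -> None:
        # Assumed key: (coordinator name, parent query id).
        self.pending = defaultdict(FragmentStats)

    def record_fragment(self, coordinator: str, parent_query_id: int,
                        stats: FragmentStats) -> None:
        self.pending[(coordinator, parent_query_id)].merge(stats)

    def drain(self) -> dict:
        """Group pending entries by coordinator and empty the table."""
        batches = defaultdict(list)
        for (coordinator, parent_query_id), stats in self.pending.items():
            batches[coordinator].append((parent_query_id, stats))
        self.pending.clear()
        return dict(batches)


class Coordinator:
    """Aggregates received fragment statistics under their parent query."""

    def __init__(self) -> None:
        self.fragments_by_parent = defaultdict(FragmentStats)

    def receive(self, batch: list) -> None:
        for parent_query_id, stats in batch:
            self.fragments_by_parent[parent_query_id].merge(stats)
```

In this model, a shard calls `record_fragment` for every fragment it executes, and a periodic task passes each batch returned by `drain()` to the `receive` method of the matching coordinator.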
The pgpro_stats extension starts a separate background worker. This worker is responsible for sending the accumulated statistics to the coordinator nodes either every 5 seconds or when triggered by the guard latch. The collecting function sets this latch when the hash table is almost full.
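The wake-up logic of such a worker can be sketched with a thread and an event object standing in for the latch. This is a minimal model under stated assumptions, not the actual background worker: the class `StatsSender`, the `send` callback, the table capacity, and the 90% "almost full" threshold are hypothetical; only the 5-second interval comes from the text.

```python
import threading

FLUSH_INTERVAL_SEC = 5.0   # interval stated in the text
TABLE_CAPACITY = 1000      # hypothetical capacity of the hash table
ALMOST_FULL_RATIO = 0.9    # hypothetical "almost full" threshold


class StatsSender:
    def __init__(self, send, interval=FLUSH_INTERVAL_SEC,
                 capacity=TABLE_CAPACITY, almost_full_ratio=ALMOST_FULL_RATIO):
        self._send = send                      # callback that ships a batch
        self._interval = interval
        self._threshold = max(1, int(capacity * almost_full_ratio))
        self._entries = []
        self._lock = threading.Lock()
        self._latch = threading.Event()        # stands in for the guard latch
        self._stop = threading.Event()

    def collect(self, entry):
        """Called by the collecting side; sets the latch when almost full."""
        with self._lock:
            self._entries.append(entry)
            if len(self._entries) >= self._threshold:
                self._latch.set()

    def flush(self):
        """Swap out the accumulated entries and send them, if any."""
        with self._lock:
            batch, self._entries = self._entries, []
        if batch:
            self._send(batch)

    def run(self):
        """Worker loop: wake on timeout or on the latch, then flush."""
        while not self._stop.is_set():
            self._latch.wait(timeout=self._interval)
            self._latch.clear()
            self.flush()

    def stop(self):
        self._stop.set()
        self._latch.set()                      # wake the worker so it can exit
```

The worker would be started with `threading.Thread(target=sender.run, daemon=True).start()`. Waiting on the event with a timeout covers both triggers in one call: the wait returns either when the interval elapses or as soon as `collect` sets the latch.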
To reduce the network traffic generated by the statistics sender, the statistics data is compressed before it is sent.
The compression method can be selected with the pgpro_stats.transport_compression configuration parameter (GUC).
Each node stores the cumulative number of statistics entries received from each shard node and the timestamp of the most recent receipt. When a coordinator node receives a statistics message, it updates these values, which are accessible through the SQL interface.
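The bookkeeping described here amounts to a running counter and a timestamp per sending shard. The following sketch illustrates that idea only; the names `ReceiveLog`, `ReceiveCounters`, and `on_message` are hypothetical and do not correspond to pgpro_stats functions or views.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Dict, Optional


@dataclass
class ReceiveCounters:
    entries_received: int = 0                    # cumulative entry count
    last_received_at: Optional[datetime] = None  # time of the latest message


class ReceiveLog:
    """Per-shard receive counters kept on a coordinator (illustrative)."""

    def __init__(self) -> None:
        self.by_shard: Dict[str, ReceiveCounters] = {}

    def on_message(self, shard: str, n_entries: int,
                   now: Optional[datetime] = None) -> ReceiveCounters:
        # Update the counters each time a statistics message arrives.
        counters = self.by_shard.setdefault(shard, ReceiveCounters())
        counters.entries_received += n_entries
        counters.last_received_at = now or datetime.now(timezone.utc)
        return counters
```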
The Shardman additions to pgpro_stats introduce additional SQL functions,
described in Section 8.2, and configuration parameters,
described in the section called “pgpro_stats parameters”.