The pg_query_state module allows checking the current state of query execution in a working backend process.
After the stage of
planning/optimization, a plan
tree is built for each non-utility query
(SELECT/INSERT/
UPDATE/DELETE). The plan tree is an
imperative representation of a declarative SQL query. The
EXPLAIN ANALYZE query allows showing execution statistics
collected from each node of the plan tree, for example, total execution
time, number of rows passed to upper nodes, etc. However, these statistics
are collected only after the query is executed. The
pg_query_state module allows showing the
up-to-date statistics of a query being executed in an external backend
process. The output format is almost identical to the regular
EXPLAIN ANALYZE query. Therefore, you can track the
progress of the query execution itself. This module can explore an external
backend process and determine its actual state. It is particularly helpful
when the backend process executes a heavy query or gets stuck.
The pg_query_state module allows you to:
Detect complex queries, which is possible together with other monitoring tools.
Observe query execution.
To install pg_query_state, complete the following steps:
Add pg_query_state to the
shared_preload_libraries variable in the
postgresql.conf file:
shared_preload_libraries = 'pg_query_state'
Reload the database server for the changes to take effect.
Create the pg_query_state extension:
CREATE EXTENSION pg_query_state;
If you need to upgrade the pg_query_state
module, use the
ALTER EXTENSION UPDATE
command.
pg_query_state (pid integer, verbose boolean, costs boolean, timing boolean, buffers boolean, triggers boolean, format text) returns table
#
Extracts the state of the current query from a backend process with
the specified pid. Since parallel queries can
spawn multiple workers and a function call produces nested subqueries
so that the execution state may be viewed as a stack of running
queries, the value returned by the pg_query_state
function has the following format:
TABLE (pidinteger,frame_numberinteger,query_texttext,plantext,leader_pidinteger)
It represents the tree structure consisting of a leader process and
its spawned workers identified by pid. Each worker
refers to the leader through the leader_pid value.
For the leader process, this value is null.
The state of each process is represented as a stack of function calls.
Each frame of that stack is specified as a correspondence between
frame_number starting from zero, query_text,
and plan with the current statistics columns.
Therefore, you can see the states of the main query and queries generated from function calls for the leader process and all workers spawned from it.
During execution, some plan tree nodes can make full execution loops.
Therefore, statistics for such nodes consist of two parts: average
statistics for previous loops like in the
EXPLAIN ANALYZE output and statistics for the
current loop being executed.
The function has the following optional arguments:
verbose — when set to
true, the function uses
EXPLAIN VERBOSE to print the plan. The
default value is false.
costs — when set to
true, the function prints execution costs
for each node. The default value is false.
timing — when set to
true, the function prints execution time
statistics for each node. If time statistics collection is
disabled, the corresponding message is displayed. The default
value is false.
buffers — when set to
true, the function prints buffers usage
statistics. If buffer statistics collection is disabled, the
corresponding message is displayed. The default value is
false.
triggers — when set to
true, the function includes trigger statistics
in resulting plan trees. The default value is
false.
format — the
EXPLAIN format to be used for plan printing.
The allowed values are text,
xml, json, and
yaml. The default value is
text.
If the called backend process is not executing any query, the function returns an information message about the backend state taken from the pg_stat_activity view in case this information is available.
This function can be called only by members of the role that owns the backend process or superusers.
pg_progress_bar (pid integer) returns float
#
Returns the current progress of query execution. This function
extracts the state of the current query from the backend process with
the specified pid. Then, it collects the actual
and expected row counts from all plan nodes and computes the overall
progress ratio for the entire execution tree. As a result, the
function returns a numeric value from 0 to
1 that means the query execution progress. This
function is useful for visual representation of the query execution
progress.
If there is no information about the state of the current query or it is impossible to calculate the query execution progress, the corresponding message is displayed.
This function can be called only by members of the role that owns the backend process or superusers.
pg_progress_bar_visual (pid integer, delay integer) returns void
#
Cyclically calculates the current progress of query execution and
returns it in the text form. The function extracts the state of the
current query from the backend process with the specified
pid with the interval defined in the
delay parameter (in seconds).
This function is a cyclic version of the
pg_progress_bar
function but returns void.
If there is no information about the current query state or it is impossible to calculate the query execution progress, the corresponding message is displayed.
This function can be called only by members of the role that owns the backend process or superusers.
There are several configuration parameters that can be used to configure the module itself and the collection of specific statistics during query execution. The parameters are set on the called side before running queries whose states are attempted to extract.
pg_query_state.enable (boolean)
#Enables or disables the pg_query_state module.
Default: true.
pg_query_state.enable_timing (boolean)
#Enables or disables collection of execution time statistics for each node. If this parameter is disabled, the calling side cannot get timing statistics.
Default: false.
pg_query_state.enable_buffers (boolean)
#Enables or disables collection of buffer usage statistics. If this parameter is disabled, the calling side cannot get buffer usage statistics.
Default: false.
Set the maximum number of parallel workers on the Gather
node to 2:
postgres=# SET max_parallel_workers_per_gather = 2;
Assume that one backend process with PID equal to 49265
performs a simple query:
postgres=# SELECT pg_backend_pid(); pg_backend_pid ---------------- 49265 (1 row) postgres=# SELECT count(*) FROM foo JOIN bar ON foo.c1=bar.c1;
The other backend process can extract an intermediate execution state of this query:
postgres=# \x
postgres=# SELECT * FROM pg_query_state(49265);
-[ RECORD 1 ]+-------------------------------------------------------------------------------------------------------------------------
pid | 49265
frame_number | 0
query_text | SELECT count(*) FROM foo JOIN bar ON foo.c1=bar.c1;
plan | Finalize Aggregate (Current loop: actual rows=0, loop number=1) +
| -> Gather (Current loop: actual rows=0, loop number=1) +
| Workers Planned: 2 +
| Workers Launched: 2 +
| -> Partial Aggregate (Current loop: actual rows=0, loop number=1) +
| -> Nested Loop (Current loop: actual rows=12, loop number=1) +
| Join Filter: (foo.c1 = bar.c1) +
| Rows Removed by Join Filter: 5673232 +
| -> Parallel Seq Scan on foo (Current loop: actual rows=12, loop number=1) +
| -> Seq Scan on bar (actual rows=500000 loops=11) (Current loop: actual rows=173244, loop number=12)
leader_pid | (null)
-[ RECORD 2 ]+-------------------------------------------------------------------------------------------------------------------------
pid | 49324
frame_number | 0
query_text | <parallel query>
plan | Partial Aggregate (Current loop: actual rows=0, loop number=1) +
| -> Nested Loop (Current loop: actual rows=10, loop number=1) +
| Join Filter: (foo.c1 = bar.c1) +
| Rows Removed by Join Filter: 4896779 +
| -> Parallel Seq Scan on foo (Current loop: actual rows=10, loop number=1) +
| -> Seq Scan on bar (actual rows=500000 loops=9) (Current loop: actual rows=396789, loop number=10)
leader_pid | 49265
-[ RECORD 3 ]+-------------------------------------------------------------------------------------------------------------------------
pid | 49323
frame_number | 0
query_text | <parallel query>
plan | Partial Aggregate (Current loop: actual rows=0, loop number=1) +
| -> Nested Loop (Current loop: actual rows=11, loop number=1) +
| Join Filter: (foo.c1 = bar.c1) +
| Rows Removed by Join Filter: 5268783 +
| -> Parallel Seq Scan on foo (Current loop: actual rows=11, loop number=1) +
| -> Seq Scan on bar (actual rows=500000 loops=10) (Current loop: actual rows=268794, loop number=11)
leader_pid | 49265
In the example above, a working backend process spawns two parallel workers
with PIDs equal to 49324 and 49323.
Their leader_pid values show that these workers belong
to the main backend process. The Seq Scan node shows
statistics on the passed loops (the average number of rows passed to
Nested Loop and the number of passed loops) and
statistics on the current loop. Other nodes show statistics only on the
current loop as this loop is first (loop number=1).
Assume that the first backend process calls a function:
postgres=# SELECT n_join_foo_bar();
The other backend process can get the following output:
postgres=# SELECT * FROM pg_query_state(49265);
-[ RECORD 1 ]+------------------------------------------------------------------------------------------------------------------
pid | 49265
frame_number | 0
query_text | SELECT n_join_foo_bar();
plan | Result (Current loop: actual rows=0, loop number=1)
leader_pid | (null)
-[ RECORD 2 ]+------------------------------------------------------------------------------------------------------------------
pid | 49265
frame_number | 1
query_text | SELECT (select count(*) FROM foo JOIN bar ON foo.c1=bar.c1)
plan | Result (Current loop: actual rows=0, loop number=1) +
| InitPlan 1 (returns $0) +
| -> Aggregate (Current loop: actual rows=0, loop number=1) +
| -> Nested Loop (Current loop: actual rows=51, loop number=1) +
| Join Filter: (foo.c1 = bar.c1) +
| Rows Removed by Join Filter: 51636304 +
| -> Seq Scan on bar (Current loop: actual rows=52, loop number=1) +
| -> Materialize (actual rows=1000000 loops=51) (Current loop: actual rows=636355, loop number=52)+
| -> Seq Scan on foo (Current loop: actual rows=1000000, loop number=1)
leader_pid | (null)
The first row corresponds to the function call, the second row represents the query executed in the function body.
You can get resulting plans in a different format, for example, JSON:
postgres=# SELECT * FROM pg_query_state(pid := 49265, format := 'json');
-[ RECORD 1 ]+------------------------------------------------------------
pid | 49265
frame_number | 0
query_text | SELECT * FROM n_join_foo_bar();
plan | { +
| "Plan": { +
| "Node Type": "Function Scan", +
| "Parallel Aware": false, +
| "Function Name": "n_join_foo_bar", +
| "Alias": "n_join_foo_bar", +
| "Current loop": { +
| "Actual Loop Number": 1, +
| "Actual Rows": 0 +
| } +
| } +
| }
leader_pid | (null)
-[ RECORD 2 ]+------------------------------------------------------------
pid | 49265
frame_number | 1
query_text | SELECT (SELECT count(*) FROM foo JOIN bar ON foo.c1=bar.c1)
plan | { +
| "Plan": { +
| "Node Type": "Result", +
| "Parallel Aware": false, +
| "Current loop": { +
| "Actual Loop Number": 1, +
| "Actual Rows": 0 +
| }, +
| "Plans": [ +
| { +
| "Node Type": "Aggregate", +
| "Strategy": "Plain", +
| "Partial Mode": "Simple", +
| "Parent Relationship": "InitPlan", +
| "Subplan Name": "InitPlan 1 (returns $0)", +
| "Parallel Aware": false, +
| "Current loop": { +
| "Actual Loop Number": 1, +
| "Actual Rows": 0 +
| }, +
| "Plans": [ +
| { +
| "Node Type": "Nested Loop", +
| "Parent Relationship": "Outer", +
| "Parallel Aware": false, +
| "Join Type": "Inner", +
| "Current loop": { +
| "Actual Loop Number": 1, +
| "Actual Rows": 610 +
| }, +
| "Join Filter": "(foo.c1 = bar.c1)", +
| "Rows Removed by Join Filter": 610072944, +
| "Plans": [ +
| { +
| "Node Type": "Seq Scan", +
| "Parent Relationship": "Outer", +
| "Parallel Aware": false, +
| "Relation Name": "bar", +
| "Alias": "bar", +
| "Current loop": { +
| "Actual Loop Number": 1, +
| "Actual Rows": 611 +
| } +
| }, +
| { +
| "Node Type": "Materialize", +
| "Parent Relationship": "Inner", +
| "Parallel Aware": false, +
| "Actual Rows": 1000000, +
| "Actual Loops": 610, +
| "Current loop": { +
| "Actual Loop Number": 611, +
| "Actual Rows": 73554 +
| }, +
| "Plans": [ +
| { +
| "Node Type": "Seq Scan", +
| "Parent Relationship": "Outer", +
| "Parallel Aware": false, +
| "Relation Name": "foo", +
| "Alias": "foo", +
| "Current loop": { +
| "Actual Loop Number": 1, +
| "Actual Rows": 1000000 +
| } +
| } +
| ] +
| } +
| ] +
| } +
| ] +
| } +
| ] +
| } +
| }
leader_pid | (null)
The first backend process executes the following sample query:
postgres=# INSERT INTO table_name SELECT generate_series(1,10000000);
Another backend process can get the execution process of the query above as follows:
postgres=# SELECT pid FROM pg_stat_activity WHERE query LIKE 'insert%';
pid
-------
23877
(1 row)
postgres=# SELECT pg_progress_bar(23877);
pg_progress_bar
-----------------
0.6087927
(1 row)
To display a visual representation of the query execution progress, run the following query:
postgres=# SELECT pg_progress_bar_visual(23877, 1);
Progress = 0.043510
Progress = 0.085242
Progress = 0.124921
Progress = 0.168168
Progress = 0.213803
Progress = 0.250362
Progress = 0.292632
Progress = 0.331454
Progress = 0.367509
Progress = 0.407450
Progress = 0.448646
Progress = 0.488171
Progress = 0.530559
Progress = 0.565558
Progress = 0.608039
Progress = 0.645778
Progress = 0.654842
Progress = 0.699006
Progress = 0.735760
Progress = 0.787641
Progress = 0.832160
Progress = 0.871077
Progress = 0.911858
Progress = 0.956362
Progress = 0.995097
Progress = 1.000000
pg_progress_bar_visual
------------------------
1
(1 row)