The aqo module is a Postgres Pro Shardman extension for cost-based query optimization. Using machine learning methods, more precisely, a modification of the k-NN algorithm, aqo improves cardinality estimation, which can optimize execution plans and, consequently, speed up query execution.
The aqo module can collect statistics on all the executed queries, excluding queries that access system relations. The collected statistics are classified by query classes. If queries differ in their constants only, they belong to the same class. For each query class, aqo stores the cardinality quality, planning time, execution time, and execution statistics for machine learning. Based on this data, aqo builds a new query plan and uses it for the next query of the same class. aqo test runs have shown significant performance improvements for complex queries.
aqo saves all the learning data (aqo_data), queries (aqo_query_texts), query settings (aqo_queries), and query execution statistics (aqo_query_stat) to files. When aqo starts, it loads this data to shared memory. You can access aqo data through functions and views.
aqo can work in basic
and advanced modes. In the
advanced mode, when aqo is run in the
auto, learn, or intelligent
operation mode, a unique hash value, which is computed from the query tree,
is assigned to each query class to identify it and separate the collected
statistics. In the basic mode, the statistics for all untracked query classes
are stored in a common query class with a hash value that equals to 0.
Each query class has an associated separate space called feature
space, in which the statistics for this query class are collected.
This feature space is identified by a hash value (fs)
called a basic hash, which is the same for queries
that only differ in table names, so the learning data is aggregated for
such queries. Each feature space has associated feature subspaces,
where the information about selectivity and cardinality for each query plan
node is collected. Each subspace is also identified by a hash value
(fss).
The aqo extension is included into Postgres Pro Shardman. Once you have Postgres Pro Shardman installed, complete the following steps to enable aqo:
Add aqo to the
shared_preload_libraries variable in the
postgresql.conf file.
shared_preload_libraries = 'aqo'
The aqo library must be preloaded at the server startup, since
adaptive query optimization needs to be enabled per cluster.
Create the aqo extension using the following query:
CREATE EXTENSION aqo;
Enable the aqo extension by setting the
aqo.enable parameter to on.
ALTER SYSTEM SET aqo.enable = 'on';
For smooth physical replication transferring aqo data from
the primary to a replica, ensure that the same aqo versions
are installed on both. You can have different aqo versions
installed, but in this case, set aqo.wal_rw to off
on both and anticipate no replication.
To temporarily disable aqo for all queries in
the current session or for the whole cluster but do not remove or change
collected statistics and settings, you can set the
aqo.enable parameter to off.
ALTER SYSTEM SET aqo.enable = 'off'; SELECT pg_reload_conf();
Another way to disable aqo in the current database is to drop the extension.
DROP EXTENSION aqo;
To remove all the aqo data including the
collected statistics from the current database, call the
aqo_reset function as follows:
SELECT aqo_reset();
To remove all the data from the aqo storage, run the following:
SELECT aqo_reset(NULL);
If you do not want aqo to be loaded at the server restart,
remove the following line from the postgresql.conf file:
shared_preload_libraries = 'aqo'
aqo currently has the following limitations:
Query optimization with aqo does not work for queries that
contain IMMUTABLE functions.
aqo does not collect statistics on replicas because replicas are read-only. However, aqo may use query execution statistics from the primary if the replica is physical.
auto, learn, and intelligent
modes are not supposed to work for a whole cluster with queries having
a dynamically generated structure because these modes store all query
class IDs, which are different for all queries in such a workload.
Dynamically generated constants are supported, however.
aqo behavior is primarily managed using the
aqo.mode and aqo.advanced
configuration parameters. By default, the aqo.advanced
parameter is set to off. This means that aqo
works in the basic mode. When
this parameter is set to on, aqo
works in the advanced mode.
The exact operation mode is defined by the aqo.mode
parameter. The default value is auto.
To dynamically change the operation mode in your current session, run the following command:
SET aqo.mode = 'mode';
Here mode is the name of the operation mode to use.
To switch modes at the level of a server instance, run the following:
ALTER SYSTEM SET aqo.mode = 'mode';
SELECT pg_reload_conf();
In the default auto mode, aqo
collects statistics on all the executed queries for plan nodes
identified by fss, as well as learns and makes
predictions based on these statistics. The collected machine learning
data is used to correct the cardinality error for all queries whose
plans contain certain plan nodes. Statistics for all query classes
are stored in a common query class with a hash value (fs)
equals to 0.
aqo implements the least recently used
(LRU) mechanism to remove data when any of the
aqo.dsm_size_max, aqo.fs_max_items,
and aqo.fss_max_items limits is reached.
aqo uses two distinct caches for
aqo_data and aqo_query_texts.
The LRU algorithm first removes data from
aqo_query_texts, if any, and only then from
aqo_data.
In the learn mode, aqo behaves
like in the auto mode but does not use the LRU
cache mechanism. In this mode, when reaching cache limits,
aqo stops further learning. The selectivity
and cardinality for new query plan nodes are no longer collected, and new
learning data does not appear in the corresponding views.
The intelligent mode with the disabled
aqo.advanced parameter works exactly like the
learn mode.
It is not recommended to use the auto or learn
mode permanently for a whole cluster in production because this may lead to
unnecessary computational overhead and cause slight performance degradation.
Execute queries that you need to optimize several times until their plans
become good enough or stop changing and switch the mode to frozen
or controlled. In the frozen mode,
aqo makes predictions only for known queries
but does not learn from any queries. Choose this mode due to lower overhead
if tables involved in queries being optimized are changing rarely.
Otherwise, choose the controlled mode, in which
aqo makes predictions for known queries, as well
as learns from them. Note that frozen and
controlled modes should be used only after
aqo already learned in the auto
or learn mode.
The machine learning data is applied not only to the queries on which aqo learned but to all the queries whose plans contain nodes for which the statistics were collected. To prevent machine learning data from affecting other queries, use the advanced mode. Refer to the section below for details.
You can view the current query plan using the EXPLAIN
command with the ANALYZE option. For details, see the
Section 14.1.
If you often run queries of the same class, for example, your application
limits the number of possible query classes, you can set the
aqo.advanced parameter to on.
In the auto (default), learn, and
intelligent operation modes, aqo analyzes each query
execution and stores statistics on queries of different classes separately.
aqo operation in the auto
and learn modes are primarily identical. The only difference
is how these modes handle cases when cache limits are reached.
In the auto mode, aqo uses
the LRU mechanism to remove data, while in the
learn mode, aqo stops learning.
For more detailed information, refer to the description of the
basic mode.
To automatically identify which queries aqo
can optimize, switch the operation mode to intelligent.
In this mode, aqo saves new queries with
the enabled auto_tuning value. See the description
of the aqo_queries view for more details. If query
performance is not improved after 50 optimization iterations,
aqo stops working and falls back to the default
query planner.
As in the basic mode, it is not
recommended to use the auto, learn,
or intelligent mode permanently for a whole production
cluster because this may lead to overhead and slight performance degradation.
After aqo learned in one of these modes, switch
to the frozen or controlled
mode.
The advanced mode is not suitable when queries in the workload are of multiple different classes or these classes are constantly changing. In such cases, use the basic mode instead.
You must have superuser rights to access aqo views and configure advanced query parameters.
If data may change significantly between queries, you can enable the aqo.delta_rows parameter. In this case, aqo makes predictions based on row count estimates of the standard planner. For example, if data is deleted from a table, the planner estimates fewer rows than before. aqo can then use this updated information to predict a lower row count without requiring additional learning steps.
You can view all the processed query classes and their corresponding hash values in the aqo_query_texts view.
SELECT * FROM aqo_query_texts;
To find out a query class, that is, hash, and the operation mode, set the
aqo.show_hash and aqo.show_details
parameters to on and execute the query. As a result, the
output contains something like this:
... Planning Time: 23.538 ms ... Execution Time: 249813.875 ms ... Using aqo: true AQO mode: LEARN AQO advanced: OFF ... Query hash: -2439501042637610315
Each query class has its own optimization settings. These settings are shown in the aqo_queries view.
SELECT * FROM aqo_queries;
You can manually change these settings to adjust the optimization for a particular query class. For example:
-- Add a new query class to the aqo_queries view: SET aqo.enable='on'; SET aqo.advanced='on'; SET aqo.mode='intelligent'; SELECT * FROM a, b WHERE a.id=b.id; SET aqo.mode='controlled'; -- Disable auto_tuning, enable both learn_aqo and use_aqo -- for this query class: SELECT count(*) FROM aqo_queries, LATERAL aqo_queries_update(fs, NULL, true, true, false) WHERE fs = (SELECT fs FROM aqo_query_texts WHERE query_text LIKE 'SELECT * FROM a, b WHERE a.id=b.id;'); -- Run EXPLAIN ANALYZE while the plan changes: EXPLAIN ANALYZE SELECT * FROM a, b WHERE a.id=b.id; EXPLAIN ANALYZE SELECT * FROM a, b WHERE a.id=b.id; -- Disable learning to stop statistics collection -- and use the optimized plan: SELECT count(*) FROM aqo_queries, LATERAL aqo_queries_update(fs, NULL, false, true, false) WHERE fs = (SELECT fs FROM aqo_query_texts WHERE query_text LIKE 'SELECT * FROM a, b WHERE a.id=b.id;');
To stop intelligent tuning for a particular query class, disable the
auto_tuning field.
SELECT count(*) FROM aqo_queries, LATERAL aqo_queries_update(fs, NULL, NULL, NULL, false) WHERE fs = 'hash';
where hash is the hash value for this query class.
As a result, aqo disables automatic change of the
learn_aqo and use_aqo settings.
To disable further learning for a particular query class, use the following command:
SELECT count(*) FROM aqo_queries, LATERAL aqo_queries_update(fs, NULL, false, NULL, false) WHERE fs = 'hash';
where hash is the hash value for this query class.
To fully disable aqo for all queries and use the standard planner, run the following:
SELECT count(*) FROM aqo_queries, LATERAL aqo_disable_class(fs, NULL) WHERE fs <> 0;
You can experiment with aqo without touching its main knowledge base. To do this, execute the command:
SET aqo.sandbox = ON;
This turns on the sandbox mode, which means that aqo will work in the isolated environment. However, if you turn on aqo.sandbox in different SQL sessions, they will use the same data.
Data obtained in the sandbox mode does not get replicated. But the sandbox mode
can be used on a standby. Moreover, the only way to train aqo
on a standby is turning on the sandbox mode when the replication is turned on,
that is, aqo.wal_rw is true.
Without the sandbox mode, aqo will work on the
standby as if aqo.mode = FROZEN, that
is, it will be able to use the existing knowledge base, but not update or
extend it.
aqo.enable (boolean) #
Enables or disables aqo. If set to
off, aqo does not work
except when the aqo.force_collect_stat
parameter is set to on.
Default: off.
aqo.mode (text)
#Sets the aqo operation mode. Possible values:
auto — collects statistics on all
the executed queries, as well as learns and makes predictions based on
these statistics. This mode prevents memory overflow by
implementing the least recently used (LRU)
mechanism to clear caches when reaching limits.
learn — behaves like the
auto mode but does not use the LRU
cache mechanism. In this mode, when reaching cache limits,
aqo stops learning.
intelligent — analyzes each query
execution and stores statistics. Statistics on queries of
different classes are stored separately. If performance is not
improved after 50 iterations, aqo
is disabled. This mode works in this way only if the
aqo.advanced parameter is set to
on, otherwise, it works exactly like the
learn mode.
controlled — only learns and makes
predictions for known queries.
frozen — makes predictions for known
queries but does not learn from any queries.
For detailed information about aqo work in all modes, refer to Usage.
Default: auto.
aqo.advanced (boolean) #Enables the advanced learning routine, which saves separate learning
statistics for each query class. Also allows fine-tuning
the use_aqo and
learn_aqo settings in the
aqo_queries view.
Fine-tuned query settings in the aqo_query view
continue to work if aqo.advanced is disabled.
For detailed information about aqo work in all modes, refer to Usage.
Default: off.
aqo.force_collect_stat (boolean) #Collects statistics on query executions in all
aqo modes and even if the aqo.enable
parameter is set to off.
Default: off.
aqo.show_details (boolean) #Adds some details to EXPLAIN output of a query,
such as the prediction or feature-subspace hash, and shows some
additional aqo-specific on-screen information.
Default: on.
aqo.show_hash (boolean) #Shows a hash value that uniquely identifies
the class of queries or class of plan nodes.
aqo uses
the native query ID to identify a query class for consistency with other extensions,
such as pg_stat_statements. So, the query ID can be taken
from the Query hash field in
EXPLAIN ANALYZE output of a query.
Default: on.
aqo.join_threshold (integer) #Ignores queries that contain smaller number of joins, which means that statistics for such queries is not collected.
Default: 0 (no queries are ignored).
aqo.learn_statement_timeout (boolean) #Learns on a plan interrupted by the statement timeout.
Default: off.
aqo.statement_timeout (integer) #Defines the initial value of the smart statement timeout,
in milliseconds, which is needed to limit the execution time when manually training aqo
on special queries with a poor cardinality forecast.
aqo can dynamically change the value of the smart
statement timeout during this training. When the cardinality estimation error on nodes
exceeds 0.1, the value of aqo.statement_timeout is automatically
incremented exponentially, but remains not greater than statement_timeout.
Default: 0.
aqo.min_neighbors_for_predicting (integer) #Defines how many samples collected in previous executions of the query will be used to predict the cardinality next time. If there are fewer of them, aqo will not make any prediction. A too large value may affect performance, but a too small value may reduce the prediction quality.
Default: 3.
aqo.predict_with_few_neighbors (boolean) #Enables aqo to make predictions with
fewer neighbors than specified by
aqo.min_neighbors_for_predicting.
When set to off, then aqo
learns, but does not make predictions until the execution count
for the query with different constants reaches 3 (default for
aqo.min_neighbors_for_predicting).
Default: on.
aqo.fs_max_items (integer) #Defines the maximum number of feature spaces that aqo can operate with. When this limit is exceeded, aqo behavior depends on its operation mode. Refer to the description of the basic mode for details. This parameter can only be set at server start.
When running a standby server, you must set this parameter to have the same or higher value as on the primary server. Otherwise, consistency of aqo data on the standby server is not guaranteed.
Default: 10000.
aqo.fss_max_items (integer) #Defines the maximum number of feature subspaces that aqo can operate with. When this limit is exceeded, aqo behavior depends on its operation mode. Refer to the description of the basic mode for details. This parameter can only be set at server start.
When running a standby server, you must set this parameter to have the same or higher value as on the primary server. Otherwise, consistency of aqo data on the standby server is not guaranteed.
Default: 100000.
aqo.querytext_max_size (integer) #Defines the maximum size of the query in the aqo_query_texts view. This parameter can only be set at server start.
When running a standby server, you must set this parameter to have the same or higher value as on the primary server. Otherwise, consistency of aqo data on the standby server is not guaranteed.
Default: 1000.
aqo.dsm_size_max (integer) #Defines the maximum size of dynamic shared memory, in megabytes, that aqo can allocate to store learning data and query texts. If set to a number that is less than the size of the saved aqo data, the server will not start. When this limit is exceeded, aqo behavior depends on its operation mode. Refer to the description of the basic mode for details. This parameter can only be set at server start.
When running a standby server, you must set this parameter to have the same or higher value as on the primary server. Otherwise, consistency of aqo data on the standby server is not guaranteed.
Default: 100.
aqo.wal_rw (boolean) #Enables physical replication and allows complete
aqo data recovery after failure. When set to
off on the primary, no data is transferred from it
to a replica. When set to off on a replica, any data
transferred from the primary is ignored. With this value, when the server
fails, data can only be restored as of the last checkpoint.
This parameter can only be set at server start.
Default: on.
aqo.sandbox (boolean) #Enables reserving a separate memory area in shared memory to be used by a primary or standby node, which allows collecting and using statistics with the data in this memory area. If enabled on the primary, the extension uses the separate shared memory area that is not replicated to the standby. Changing the value of this parameter resets the aqo cache. Only superusers can change this setting.
Default: off.
aqo.delta_rows (boolean) #Enables a learning mechanism where aqo adjusts the planner's row count estimates with its own predictions. If disabled, aqo uses its own predictions.
Default: off.
aqo_query_texts #
The aqo_query_texts view classifies all
the query classes processed by aqo.
For each query class, the view shows the text of the first analyzed
query of this class. The number of rows is limited by
aqo.fs_max_items.
Table F.1. aqo_query_texts View
| Column Name | Description |
|---|---|
queryid | The unique identifier of the query. |
dbid | The identifier of the database. |
fs | The identifier of the feature space. |
query_text | Text of the first analyzed query of the given class. The query text length is limited by aqo.querytext_max_size. |
aqo_queries #
The aqo_queries view shows optimization
settings for different query classes. One query executed in two
different databases is stored twice although the
fs is the same. The number of
rows is limited by aqo.fs_max_items.
Table F.2. aqo_queries View
| Setting | Description |
|---|---|
fs | The identifier of the feature space. |
dbid | The identifier of the database in which the query was executed. |
learn_aqo | Shows whether statistics collection for this query class is enabled. |
use_aqo | Shows whether the aqo cardinality prediction for the next execution of this query class is enabled. |
auto_tuning |
Shows whether
aqo can dynamically change When For queries with |
smart_timeout | The value of the smart statement timeout for this query class. The initial value of the smart statement timeout for any query is defined by the statement_timeout configuration parameter. |
count_increase_timeout | Shows how many times the smart statement timeout increased for this query class. |
aqo_data #
The aqo_data view shows machine
learning data for cardinality estimation refinement. The number of
rows is limited by aqo.fss_max_items. To discard
all the collected statistics for a particular query class, you
can delete all rows from aqo_data with the
corresponding fs.
Table F.3. aqo_data View
| Data | Description |
|---|---|
fs | The identifier (hash) of the feature space. |
fss | The identifier (hash) of the feature subspace. |
dbid | The identifier of the database. |
delta_rows | If true, aqo makes
predictions based on the planner's estimates, otherwise
false. |
nfeatures | Feature-subspace size for the query plan node. |
features | Logarithm of the selectivity which the cardinality prediction is based on. |
targets | Cardinality logarithm for the query plan node. |
reliability | Confidence level of the learning statistics. Equals:
|
oids | List of IDs of tables that were involved in the prediction for this node. |
tmpoids | List of IDs of temporary tables that were involved in the prediction for this node. |
aqo_query_stat #
The aqo_query_stat view shows statistics
on query execution, by query class. aqo uses this data when
auto_tuning is enabled for a
particular query class.
Table F.4. aqo_query_stat View
| Data | Description |
|---|---|
fs | The identifier of the feature space. |
dbid | The identifier of the database. |
execution_time_with_aqo | Array of execution times for queries run with aqo enabled. |
execution_time_without_aqo | Array of execution times for queries run with aqo disabled. |
planning_time_with_aqo | Array of planning times for queries run with aqo enabled. |
planning_time_without_aqo | Array of planning times for queries run with aqo disabled. |
cardinality_error_with_aqo | Array of cardinality estimation errors in the selected query plans with aqo enabled. |
cardinality_error_without_aqo | Array of cardinality estimation errors in the selected query plans with aqo disabled. |
executions_with_aqo | Number of queries run with aqo enabled. |
executions_without_aqo | Number of queries run with aqo disabled. |
aqo adds several functions to Postgres Pro Shardman catalog.
Functions aqo_queries_update,
aqo_query_texts_update, aqo_query_stat_update,
aqo_data_update and aqo_data_delete
modify data files underlying aqo views. Therefore, call
these functions only if you understand the logic of adaptive query optimization.
aqo_cleanup() → setof integerRemoves data related to query classes that are linked (may be partially) with removed relations. Returns the number of removed feature spaces (classes) and feature subspaces. Insensitive to removing other objects.
aqo_enable_class (fs bigint, dbid oid) → voidSets learn_aqo, use_aqo
and auto_tuning (only in the intelligent mode)
to true for the query class with the specified fs and
dbid. You can set dbid to NULL
instead of the ID of the current database.
aqo_disable_class (fs bigint, dbid oid) → voidSets learn_aqo, use_aqo
and auto_tuning to false for the query class with the
specified fs and dbid.
You can set dbid to NULL
instead of the ID of the current database.
aqo_drop_class (fs bigint, dbid oid) → integerRemoves all data related to the specified query class and database from the aqo
storage. You can set dbid to NULL
instead of the ID of the current database. Returns the number of records removed from the
aqo storage.
aqo_reset (dbid oid) → bigintRemoves records from the specified
database: machine learning data, query texts, statistics and query class
preferences. If dbid is omitted, removes the data from
the current database. If dbid is NULL, removes all records
from the aqo storage. Returns the number of records removed.
aqo_queries_update (fs bigint, dbid oid, learn_aqo boolean, use_aqo boolean, auto_tuning boolean) → booleanUpdates or inserts a record in a data file underlying the aqo_queries view for
the specified fs and dbid. You can set dbid to NULL
instead of the ID of the current database. NULL values for parameters being set mean
leave them as is. Note that records with a zero value of fs or dbid
cannot be updated. Returns false in case of error,
true otherwise.
aqo_query_texts_update (fs bigint, dbid oid, query_text text) → booleanUpdates or inserts a record in a data file underlying the aqo_query_texts view
for the specified fs and dbid. You can set dbid to NULL
instead of the ID of the current database. Note that records with a zero value of fs
or dbid cannot be updated. Returns false in case of error,
true otherwise.
aqo_query_stat_update (fs bigint, dbid oid, execution_time_with_aqo double precision[], execution_time_without_aqo double precision[], planning_time_with_aqo double precision[], planning_time_without_aqo double precision[], cardinality_error_with_aqo double precision[], cardinality_error_without_aqo double precision[], executions_with_aqo bigint[], executions_without_aqo bigint[]) → booleanUpdates or inserts a record in a data file underlying the aqo_query_stat view
for the specified fs and dbid. You cannot update a record
for a common feature space, i.e., for fs or dbid equal to zero.
You can set dbid to NULL
instead of the ID of the current database. Returns false in case of error,
true otherwise.
aqo_data_update (fs bigint, fss integer, dbid oid, delta_rows boolean, nfeatures integer, features double precision[][], targets double precision[], reliability double precision[], oids oid[], tmpoids oid[]) → booleanUpdates or inserts a record in a data file underlying the aqo_data view
for the specified fs, fss and dbid.
You can set dbid to NULL
instead of the ID of the current database.
If you set delta_rows to NULL,
the value of aqo.delta_rows is used.
Returns false in case of error,
true otherwise.
aqo_data_delete (fs bigint, fss integer, dbid oid, delta_rows boolean) → booleanRemoves a record from a data file underlying the aqo_data view
for the specified fs, fss and dbid.
You can set dbid to NULL
instead of the ID of the current database.
If you omit delta_rows or set it to
NULL, the value of aqo.delta_rows is used.
Returns false in case of error,
true otherwise.
aqo_memory_usage () → setof recordShows allocated and used sizes of aqo memory contexts and hash tables. Returns a table:
nameShort description of the memory context or hash table
allocated_sizeTotal size of the allocated memory
used_sizeSize of the currently used memory
aqo_cardinality_error (controlled boolean) → setof recordShows the cardinality error for the last execution of queries.
If controlled is true, shows queries executed
with aqo enabled. If controlled
is false, shows queries that were executed with aqo
disabled, but that have collected aqo
statistics. Returns a table:
numSequential number
dbidThe identifier of the database
fsThe identifier of the feature space. Can be zero or the basic hash.
erroraqo error calculated on query plan nodes
nexecsNumber of executions of queries
associated with this fs
aqo_execution_time (controlled boolean) → setof recordShows the execution time for queries.
If controlled is true, shows the
execution time of the last execution with aqo enabled.
If controlled is false, returns
the average execution time for all logged executions
with aqo disabled.
Execution time without aqo can be collected
when aqo.mode = intelligent or
aqo.force_collect_stat = on.
Returns a table:
numSequential number
dbidThe identifier of the database
fsThe identifier of the feature space. Can be zero or the basic hash.
exec_timeIf controlled
= true, last query execution time with
aqo, otherwise,
average execution time for all executions without
aqo
nexecsNumber of executions of queries
associated with this fs
Example F.1. Learning on a Query (Basic Mode)
Consider optimization of a query using aqo.
When the query is executed for the first time, it is missing in tables
underlying aqo views. So there is no data for
predicting with aqo for each plan node, and
“AQO not used” lines appear in the EXPLAIN
output:
demo=# EXPLAIN (ANALYZE, SUMMARY OFF, TIMING OFF)
SELECT bp.*
FROM segments s
JOIN flights f ON f.flight_id = s.flight_id
JOIN boarding_passes bp ON bp.ticket_no = s.ticket_no AND bp.flight_id = s.flight_id
WHERE f.scheduled_departure > '2025-12-1 15:00:00+00';
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------
Hash Join (cost=79793.72..237497.86 rows=1201002 width=33) (actual rows=9455.00 loops=1)
AQO not used, fss=4871603661380287993
Hash Cond: ((s.flight_id = f.flight_id) AND (s.ticket_no = bp.ticket_no))
Buffers: shared hit=3713 read=50331, temp read=17210 written=17210
-> Seq Scan on segments s (cost=0.00..72572.70 rows=3941270 width=18) (actual rows=3941249.00 loops=1)
AQO not used, fss=-1745942650988724053
Buffers: shared hit=1853 read=31307
-> Hash (cost=52395.69..52395.69 rows=1201002 width=37) (actual rows=9455.00 loops=1)
Buckets: 131072 Batches: 16 Memory Usage: 1058kB
Buffers: shared hit=1860 read=19024, temp written=45
-> Hash Join (cost=608.55..52395.69 rows=1201002 width=37) (actual rows=9455.00 loops=1)
AQO not used, fss=4705493075117122362
Hash Cond: (bp.flight_id = f.flight_id)
Buffers: shared hit=1860 read=19024
-> Seq Scan on boarding_passes bp (cost=0.00..45318.32 rows=2463832 width=33) (actual rows=2463832.00 loops=1)
AQO not used, fss=1362775811343989307
Buffers: shared hit=1656 read=19024
-> Hash (cost=475.98..475.98 rows=10606 width=4) (actual rows=10594.00 loops=1)
Buckets: 16384 Batches: 1 Memory Usage: 501kB
Buffers: shared hit=204
-> Seq Scan on flights f (cost=0.00..475.98 rows=10606 width=4) (actual rows=10594.00 loops=1)
AQO not used, fss=3484507337497244877
Filter: (scheduled_departure > '2025-12-01 22:00:00+07'::timestamp with time zone)
Rows Removed by Filter: 11164
Buffers: shared hit=204
Planning:
Buffers: shared hit=53
Using aqo: true
AQO mode: AUTO
AQO advanced: OFF
Query hash: 0
JOINS: 2
(32 rows)
If there is no information on a certain node in the aqo_data
view, aqo will add the appropriate record
there for future learning and predictions except for nodes with
fss=0 in the EXPLAIN output.
As each of features and
targets in the aqo_data
view is a logarithm to base e,
to get the actual value, raise e to this power.
For example: exp(9.154298981092557):
demo=# select * from aqo_data;
fs | fss | dbid | delta_rows | nfeatures | features | targets | reliability | oids | tmpoids
----+----------------------+-------+------------+-----------+---------------------------------------------+----------------------+-------------+---------------------+---------
0 | 3484507337497244877 | 16556 | f | 1 | {{-0.7185575545175223}} | {9.268043082104471} | {1} | {17452} |
0 | -1745942650988724053 | 16556 | f | 0 | | {15.187008236114766} | {1} | {17488} |
0 | 4705493075117122362 | 16556 | f | 2 | {{-0.7185575545175223,-9.987736784981028}} | {9.154298981092557} | {1} | {17452,17438} |
0 | 1362775811343989307 | 16556 | f | 0 | | {14.71722841949288} | {1} | {17438} |
0 | 4871603661380287993 | 16556 | f | 2 | {{-0.7185575545175223,-14.672062325711414}} | {9.154298981092557} | {1} | {17488,17452,17438} |
(5 rows)
When the query is executed for the second time, aqo recognizes the query and makes a prediction. Pay attention to the cardinality predicted by aqo and the value of aqo error (“error=0%”).
demo=# EXPLAIN (ANALYZE, SUMMARY OFF, TIMING OFF)
SELECT bp.*
FROM segments s
JOIN flights f ON f.flight_id = s.flight_id
JOIN boarding_passes bp ON bp.ticket_no = s.ticket_no AND bp.flight_id = s.flight_id
WHERE f.scheduled_departure > '2025-12-1 15:00:00+00';
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------
Gather (cost=1608.83..38142.58 rows=9455 width=33) (actual rows=9455.00 loops=1)
Workers Planned: 2
Workers Launched: 2
Buffers: shared hit=27340 read=22325
-> Nested Loop (cost=608.83..36197.08 rows=3940 width=33) (actual rows=3151.67 loops=3)
AQO: rows=9455, error=0%, fss=4871603661380287993
Join Filter: (s.flight_id = f.flight_id)
Buffers: shared hit=27340 read=22325
-> Hash Join (cost=608.40..34249.71 rows=3940 width=37) (actual rows=3151.67 loops=3)
AQO: rows=9455, error=0%, fss=4705493075117122362
Hash Cond: (bp.flight_id = f.flight_id)
Buffers: shared hit=2360 read=18932
-> Parallel Seq Scan on boarding_passes bp (cost=0.00..30945.97 rows=1026597 width=33) (actual rows=821277.33 loops=3)
AQO: rows=2463832, error=0%, fss=1362775811343989307
Buffers: shared hit=1748 read=18932
-> Hash (cost=475.98..475.98 rows=10594 width=4) (actual rows=10594.00 loops=3)
Buckets: 16384 Batches: 1 Memory Usage: 501kB
Buffers: shared hit=612
-> Seq Scan on flights f (cost=0.00..475.98 rows=10594 width=4) (actual rows=10594.00 loops=3)
AQO: rows=10594, error=0%, fss=3484507337497244877
Filter: (scheduled_departure > '2025-12-01 22:00:00+07'::timestamp with time zone)
Rows Removed by Filter: 11164
Buffers: shared hit=612
-> Index Only Scan using segments_pkey on segments s (cost=0.43..0.48 rows=1 width=18) (actual rows=1.00 loops=9455)
AQO not used (early terminated), fss=-5182591529139042748
Index Cond: ((ticket_no = bp.ticket_no) AND (flight_id = bp.flight_id))
Heap Fetches: 0
Index Searches: 9455
Buffers: shared hit=24980 read=3393
Planning:
Buffers: shared hit=53
Using aqo: true
AQO mode: AUTO
AQO advanced: OFF
Query hash: 0
JOINS: 2
(36 rows)
Let's change a constant in the query, and you will notice that the prediction is made with an error:
demo=# EXPLAIN (ANALYZE, SUMMARY OFF, TIMING OFF)
SELECT bp.*
FROM segments s
JOIN flights f ON f.flight_id = s.flight_id
JOIN boarding_passes bp ON bp.ticket_no = s.ticket_no AND bp.flight_id = s.flight_id
WHERE f.scheduled_departure > '2025-11-20 15:00:00+00';
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------
Gather (cost=1608.83..38142.58 rows=9455 width=33) (actual rows=438899.00 loops=1)
Workers Planned: 2
Workers Launched: 2
Buffers: shared hit=1307156 read=30841
-> Nested Loop (cost=608.83..36197.08 rows=3940 width=33) (actual rows=146299.67 loops=3)
AQO: rows=9455, error=-4542%, fss=4871603661380287993
Join Filter: (s.flight_id = f.flight_id)
Buffers: shared hit=1307156 read=30841
-> Hash Join (cost=608.40..34249.71 rows=3940 width=37) (actual rows=146299.67 loops=3)
AQO: rows=9455, error=-4542%, fss=4705493075117122362
Hash Cond: (bp.flight_id = f.flight_id)
Buffers: shared hit=1521 read=19771
-> Parallel Seq Scan on boarding_passes bp (cost=0.00..30945.97 rows=1026597 width=33) (actual rows=821277.33 loops=3)
AQO: rows=2463832, error=0%, fss=1362775811343989307
Buffers: shared hit=909 read=19771
-> Hash (cost=475.98..475.98 rows=10594 width=4) (actual rows=12593.00 loops=3)
Buckets: 16384 Batches: 1 Memory Usage: 571kB
Buffers: shared hit=612
-> Seq Scan on flights f (cost=0.00..475.98 rows=10594 width=4) (actual rows=12593.00 loops=3)
AQO: rows=10594, error=-19%, fss=3484507337497244877
Filter: (scheduled_departure > '2025-11-20 22:00:00+07'::timestamp with time zone)
Rows Removed by Filter: 9165
Buffers: shared hit=612
-> Index Only Scan using segments_pkey on segments s (cost=0.43..0.48 rows=1 width=18) (actual rows=1.00 loops=438899)
AQO not used (early terminated), fss=-5182591529139042748
Index Cond: ((ticket_no = bp.ticket_no) AND (flight_id = bp.flight_id))
Heap Fetches: 0
Index Searches: 438899
Buffers: shared hit=1305635 read=11070
Planning:
Buffers: shared hit=53
Using aqo: true
AQO mode: AUTO
AQO advanced: OFF
Query hash: 0
JOINS: 2
(36 rows)
However, instead of recalculating features and
targets, aqo added
new values of selectivity and cardinality for this query to
aqo_data:
demo=# SELECT * FROM aqo_data;
fs | fss | dbid | delta_rows | nfeatures | features | targets | reliability | oids | tmpoids
----+----------------------+-------+------------+-----------+---------------------------------------------------------------------------------------+---------------------------------------+-------------+---------------------+---------
0 | 3484507337497244877 | 16556 | f | 1 | {{-0.7185575545175223},{-0.5463556163769266}} | {9.268043082104471,9.440896383005846} | {1,1} | {17452} |
0 | -1745942650988724053 | 16556 | f | 0 | | {15.187008236114766} | {1} | {17488} |
0 | 4705493075117122362 | 16556 | f | 2 | {{-0.7185575545175223,-9.987736784981028},{-0.5463556163769266,-9.987736784981028}} | {9.154298981092557,12.9920245972504} | {1,1} | {17452,17438} |
0 | 1362775811343989307 | 16556 | f | 0 | | {14.71722841949288} | {1} | {17438} |
0 | 4871603661380287993 | 16556 | f | 2 | {{-0.7185575545175223,-14.672062325711414},{-0.5463556163769266,-14.672062325711414}} | {9.154298981092557,12.9920245972504} | {1,1} | {17488,17452,17438} |
(5 rows)
Now the prediction has a small error of about 2%, which can be explained by a calculation error:
demo=# EXPLAIN (ANALYZE, SUMMARY OFF, TIMING OFF)
SELECT bp.*
FROM segments s
JOIN flights f ON f.flight_id = s.flight_id
JOIN boarding_passes bp ON bp.ticket_no = s.ticket_no AND bp.flight_id = s.flight_id
WHERE f.scheduled_departure > '2025-11-20 15:00:00+00';
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------
Gather (cost=39355.89..164831.92 rows=429336 width=33) (actual rows=438899.00 loops=1)
Workers Planned: 2
Workers Launched: 2
Buffers: shared hit=1619 read=52833, temp read=21086 written=21152
-> Parallel Hash Join (cost=38355.89..120898.32 rows=178890 width=33) (actual rows=146299.67 loops=3)
AQO: rows=429336, error=-2%, fss=4871603661380287993
Hash Cond: ((s.flight_id = f.flight_id) AND (s.ticket_no = bp.ticket_no))
Buffers: shared hit=1619 read=52833, temp read=21086 written=21152
-> Parallel Seq Scan on segments s (cost=0.00..49581.96 rows=1642187 width=18) (actual rows=1313749.67 loops=3)
AQO: rows=3941249, error=0%, fss=-1745942650988724053
Buffers: shared read=33160
-> Parallel Hash (cost=34274.54..34274.54 rows=178890 width=37) (actual rows=146299.67 loops=3)
Buckets: 131072 Batches: 8 Memory Usage: 4928kB
Buffers: shared hit=1619 read=19673, temp written=2812
-> Hash Join (cost=633.24..34274.54 rows=178890 width=37) (actual rows=146299.67 loops=3)
AQO: rows=429336, error=-2%, fss=4705493075117122362
Hash Cond: (bp.flight_id = f.flight_id)
Buffers: shared hit=1619 read=19673
-> Parallel Seq Scan on boarding_passes bp (cost=0.00..30945.97 rows=1026597 width=33) (actual rows=821277.33 loops=3)
AQO: rows=2463832, error=0%, fss=1362775811343989307
Buffers: shared hit=1007 read=19673
-> Hash (cost=475.98..475.98 rows=12581 width=4) (actual rows=12593.00 loops=3)
Buckets: 16384 Batches: 1 Memory Usage: 571kB
Buffers: shared hit=612
-> Seq Scan on flights f (cost=0.00..475.98 rows=12581 width=4) (actual rows=12593.00 loops=3)
AQO: rows=12581, error=-0%, fss=3484507337497244877
Filter: (scheduled_departure > '2025-11-20 22:00:00+07'::timestamp with time zone)
Rows Removed by Filter: 9165
Buffers: shared hit=612
Planning:
Buffers: shared hit=53
Using aqo: true
AQO mode: AUTO
AQO advanced: OFF
Query hash: 0
JOINS: 2
(36 rows)
We can modify the query by adding some table to the JOIN list. In this case, aqo
will predict the cardinality of nodes on which it learned before.
demo=# EXPLAIN (ANALYZE, SUMMARY OFF, TIMING OFF)
SELECT bp.*
FROM segments s
JOIN flights f ON f.flight_id = s.flight_id
JOIN boarding_passes bp ON bp.ticket_no = s.ticket_no AND bp.flight_id = s.flight_id
JOIN tickets t ON t.ticket_no = s.ticket_no
WHERE f.scheduled_departure > '2025-12-1 15:00:00+00';
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------
Gather (cost=1609.40..40084.93 rows=9666 width=33) (actual rows=9455.00 loops=1)
Workers Planned: 2
Workers Launched: 2
Buffers: shared hit=53810 read=24225 written=1
-> Nested Loop (cost=609.40..38118.33 rows=4028 width=33) (actual rows=3151.67 loops=3)
AQO not used, fss=3232027643707566962
Buffers: shared hit=53810 read=24225 written=1
-> Nested Loop (cost=608.97..36240.71 rows=4028 width=47) (actual rows=3151.67 loops=3)
AQO: rows=9666, error=2%, fss=4871603661380287993
Join Filter: (s.flight_id = f.flight_id)
Buffers: shared hit=28230 read=21435
-> Hash Join (cost=608.54..34249.84 rows=4028 width=37) (actual rows=3151.67 loops=3)
AQO: rows=9666, error=2%, fss=4705493075117122362
Hash Cond: (bp.flight_id = f.flight_id)
Buffers: shared hit=1006 read=20286
-> Parallel Seq Scan on boarding_passes bp (cost=0.00..30945.97 rows=1026597 width=33) (actual rows=821277.33 loops=3)
AQO: rows=2463832, error=0%, fss=1362775811343989307
Buffers: shared hit=394 read=20286
-> Hash (cost=475.98..475.98 rows=10605 width=4) (actual rows=10594.00 loops=3)
Buckets: 16384 Batches: 1 Memory Usage: 501kB
Buffers: shared hit=612
-> Seq Scan on flights f (cost=0.00..475.98 rows=10605 width=4) (actual rows=10594.00 loops=3)
AQO: rows=10605, error=0%, fss=3484507337497244877
Filter: (scheduled_departure > '2025-12-01 22:00:00+07'::timestamp with time zone)
Rows Removed by Filter: 11164
Buffers: shared hit=612
-> Index Only Scan using segments_pkey on segments s (cost=0.43..0.48 rows=1 width=18) (actual rows=1.00 loops=9455)
AQO not used (early terminated), fss=-5182591529139042748
Index Cond: ((ticket_no = bp.ticket_no) AND (flight_id = bp.flight_id))
Heap Fetches: 0
Index Searches: 9455
Buffers: shared hit=27224 read=1149
-> Index Only Scan using tickets_pkey on tickets t (cost=0.43..0.47 rows=1 width=14) (actual rows=1.00 loops=9455)
AQO not used (early terminated), fss=1810536986390200978
Index Cond: (ticket_no = bp.ticket_no)
Heap Fetches: 0
Index Searches: 9455
Buffers: shared hit=25580 read=2790 written=1
Planning:
Buffers: shared hit=121 read=11 dirtied=1
Using aqo: true
AQO mode: AUTO
AQO advanced: OFF
Query hash: 0
JOINS: 3
(45 rows)
Example F.2. Using the aqo_query_stat View
The aqo_query_stats view shows statistics on the query
planning time, query execution time and cardinality error. Based on this data
you can make a decision whether to use aqo
predictions for different query classes.
Let's query the aqo_query_stats view:
demo=# SELECT * FROM aqo_query_stat \gx
-[ RECORD 1 ]-----------------+----------------------------------------------------------------
fs | 0
dbid | 16556
execution_time_with_aqo | {1.194791831,0.497019753,0.372696583,0.071416851}
execution_time_without_aqo | {1.194049191,1.003504607}
planning_time_with_aqo | {0.004099525,0.000548588,0.000518923,0.000545041}
planning_time_without_aqo | {0.000568455,0.000472447}
cardinality_error_with_aqo | {0.47163214679982596,0,1.5696609066434117,0.009035905503851183}
cardinality_error_without_aqo | {0.47163214679982596,1.9379745948665572}
executions_with_aqo | 4
executions_without_aqo | 2
The retrieved data is for the query from
Example F.1,
which was executed once without aqo
for each of the parameters f.scheduled_departure >
'2025-11-20 15:00:00+00' and
f.scheduled_departure > '2025-12-1 15:00:00+00'
and twice with
aqo for each of these parameters.
It is clear that with
aqo, the cardinality error decreases to
0.009, while the minimum cardinality error without
aqo is 0.471. Besides, the execution time
with aqo is lower than without it.
So the conclusion is that aqo learns well
on this query, and the prediction can be used for this query class.
Example F.3. Using aqo in the Advanced Mode
The advanced mode allows a more flexible control over aqo. When this mode is activated, that is,
demo=# SET aqo.advanced = on;
aqo will collect the machine learning data separately for each query executed. For example:
demo=# EXPLAIN (ANALYZE, SUMMARY OFF, TIMING OFF)
SELECT bp.*
FROM segments s
JOIN flights f ON f.flight_id = s.flight_id
JOIN boarding_passes bp ON bp.ticket_no = s.ticket_no AND bp.flight_id = s.flight_id
WHERE f.scheduled_departure > '2025-12-1 15:00:00+00';
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------
Hash Join (cost=79793.72..237497.86 rows=1201002 width=33) (actual rows=9455.00 loops=1)
AQO not used, fss=4871603661380287993
Hash Cond: ((s.flight_id = f.flight_id) AND (s.ticket_no = bp.ticket_no))
Buffers: shared hit=4958 read=49086, temp read=17210 written=17210
-> Seq Scan on segments s (cost=0.00..72572.70 rows=3941270 width=18) (actual rows=3941249.00 loops=1)
AQO not used, fss=-1745942650988724053
Buffers: shared hit=2116 read=31044
-> Hash (cost=52395.69..52395.69 rows=1201002 width=37) (actual rows=9455.00 loops=1)
Buckets: 131072 Batches: 16 Memory Usage: 1058kB
Buffers: shared hit=2842 read=18042, temp written=45
-> Hash Join (cost=608.55..52395.69 rows=1201002 width=37) (actual rows=9455.00 loops=1)
AQO not used, fss=4705493075117122362
Hash Cond: (bp.flight_id = f.flight_id)
Buffers: shared hit=2842 read=18042
-> Seq Scan on boarding_passes bp (cost=0.00..45318.32 rows=2463832 width=33) (actual rows=2463832.00 loops=1)
AQO not used, fss=1362775811343989307
Buffers: shared hit=2638 read=18042
-> Hash (cost=475.98..475.98 rows=10606 width=4) (actual rows=10594.00 loops=1)
Buckets: 16384 Batches: 1 Memory Usage: 501kB
Buffers: shared hit=204
-> Seq Scan on flights f (cost=0.00..475.98 rows=10606 width=4) (actual rows=10594.00 loops=1)
AQO not used, fss=3484507337497244877
Filter: (scheduled_departure > '2025-12-01 22:00:00+07'::timestamp with time zone)
Rows Removed by Filter: 11164
Buffers: shared hit=204
Planning:
Buffers: shared hit=463
Using aqo: true
AQO mode: AUTO
AQO advanced: ON
Query hash: 6166891552805381787
JOINS: 2
(32 rows)
Now this query is stored in aqo_data with a non-zero fs
(fs is equal to the query hash by default):
demo=# SELECT * FROM aqo_data;
fs | fss | dbid | delta_rows | nfeatures | features | targets | reliability | oids | tmpoids
---------------------+----------------------+-------+------------+-----------+---------------------------------------------+----------------------+-------------+---------------------+---------
6166891552805381787 | 4705493075117122362 | 16556 | f | 2 | {{-0.7185575545175223,-9.987736784981028}} | {9.154298981092557} | {1} | {17452,17438} |
6166891552805381787 | 4871603661380287993 | 16556 | f | 2 | {{-0.7185575545175223,-14.672062325711414}} | {9.154298981092557} | {1} | {17488,17452,17438} |
6166891552805381787 | 1362775811343989307 | 16556 | f | 0 | | {14.71722841949288} | {1} | {17438} |
6166891552805381787 | 3484507337497244877 | 16556 | f | 1 | {{-0.7185575545175223}} | {9.268043082104471} | {1} | {17452} |
6166891552805381787 | -1745942650988724053 | 16556 | f | 0 | | {15.187008236114766} | {1} | {17488} |
(5 rows)
We can make a few settings individually for this query. These are
values of learn_aqo, use_aqo
and auto_tuning in the
aqo_queries view:
demo=# SELECT * FROM aqo_queries;
fs | dbid | learn_aqo | use_aqo | auto_tuning | smart_timeout | count_increase_timeout
---------------------+-------+-----------+---------+-------------+---------------+------------------------
6166891552805381787 | 16556 | t | t | f | 0 | 0
0 | 0 | f | f | f | 0 | 0
(2 rows)
Let's set use_aqo to false:
demo=# SELECT aqo_queries_update(6166891552805381787, NULL, NULL, false, NULL); aqo_queries_update -------------------- t (1 row)
Now we change a constant in the query:
demo=# EXPLAIN (ANALYZE, SUMMARY OFF, TIMING OFF)
SELECT bp.*
FROM segments s
JOIN flights f ON f.flight_id = s.flight_id
JOIN boarding_passes bp ON bp.ticket_no = s.ticket_no AND bp.flight_id = s.flight_id
WHERE f.scheduled_departure > '2025-11-20 15:00:00+00';
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------
Hash Join (cost=84966.87..244434.20 rows=1426685 width=33) (actual rows=438899.00 loops=1)
AQO not used, fss=0
Hash Cond: ((s.flight_id = f.flight_id) AND (s.ticket_no = bp.ticket_no))
Buffers: shared hit=5142 read=48902, temp read=20132 written=20132
-> Seq Scan on segments s (cost=0.00..72572.70 rows=3941270 width=18) (actual rows=3941249.00 loops=1)
AQO not used, fss=0
Buffers: shared hit=2208 read=30952
-> Hash (cost=52420.60..52420.60 rows=1426685 width=37) (actual rows=438899.00 loops=1)
Buckets: 131072 Batches: 16 Memory Usage: 2925kB
Buffers: shared hit=2934 read=17950, temp written=2967
-> Hash Join (cost=633.46..52420.60 rows=1426685 width=37) (actual rows=438899.00 loops=1)
AQO not used, fss=0
Hash Cond: (bp.flight_id = f.flight_id)
Buffers: shared hit=2934 read=17950
-> Seq Scan on boarding_passes bp (cost=0.00..45318.32 rows=2463832 width=33) (actual rows=2463832.00 loops=1)
AQO not used, fss=0
Buffers: shared hit=2730 read=17950
-> Hash (cost=475.98..475.98 rows=12599 width=4) (actual rows=12593.00 loops=1)
Buckets: 16384 Batches: 1 Memory Usage: 571kB
Buffers: shared hit=204
-> Seq Scan on flights f (cost=0.00..475.98 rows=12599 width=4) (actual rows=12593.00 loops=1)
AQO not used, fss=0
Filter: (scheduled_departure > '2025-11-20 22:00:00+07'::timestamp with time zone)
Rows Removed by Filter: 9165
Buffers: shared hit=204
Planning:
Buffers: shared hit=53
Using aqo: false
AQO mode: AUTO
AQO advanced: ON
Query hash: 6166891552805381787
JOINS: 2
(32 rows)
aqo was not used for this query, but there is
new data in the aqo_data view:
demo=# SELECT * FROM aqo_data;
fs | fss | dbid | delta_rows | nfeatures | features | targets | reliability | oids | tmpoids
---------------------+----------------------+-------+------------+-----------+---------------------------------------------------------------------------------------+---------------------------------------+-------------+---------------------+---------
6166891552805381787 | 4705493075117122362 | 16556 | f | 2 | {{-0.7185575545175223,-9.987736784981028},{-0.5463556163769266,-9.987736784981028}} | {9.154298981092557,12.9920245972504} | {1,1} | {17452,17438} |
6166891552805381787 | 4871603661380287993 | 16556 | f | 2 | {{-0.7185575545175223,-14.672062325711414},{-0.5463556163769266,-14.672062325711414}} | {9.154298981092557,12.9920245972504} | {1,1} | {17488,17452,17438} |
6166891552805381787 | 1362775811343989307 | 16556 | f | 0 | | {14.71722841949288} | {1} | {17438} |
6166891552805381787 | 3484507337497244877 | 16556 | f | 1 | {{-0.7185575545175223},{-0.5463556163769266}} | {9.268043082104471,9.440896383005846} | {1,1} | {17452} |
6166891552805381787 | -1745942650988724053 | 16556 | f | 0 | | {15.187008236114766} | {1} | {17488} |
(5 rows)
The use_aqo setting does not apply to other queries.
After executing another query twice, we can see that aqo learns on it and makes prediction for it:
demo=# EXPLAIN (ANALYZE, SUMMARY OFF, TIMING OFF)
SELECT bp.*
FROM segments s
JOIN flights f ON f.flight_id = s.flight_id
JOIN boarding_passes bp ON bp.ticket_no = s.ticket_no AND bp.flight_id = s.flight_id
JOIN tickets t ON t.ticket_no = s.ticket_no
WHERE f.scheduled_departure > '2025-12-1 15:00:00+00';
QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------------
Gather (cost=68355.43..129435.66 rows=9455 width=33) (actual rows=9455.00 loops=1)
Workers Planned: 2
Workers Launched: 2
Buffers: shared hit=34424 read=48398, temp read=25496 written=25656
-> Nested Loop (cost=67355.43..127490.16 rows=3940 width=33) (actual rows=3151.67 loops=3)
AQO: rows=9455, error=0%, fss=3232027643707566962
Buffers: shared hit=34424 read=48398, temp read=25496 written=25656
-> Parallel Hash Join (cost=67355.00..125653.57 rows=3940 width=47) (actual rows=3151.67 loops=3)
AQO: rows=9455, error=0%, fss=4871603661380287993
Hash Cond: ((bp.flight_id = f.flight_id) AND (bp.ticket_no = s.ticket_no))
Buffers: shared hit=6286 read=48166, temp read=25496 written=25656
-> Parallel Seq Scan on boarding_passes bp (cost=0.00..30945.97 rows=1026597 width=33) (actual rows=821277.33 loops=3)
AQO: rows=2463832, error=0%, fss=1362775811343989307
Buffers: shared hit=3098 read=17582
-> Parallel Hash (cost=54501.93..54501.93 rows=616138 width=22) (actual rows=492910.33 loops=3)
Buckets: 131072 Batches: 16 Memory Usage: 6144kB
Buffers: shared hit=3188 read=30584, temp written=7556
-> Hash Join (cost=608.40..54501.93 rows=616138 width=22) (actual rows=492910.33 loops=3)
AQO: rows=1478731, error=0%, fss=4547398436029445256
Hash Cond: (s.flight_id = f.flight_id)
Buffers: shared hit=3188 read=30584
-> Parallel Seq Scan on segments s (cost=0.00..49581.96 rows=1642187 width=18) (actual rows=1313749.67 loops=3)
AQO: rows=3941249, error=0%, fss=-1745942650988724053
Buffers: shared hit=2576 read=30584
-> Hash (cost=475.98..475.98 rows=10594 width=4) (actual rows=10594.00 loops=3)
Buckets: 16384 Batches: 1 Memory Usage: 501kB
Buffers: shared hit=612
-> Seq Scan on flights f (cost=0.00..475.98 rows=10594 width=4) (actual rows=10594.00 loops=3)
AQO: rows=10594, error=0%, fss=3484507337497244877
Filter: (scheduled_departure > '2025-12-01 22:00:00+07'::timestamp with time zone)
Rows Removed by Filter: 11164
Buffers: shared hit=612
-> Index Only Scan using tickets_pkey on tickets t (cost=0.43..0.47 rows=1 width=14) (actual rows=1.00 loops=9455)
AQO not used (early terminated), fss=1810536986390200978
Index Cond: (ticket_no = bp.ticket_no)
Heap Fetches: 0
Index Searches: 9455
Buffers: shared hit=28138 read=232
Planning:
Buffers: shared hit=85
Using aqo: true
AQO mode: AUTO
AQO advanced: ON
Query hash: 5639045936347396923
JOINS: 3
(45 rows)
Example F.4. Using the Sandbox Mode
SET aqo.sandbox = ON; SET aqo.enable = ON; SET aqo.advanced = OFF; -- Clean up the sandbox knowledge base without touching the main data SELECT aqo_reset(); EXPLAIN (ANALYZE, SUMMARY OFF, TIMING OFF) SELECT bp.* FROM segments s JOIN flights f ON f.flight_id = s.flight_id JOIN boarding_passes bp ON bp.ticket_no = s.ticket_no AND bp.flight_id = s.flight_id JOIN tickets t ON t.ticket_no = s.ticket_no WHERE f.scheduled_departure > '2025-12-1 15:00:00+00'; -- Be executing the previous query until plans get stabilized ... -- Copy data obtained with aqo.advanced = OFF from sandbox CREATE TABLE aqo_data_sandbox AS SELECT * FROM aqo_data; SET aqo.sandbox = OFF; SELECT aqo_data_update (fs, fss, dbid, delta_rows, nfeatures, features, targets, reliability, oids, tmpoids) FROM aqo_data_sandbox WHERE fs = 0; DROP TABLE aqo_data_sandbox;
Oleg Ivanov