While evaluating performance of DML statements, it is important to understand how they are processed in Shardman.
First of all, the execution of INSERT significantly
differs from the execution of UPDATE and DELETE
statements. The behavior of INSERT for sharded tables is
controlled by the batch_size foreign server option, which
can be set in FDWOptions section of
Shardman configuration
file.
If batch_size is greater than 0,
an INSERT in the same statement of several values
that fall into the same foreign partition leads to the values being grouped together
in batches of the specified size. Remote INSERT statements are prepared
with the necessary number of parameters and then are executed with the
given values. If the number of values does not match the number of prepared
arguments, the modified statement with the necessary number of parameters is prepared
again. A batch insert optimization can fail if a transaction inserts records one by one
or records routed to different foreign tables are intermixed
in one INSERT statement. A batch is formed for a single
foreign modify operation. It is sent to the remote server when the batch is filled
or when the modify operation is over. The modify operation is over when
we start routing tuples to another sharded table partition.
So, for bulk load, inserting multiple values in a single INSERT
command or using COPY is recommended (as COPY
is optimized in a similar way). Large batch_size
values allow issuing less INSERT
statements on remote side and so significantly reduce communication
costs. However, during construction of parameters for
prepared INSERT statements, all
inserted values should be copied to libpq-allocated
memory. This can lead to unrestricted memory usage on the query coordinator
side when several large text
or bytea objects are loaded.
UPDATE and DELETE statements
can be executed in a direct or indirect mode. A direct mode is used
when a statement can be directly sent to a foreign server.
In this mode, to modify a table on a remote server, a new statement
is created based on the original ModifyTable
plan node. Using a direct update is not always possible.
In particular, it is impossible when some conditions should be evaluated
locally. In this case, a much less efficient indirect modification
is used. An indirect modification includes several statements.
The first one is SELECT FOR UPDATE to lock remote
rows. The second one is an actual UPDATE or
DELETE, which is prepared once and then
executed with different parameters for each row of the
SELECT FOR UPDATE statement result
after local filters are applied to the result.
Evidently, direct modifications are much more efficient.
You can easily identify whether a DML statement is going to be executed in a direct or indirect mode looking at the query plan. A typical example of an indirect modification is:
EXPLAIN VERBOSE DELETE FROM pgbench_history
WHERE bid = 20 AND mtime > '2023-03-14 10:00:00'::timestamptz;
QUERY PLAN
--------------------------------------------------------------------------------------------------------------
Delete on public.pgbench_history (cost=100.00..142.66 rows=0 width=0)
Foreign Delete on public.pgbench_history_17_fdw pgbench_history_1
Remote SQL: DELETE FROM public.pgbench_history_17 WHERE ctid = $1
-> Foreign Scan on public.pgbench_history_17_fdw pgbench_history_1 (cost=100.00..142.66 rows=4 width=10)
Output: pgbench_history_1.tableoid, pgbench_history_1.ctid
Filter: (pgbench_history_1.mtime > '2023-03-14 10:00:00+03'::timestamp with time zone)
Remote SQL: SELECT mtime, ctid FROM public.pgbench_history_17 WHERE ((bid = 20)) FOR UPDATE
If we had chosen another type for the string constant, this would become a direct update.
EXPLAIN VERBOSE DELETE FROM pgbench_history
WHERE bid = 20 AND mtime > '2023-03-14 10:00:00'::timestamp;
explain verbose delete from pgbench_history where bid = 20 and mtime > '2023-03-14 10:00:00'::timestamp;
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------
Delete on public.pgbench_history (cost=100.00..146.97 rows=0 width=0)
Foreign Delete on public.pgbench_history_17_fdw pgbench_history_1
-> Foreign Delete on public.pgbench_history_17_fdw pgbench_history_1 (cost=100.00..146.97 rows=4 width=10)
Remote SQL: DELETE FROM public.pgbench_history_17 WHERE ((mtime > '2023-03-14 10:00:00'::timestamp without time zone)) AND ((bid = 20))
We see that in a direct update mode, only one statement is executed on the remote server.
The shardman.gt_batch_size configuration parameter, which you can tune, defines the size of an intermediate buffer used before sending data to a remote server.
INSERT uses the binary protocol and creates batches of
the shardman.gt_batch_size size. Large values of
the buffer size enable sending fewer network
requests on the remote side and thus substantially reduce the connection
costs. On the other hand, large values of this parameter can increase
memory consumption on the query coordinator side. Therefore, when specifying
the buffer size, it is important to achieve a compromise between the
connection costs and the allocated memory size.
For UPDATE, a query for each column and each row is
created on the coordinator and sent to remote nodes.
For DELETE, a query for a batch of data of the
shardman.gt_batch_size size is
created on the coordinator and sent to remote nodes.