Multi-segment insertion is currently experimental and is not recommended for using in production. It has the following limitations:
AFTER INSERT triggers might fail to fire for
newly inserted rows. The issue occurs because the inserted data is
temporarily located in a buffer and is not immediately available
for reading during multi-segment insertion.
Multi-segment insertion supports triggers that do not read the inserted data, such as sending notifications, audit logging, etc.
Multi-segment insertion does not support triggers that read or modify the newly inserted data.
Postgres Pro Shardman cannot enforce unique constraints when loading data to the same table in multiple threads that might result in unique constraint violations.
Multi-segment inserts are safe to use in the following scenarios:
Restoring data from pg_dump
Loading data from a consistent backup
To avoid unique constraint violations in other cases, use a third-party tool.
Postgres Pro Shardman allows you to bulk insert data to the same table in multiple threads simultaneously bypassing the Postgres Pro Shardman buffer manager. This feature is designed primarily for data migration and restoring from backup.
To use multi-segment inserts, Postgres Pro Shardman does not need any special preparations or custom parameters when connecting a new client. Any client backend process can start insertions in multi-segment mode.
Before writing data to a table, specify the name of the table in the multi-segment-relname parameter. You can set this parameter multiple times in a session so that a single client backend can do multi-segment insertions to several tables, switching between its dedicated table segments.
To avoid lock conflicts, each backend creates a dedicated table segment for inserting data. The backends that are not performing multi-segment insertion cannot access this segment and the corresponding table even for reading. The backends that are currently using the multi-segment mechanism can read data from the dedicated table segments that do not belong to them, but they can write data only to their own segments. When a segment is full, a backend immediately creates another segment and continues writing data to the new segment.
Before performing the first INSERT, a backend acquires
a session-level exclusive lock on the table specified in the multi-segment-relname parameter. This lock blocks access
to the table for all processes, including autovacuum, except the backends
that perform multi-segment inserts to this table. All operations except
for multi-segment insertion are not available.
The table is unlocked when all backends that performed multi-segment inserts to this table disconnect from the database.
Data pages that have been inserted via the multi-segment mechanism act
like regular Postgres Pro Shardman data pages: they are synced to disk during the
COMMIT or CHECKPOINT operations for
a higher level of safety.
It is recommended to use the multi-segment mechanism only for the
INSERT operations. Using it with
the UPDATE and DELETE operations
might lead to unexpected behavior.
Multi-segment inserts cannot be used with the following relations:
Catalog relations
TOAST relations
Temporary relations