In Postgres Pro AXE, the deletion of Parquet
files is a two-stage process. First, you must mark
snapshots
referenced by entries from the pga_data_file
metadata table as expired to exclude associated Parquet
files from analytical queries. Then, you can delete Parquet files.
Before performing this instruction, ensure that you are assigned the
metastore_admin role. For more information, refer
to Chapter 14.
To delete Parquet files:
Mark snapshots as expired in one of the following ways:
To mark snapshots created before a certain date and time as expired, execute the following query:
SELECT metastore.expire_snapshot(threshold_date_and_time);
Where threshold_date_and_time is the
threshold value for the snapshot creation date and time.
The snapshot creation date and time are contained in the
snapshot_time column of the
pga_snapshot
metadata table.
Supported date and time formats:
Date in the yyyy-mm-dd format.
Example 20.3.
2025-11-28
Date in the yyyy-mm-dd format and time in the hh:mm:ss format.
Example 20.4.
2025-11-28 12:22:46
Date in the yyyy-mm-dd format and time in the hh:mm:ss format with the time zone.
Example 20.5.
2025-11-28 12:22:46+03
Date in the yyyy-mm-dd format and time in the hh:mm:ss.ssssss format with time zone, where the fractional part of seconds must have no more than 6 digits.
Example 20.6.
2025-11-28 12:22:46.123456+03
Example 20.7.
SELECT metastore.expire_snapshot('2025-11-13 12:22:46.123456+03');
To mark snapshots with certain IDs as expired, execute the following query:
SELECT metastore.expire_snapshot('[list_of_snapshot_IDs]');
Where list_of_snapshot_IDs is a
comma-separated list of snapshot IDs.
Snapshot IDs are contained in the
snapshot_id column of the
pga_snapshot metadata table.
Example 20.8.
SELECT metastore.expire_snapshot('[1,2,3,4]');
Once the query is executed, pgpro_metastore performs the following actions:
Marks snapshots as expired if their entries from the
pga_snapshot
metadata table have snapshot_time
values that are less than threshold_date_and_time
or have snapshot_id values that are equal
to one of the specified snapshot IDs.
Finds entries in the pga_data_file metadata table
whose begin_snapshot values reference
expired snapshots and creates associated entries in the
pga_files_scheduled_for_deletion
metadata table if these entities were not already created by
previous expire_snapshot calls.
Other metadata tables
whose begin_snapshot values reference
expired snapshots are not impacted.
Recreate Postgres Pro views from analytical tables associated with Parquet files so that these files are excluded from analytical queries.
Postgres Pro views can be recreated only during periods of zero user activity to prevent potential data loss.
Delete Parquet files:
SELECT metastore.delete_expired_files();
Once the query is executed, pgpro_metastore performs the following actions:
Deletes Parquet files from the
storage and
associated entries from
pga_files_scheduled_for_deletion,
pga_data_file, and
pga_file_column_statistics
metadata tables.
Recalculates statistics for entries
associated with deleted Parquet files in
pga_table_stats
and pga_table_column_stats
metadata tables.
Recalculates column_order values
for entries associated with deleted Parquet files in the
pga_column
metadata table.