The pgpro_ilm extension solves the problem similar to
Oracle's information lifecycle management (ILM),
which analyzes usage frequency of data in a database and performs certain actions
with rarely used tables, for example, moves tables to a lower-cost and slower
storage.
pgpro_ilm allows you to define rules for moving tables.
A rule can have the NO_MODIFICATION or NO_ACCESS
type. A NO_MODIFICATION rule defines an action to perform
on the table if it was not modified (by UPDATE,
INSERT, DELETE, or TRUNCATE
commands), but could be read (by SELECT) for a certain
period of time. A NO_ACCESS rule defines an action to perform
on the table if none of the above commands was executed on this table for a
certain period of time.
pgpro_ilm provides functions to manage the rules
and apply them.
pgpro_ilm gets per-user statistics on when tables were
last accessed from pgpro_usage
functions and views. For the ILM purposes, maintenance operations
with tables, such as taking regular backups, archiving, etc., should not be taken
into account. These operations are performed by particular users, e.g.,
postgres, and such users must be excluded when checking
whether pgpro_ilm rules apply.
pgpro_ilm provides functions to manage the user
exclusion list.
The current version of pgpro_ilm has the following limitations:
The only action supported is moving the relation to another tablespace.
Regular tables, indexes, and partitions of partitioned tables are only supported.
pgpro_ilm is a built-in extension
included in Postgres Pro Enterprise. To enable
pgpro_ilm, create the
extension using the following query:
CREATE EXTENSION pgpro_ilm;
For pgpro_ilm to work, make sure that the
pgpro_usage extension
is installed, as explained in Section F.58.1.
To configure moving tables to a lower-cost storage, pgpro_ilm provides an SQL interface that consists of several functions.
The following functions manage pgpro_ilm rules:
add_rule( iv_object_name text, iv_rule_type text, iv_period interval, iv_action text, iv_parameter text ) returns void
Adds a new rule.
Arguments:
iv_object_name — name of the relation.
Can be qualified or not. If the name is unqualified,
the schema will be determined automatically by following
a search path (as explained in Section 5.10.4)
when saving the rule to an internal table.
iv_rule_type — type of the rule:
NO_ACCESS or NO_MODIFICATION.
iv_period — period after which the
rule gets activated.
iv_action — action to perform.
Possible value: ALTER_TS
— move the relation to another tablespace.
iv_parameter — action parameter.
Possible value is the tablespace where to move the table.
remove_rule( iv_object_name text, iv_rule_type text, iv_period interval, iv_action text ) returns void
Removes a rule or rules.
Arguments:
iv_object_name — name of the relation.
Can be qualified or not. If the name is unqualified,
the schema will be determined automatically by following
a search path (as explained in Section 5.10.4)
when searching the rule to remove.
iv_rule_type — type of the rule:
NO_ACCESS or NO_MODIFICATION.
NULL value removes rules with any rule type.
iv_period — period after which the
rule gets activated. NULL value removes rules with any
value of the period.
iv_action — action to perform.
Possible value: ALTER_TS
— move the relation to another tablespace.
NULL value removes rules with any action.
get_rules( [iv_object_name text] ) returns table
Returns the list of all rules if called without the parameter or the list of rules that apply to the specified relation. For tables that are partitions of partitioned tables, this list also includes rules that apply to them indirectly, through the rules that apply to different-level parent tables. The list is returned in a table of the following format:
Column Type Description |
|---|
Name of the schema that a relation is in |
Name of the relation |
The value defined in the |
Type of a rule: |
Period after which the rule gets activated |
Action to perform |
Action parameter |
Arguments:
iv_object_name — name of the relation.
Can be qualified or not. If the name is unqualified,
the schema will be determined automatically by following
a search path (as explained in Section 5.10.4).
The following functions manage the list of users to be excluded when checking pgpro_ilm rules:
set_exclude_users( iv_exclude_access name[], iv_exclude_modification name[] ) returns void
Resets the user exclusion list and creates a new one.
Arguments:
iv_exclude_access — array of names
of users to exclude when checking NO_ACCESS
rules.
iv_exclude_modification — array
of names of users to exclude when checking
NO_MODIFICATION rules.
get_exclude_users() returns table
Returns the user exclusion list in a table of the following format:
Column Type Description |
|---|
Name of a user. |
Whether to exclude this user from checking
|
Whether to exclude this user from checking
|
The following functions check whether pgpro_ilm rules apply to relations and perform the needed action with the relations to which the rules apply. These functions apply no rules to partitioned tables and indexes on partitioned tables. For regular tables, the functions check the rules in the descending order of the period after which the rule gets activated. For partitions of partitioned tables, rules are checked for parent partitions from the lowest level of the partitioned table hierarchy to the highest level, and on the same level, rules are checked in the descending order of the period. Once a rule that applies to a relation is found, the functions perform the appropriate action, and finish processing of this relation.
If a rule to move a relation applies to a table, it is moved together with its indexes for which no rules are explicitly defined. However, if rules are defined for an index, they are processed independently of the table.
process_rules( iv_object_name text ) returns void
Checks whether any rule applies to a relation, performs the needed action with the relation if the rule applies to it, and exits. When this function is called for a relation more than once, pgpro_ilm determines whether a certain rule has already been applied and does not apply the rule again.
Arguments:
iv_object_name — name of the
relation. Can be qualified or not. If the name is unqualified,
the schema will be determined automatically by following
a search path (as explained in Section 5.10.4)
when iterating through the rules.
process_all_rules() returns void
For each relation in the database, checks whether any rule applies
to this relation, performs the needed action with the relation
if the rule applies to it, and proceeds to the next relation. The
function actually calls process_rules for
all the relations in the database for which any rule exists.
The following example illustrates usage of pgpro_ilm:
Let's create and view the user exception list:
test=# SELECT pgpro_ilm.set_exclude_users(array['exclude_access_user'], array['Exclude Write User']);
set_exclude_users
-------------------
(1 row)
test=# SELECT user_name, exclude_access, exclude_modification FROM pgpro_ilm.get_exclude_users();
user_name | exclude_access | exclude_modification
---------------------+----------------+----------------------
exclude_access_user | t | f
Exclude Write User | f | t
(2 rows)
Consider a partitioned table with one partition:
test=# \d+ measurement
Partitioned table "public.measurement"
Column | Type | Collation | Nullable | Default | Storage | Compression | Stats target | Description
-----------+---------+-----------+----------+---------+---------+-------------+--------------+-------------
city_id | integer | | not null | | plain | | |
logdate | date | | not null | | plain | | |
peaktemp | integer | | not null | | plain | | |
unitsales | integer | | | | plain | | |
Partition key: RANGE (logdate)
Indexes:
"measurement_pkey" PRIMARY KEY, btree (city_id, logdate, peaktemp)
"measurement_city_id_idx" btree (city_id)
"measurement_logdate_idx" btree (logdate)
Partitions: measurement_y2006m03 FOR VALUES FROM ('2006-03-01') TO ('2006-04-01')
test=# \d+ measurement_y2006m03
Table "public.measurement_y2006m03"
Column | Type | Collation | Nullable | Default | Storage | Compression | Stats target | Description
-----------+---------+-----------+----------+---------+---------+-------------+--------------+-------------
city_id | integer | | not null | | plain | | |
logdate | date | | not null | | plain | | |
peaktemp | integer | | not null | | plain | | |
unitsales | integer | | | | plain | | |
Partition of: measurement FOR VALUES FROM ('2006-03-01') TO ('2006-04-01')
Partition constraint: ((logdate IS NOT NULL) AND (logdate >= '2006-03-01'::date) AND (logdate < '2006-04-01'::date))
Indexes:
"measurement_y2006m03_pkey" PRIMARY KEY, btree (city_id, logdate, peaktemp)
"measurement_y2006m03_city_id_idx" btree (city_id)
"measurement_y2006m03_logdate_idx" btree (logdate)
Access method: heap
Let's add a few rules:
test=# SELECT * from pgpro_ilm.add_rule('measurement', 'NO_ACCESS', interval '16 mons', 'ALTER_TS', 'slow_space');
add_rule
----------
(1 row)
test=# SELECT * from pgpro_ilm.add_rule('measurement', 'NO_MODIFICATION', interval '16 mons', 'ALTER_TS', 'slow_space');
add_rule
----------
(1 row)
test=# SELECT * from pgpro_ilm.add_rule('measurement_y2006m03', 'NO_ACCESS', interval '14 mons', 'ALTER_TS', 'slow_space');
add_rule
----------
(1 row)
test=# SELECT * from pgpro_ilm.add_rule('measurement_y2006m03', 'NO_ACCESS', interval '12 mons', 'ALTER_TS', 'slow_space');
add_rule
----------
(1 row)
Rules for the partitioned table look as follows:
test=# select * from pgpro_ilm.get_rules('measurement');
nspname | relname | rule_type | period | action | parameter
---------+-------------+-----------------+---------------+----------+------------
public | measurement | NO_ACCESS | 1 year 4 mons | ALTER_TS | slow_space
public | measurement | NO_MODIFICATION | 1 year 4 mons | ALTER_TS | slow_space
(2 rows)
Two explicitly specified NO_ACCESS rules and a NO_MODIFICATION rule inherited from the parent table apply to the partition:
test=# select * from pgpro_ilm.get_rules('measurement_y2006m03');
nspname | relname | rule_type | period | action | parameter
---------+----------------------+-----------------+---------------+----------+------------
public | measurement | NO_MODIFICATION | 1 year 4 mons | ALTER_TS | slow_space
public | measurement_y2006m03 | NO_ACCESS | 1 year 2 mons | ALTER_TS | slow_space
public | measurement_y2006m03 | NO_ACCESS | 1 year | ALTER_TS | slow_space
(3 rows)
Let's add one more explicit rule for the partition:
test=# SELECT * from pgpro_ilm.add_rule('measurement_y2006m03', 'NO_MODIFICATION', interval '12 mons', 'ALTER_TS', 'slow_space');
add_rule
----------
(1 row)
Now only explicitly specified rules apply to the partition:
test=# select * from pgpro_ilm.get_rules('measurement_y2006m03');
nspname | relname | rule_type | period | action | parameter
---------+----------------------+-----------------+---------------+----------+------------
public | measurement_y2006m03 | NO_ACCESS | 1 year 2 mons | ALTER_TS | slow_space
public | measurement_y2006m03 | NO_ACCESS | 1 year | ALTER_TS | slow_space
public | measurement_y2006m03 | NO_MODIFICATION | 1 year | ALTER_TS | slow_space
(3 rows)
Note that rules are not processed for partitioned tables.
test=# SELECT pgpro_ilm.process_rules('measurement');
NOTICE: ILM: no rules matched for 'public.measurement'
process_rules
---------------
(1 row)
When rules are processed, indexes are moved together with the table:
test=# SELECT pgpro_ilm.process_rules('measurement_y2006m03');
NOTICE: ILM: found rule 'public.measurement_y2006m03' with type 'NO_ACCESS' and period '1 year 2 mons'. Action 'ALTER_TS', parameter 'slow_space'
NOTICE: ILM: executing 'ALTER TABLE public.measurement_y2006m03 SET TABLESPACE slow_space;'
NOTICE: ILM: executing 'ALTER INDEX public.measurement_y2006m03_pkey SET TABLESPACE slow_space;'
NOTICE: ILM: executing 'ALTER INDEX public.measurement_y2006m03_city_id_idx SET TABLESPACE slow_space;'
NOTICE: ILM: executing 'ALTER INDEX public.measurement_y2006m03_logdate_idx SET TABLESPACE slow_space;'
process_rules
---------------
(1 row)
Result:
test=# \d+ measurement_y2006m03
Table "public.measurement_y2006m03"
Column | Type | Collation | Nullable | Default | Storage | Compression | Stats target | Description
-----------+---------+-----------+----------+---------+---------+-------------+--------------+-------------
city_id | integer | | not null | | plain | | |
logdate | date | | not null | | plain | | |
peaktemp | integer | | not null | | plain | | |
unitsales | integer | | | | plain | | |
Partition of: measurement FOR VALUES FROM ('2006-03-01') TO ('2006-04-01')
Partition constraint: ((logdate IS NOT NULL) AND (logdate >= '2006-03-01'::date) AND (logdate < '2006-04-01'::date))
Indexes:
"measurement_y2006m03_pkey" PRIMARY KEY, btree (city_id, logdate, peaktemp), tablespace "slow_space"
"measurement_y2006m03_city_id_idx" btree (city_id), tablespace "slow_space"
"measurement_y2006m03_logdate_idx" btree (logdate), tablespace "slow_space"
Tablespace: "slow_space"
Access method: heap
When the function is called again, pgpro_ilm determines whether the table has already been moved and does not move it again.
test=# SELECT pgpro_ilm.process_rules('measurement_y2006m03');
NOTICE: ILM: found rule 'public.measurement_y2006m03' with type 'NO_ACCESS' and period '1 year 2 mons'. Action 'ALTER_TS', parameter 'slow_space'
NOTICE: ILM: 'public.measurement_y2006m03' is already in the 'slow_space' tablespace
process_rules
---------------
(1 row)
Removing all types of rules for the measurement table:
test=# select pgpro_ilm.remove_rule('measurement', NULL, '16 month'::interval, 'ALTER_TS');
remove_rule
-------------
(1 row)
test=# select * from pgpro_ilm.get_rules('measurement');
nspname | relname | rule_type | period | action | parameter
---------+---------+-----------+--------+--------+-----------
(0 rows)
To process all tables, pgpro_ilm.process_all_rules() function can be called. In this example, one already moved table is found:
test=# call pgpro_ilm.process_all_rules(); NOTICE: ILM: checking 'public.measurement_y2006m03' NOTICE: ILM: found rule 'public.measurement_y2006m03' with type 'NO_ACCESS' and period '1 year 2 mons'. Action 'ALTER_TS', parameter 'slow_space' NOTICE: ILM: 'public.measurement_y2006m03' is already in the 'slow_space' tablespace CALL