pgpro_scheduler is a built-in Postgres Pro Enterprise extension for scheduling, monitoring, and
managing job execution within the Postgres Pro Enterprise database. With
pgpro_scheduler, you can:
Set advanced schedules using jsonb objects or crontab strings.
Dynamically calculate the next execution time for repeated jobs.
Execute SQL commands of the job in a single transaction or in sequential transactions, if required.
Submit jobs for immediate or delayed one-time execution in parallel with the scheduled jobs.
Unlike external scheduling daemons,
pgpro_scheduler offers the following benefits:
Any user can schedule jobs independently.
Job scheduling can be managed on the fly without restarting the database.
Scheduling is very lightweight since
pgpro_scheduler uses background workers to
schedule, monitor, and manage job execution. At the same time,
pgpro_scheduler does not require any client connections for scheduling.
For enhanced stability, each database has its own supervisor scheduler, with each scheduled job executed by a separate background worker.
pgpro_scheduler waits in the suspended state on
a standby server to be started when the standby is promoted to a primary
server.
Note that for all the executed jobs, the pg_stat_activity view will still show the name of the database superuser, which is used by the background worker.
The pgpro_scheduler extension is included into
Postgres Pro Enterprise. Once you have
Postgres Pro Enterprise installed, complete the
following steps to enable pgpro_scheduler:
Add pgpro_scheduler to the
shared_preload_libraries parameter
in the postgresql.conf file:
shared_preload_libraries = 'pgpro_scheduler'
Create the pgpro_scheduler extension using the following query:
CREATE EXTENSION pgpro_scheduler;
Make sure to create the pgpro_scheduler extension for each database you are planning to use.
Once you complete the installation and setup, configure
pgpro_scheduler for your database.
You must have superuser rights to configure pgpro_scheduler.
To configure pgpro_scheduler, modify the following settings
in the postgresql.conf file:
Specify the names of the databases for which you need to schedule jobs, in the comma-separated format:
schedule.database= 'database1,database2'
To control the workload in your system, set the maximum number of background workers that can run simultaneously on each database:
schedule.max_workers = 5
Optionally, set the number of background workers available for one-time job execution:
schedule.max_parallel_workers = 3
By default, two background workers for one-time jobs are available. These workers are not included into the schedule.max_workers number. Thus, one-time jobs can run in parallel with the scheduled jobs even if all the schedule.max_workers workers are busy.
Run pg_reload_conf() for the changes to
take effect:
SELECT pg_reload_conf();
When setting the
schedule.max_workers,
schedule.max_parallel_workers, and
schedule.database variables, make sure that enough
background workers remain available from the total pool of workers
established by max_worker_processes. Other
Postgres Pro subsystems may also use background
workers.
For detailed instructions on calculating the required number of background workers and for configuration examples, see Section F.54.3.
You can also dynamically configure pgpro_scheduler from the command line. In this case, you can set different number of workers for different databases:
ALTER SYSTEM SETschedule.database= 'database1,database2'; ALTER DATABASEdatabase1SETschedule.max_workers= 5; ALTER DATABASEdatabase2SETschedule.max_workers= 3; ALTER SYSTEM SETschedule.max_parallel_workers= 3; SELECTpg_reload_conf();
Once pgpro_scheduler is configured, enable
it on your system, as follows:
SELECT schedule.enable();
If this function returns true, pgpro_scheduler
is ready to use, and you can start scheduling jobs as explained in
Section F.54.4.1
and Section F.54.4.2.
If you restart the server, pgpro_scheduler
is not automatically restarted by default. To change this behavior,
you can set the schedule.auto_enabled
parameter to on.
See Also
To calculate the maximum number of background workers used by the
pgpro_scheduler extension, use the following formula:
1 + N * (1 + schedule.max_workers + schedule.max_parallel_workers)
1: The global supervisor worker
for the pgpro_scheduler extension.
N: The number of databases listed in the
schedule.database parameter. Each database runs
its own database manager and other background
workers.
schedule.max_workers: The maximum number of
background workers for scheduled jobs.
schedule.max_parallel_workers: The maximum number
of background workers for one-time jobs.
Note that this formula calculates the total number of background workers
required for the general configuration based on the global default
values set using ALTER SYSTEM. To optimize
performance and to meet task-specific requirements, tune the
schedule.max_workers and
schedule.max_parallel_workers parameters for each
database listed in schedule.database and include
these values in the total worker count.
The max_worker_processes parameter must provide worker
processes for both pgpro_scheduler and other
Postgres Pro subsystems. Before using
pgpro_scheduler for the first time, increase
max_worker_processes by the value obtained from the
formula above. Adjust this parameter whenever you modify the
pgpro_scheduler configuration.
For example, if you work with two databases and set
schedule.max_workers to 5 and
schedule.max_parallel_workers to 3,
pgpro_scheduler may use up to
1 + 2 * (1 + 5 + 3) = 19 background workers. Consequently,
you should increase the max_worker_processes value by
19.
Consider a more complex example: you decide to enable the extension for a third database. You configure the parameters as follows:
For the third database, set schedule.max_parallel_workers
to 2 and keep its schedule.max_workers set to 5
(default).
For the second database, set schedule.max_workers
to 2.
In this case, the total number of workers required for
pgpro_scheduler would be:
1 + (1 + 5 + 3) + (1 + 2 + 3) + (1 + 5 + 2) = 24
You would also need to increase the max_worker_processes
value by additional 5 workers (24 - 19 = 5).
If all background workers within this pool are busy, jobs will wait for the next available worker. This can lead to delays in the execution of scheduler jobs. Scheduled and one-time jobs are placed into separate queues to manage their execution.
If required, you can later change the number of workers. To check the
extension status, use the schedule.status() function.
If you see jobs in the submitted state, verify
that enough background workers are allocated.
Changes to the schedule.max_workers and
schedule.max_parallel_workers parameters do not affect
the running jobs.
To create and schedule a job, run the
create_job() function that takes scheduling
options as a jsonb object:
schedule.create_job(options jsonb)
In the jsonb object, you must specify one or more SQL commands
in the commands key, and set the job schedule
with at least one of the following keys:
dates — a single date or an array of
dates, in the timestamp with time zone format
cron — a string, in the crontab
format. A traditional five-field crontab format is used. The
first field stands for minute, the second
— for hour, the third — for day of the month, the
fourth — for month, and the fifth — for day of the
week.
┌── minute (0 - 59) │ ┌─── hour (0 - 23) │ │ ┌─── day of the month (1 - 31) │ │ │ ┌──── month (1 - 12) │ │ │ │ ┌──── day of the week (0 - 6) (Sunday to Saturday) │ │ │ │ │ * * * * *
A six-field crontab format can be used alongside the
traditional five-field format. In this case the first
field stands for second. When you use the six-field
format and do not want to specify a second you have to put 0
in the first field.
Alternatively, the following keywords can be used instead of
a crontab string to specify when the job will be started:
@every_second — each second
@hourly — at the beginning of each hour
@daily — at the beginning of each day
@midnight — at the beginning of each day
@weekly — at the beginning of each week
@monthly — at the beginning of each month
@yearly — at the beginning of each year
@annually — at the beginning of each year
rule — a jsonb object that includes one
or more of the following keys:
seconds — seconds; an array of integers
in range [0, 59]
minutes — minutes; an array of integers
in range [0, 59]
hours — hours; an array of integers in
range [0, 23]
days — days of the month; an array of
integers in range [1, 31]
months — months; an array of integers in
range [1, 12]
wdays — days of the week; an array of
integers in range [0, 6], where 0 is Sunday.
onstart — integer value 0 or 1. If
onstart is set to 1, the job is executed
only once when pgpro_scheduler is
started.
You can combine dates,
cron, and rule scheduling
keys for advanced use cases.
As a result, pgpro_scheduler creates an
active job with the specified schedule and returns the job ID.
For simple job schedules, you can use the following shortcut syntax:
schedule.create_job(cron,commands) schedule.create_job(dates,commands)
For details, see schedule.create_job() function description.
If required, you can later modify one or more scheduling options
with the set_job_attribute() or
set_job_attributes() functions, respectively.
If all background workers are busy at the specified time, the
job waits for the next available worker. By default, the
job can wait forever. You can
limit the maximum wait time by setting the
last_start_available key, in the time interval format.
If the timeout is reached, pgpro_scheduler
cancels the job execution.
Examples:
To run the job every day at 3pm, and, additionally, on December 31, 2017 at 7pm , and on April 4, 2020 at 1pm:
SELECT schedule.create_job('{"commands": "SELECT 15", "cron": "0 15 * * *", "dates": [ "2017-12-31 19:00", "2020-04-04 13:00" ]}');
To limit the wait time for job execution to 30 seconds after the scheduled time:
SELECT schedule.create_job('{"commands": "SELECT pg_sleep(100)", "cron": "15 */2 * * *", "last_start_available": "30 seconds" }');
Both for scheduled and one-time jobs, you cannot
manage their main transactions, namely
using COMMIT and
ROLLBACK.
However, you can create and manage autonomous transactions.
In addition to the general schedule, you can specify the
timeframe during which the scheduled job can be executed. To
ensure that pgpro_scheduler only executes
the job within the specified time window, define the
start_date and end_date
keys, in the timestamp with time zone format. You can set one
of these keys only to limit the start or the end time,
respectively. If you define a time window for the job,
pgpro_scheduler will only schedule this job
within this time window. If the started job is incomplete when
the specified time window ends,
pgpro_scheduler completes the job and then
excludes the job from further scheduling.
Examples:
To start scheduling the job only after 11am on May 1, 2017:
SELECT schedule.create_job('{"commands": "SELECT now()", "cron": "2 17 * * *", "start_date": "2017-05-01 11:00" }');
To schedule the job in the timeframe from 11am on May 1 to 3pm on June 4, 2017:
SELECT schedule.create_job('{"commands": "SELECT now()", "cron": "2 17 * * *", "start_date": "2017-05-01 11:00", "end_date": "2017-06-04 15:00" }');
The commands key can have values of text
and array types. If you specify several SQL commands as text
separated by semicolons, the whole job is executed in a single
transaction. If it is critical to perform each SQL command in
a separate transaction, pass the SQL commands as an array. You
can modify this behavior by setting the
use_same_transaction key to true. In this
case, SQL commands in the array are executed in a single transaction.
Examples:
To run the whole job in a single transaction:
SELECT schedule.create_job('{"commands": "SELECT 1; SELECT 2; SELECT 3;", "cron": "23 23 */2 * *" }');
To run commands in separate transactions:
SELECT schedule.create_job('{"commands": [ "SELECT 1", "SELECT 2", "SELECT 3" ], "cron": "23 23 */2 * *" }');
To run the whole job in a single transaction when passing the commands as an array:
SELECT schedule.create_job('{"commands": [ "SELECT 1", "SELECT 2", "SELECT 3" ], "cron": "23 23 */2 * *", "use_same_transaction": true }');
For repeated jobs, the next start time can be computed by an SQL statement specified in the
next_time_statement key. In this case, the
first job starts on schedule, while all the successive job
runs occur at the computed times.
After the job run completes,
pgpro_scheduler executes the SQL statement
in the next_time_statement key to calculate
the next start time and returns the result, in the timestamp
with time zone type. If the return value is of a different
type or an error occurs, pgpro_scheduler
marks the job as broken and cancels any further execution.
This process is repeated for each successive job run.
When the job run
completes, pgpro_scheduler sets the
transaction state in the
schedule.transaction_state variable, in the
text format. You can use this variable in your
next_time_statement to dynamically
calculate the next start time depending on the transaction
state. At the time of the
next_time_statement execution, the
schedule.transaction_state variable must
contain either success or
failure state values for the main transaction.
Other values may indicate an internal
pgpro_scheduler error.
Examples:
To run the job first at 10:45, and then in a day after the job completes:
SELECT schedule.create_job('{"commands": "SELECT random()", "cron": "45 10 * * *", "next_time_statement": "SELECT now() + ''1 day''::interval" }');
The pgpro_scheduler extension enables you
to define additional conditions for task execution:
Set time limits for job execution with the
max_run_time key. If the execution time is
exceeded, pgpro_scheduler cancels the job.
Define the maximum time a scheduled job can wait for execution
using the last_start_available
key. If the timeout is reached, pgpro_scheduler cancels the job.
Schedule a job to be executed with the rights of another user by specifying the
run_as key. You must have superuser rights to use this key.
Define an SQL command to execute if the main command fails using the onrollback key.
Examples:
To limit job execution to 5 seconds:
SELECT schedule.create_job('{"commands": "SELECT pg_sleep(10)", "cron": "15 */10 * * *", "max_run_time": "5 seconds" }');
To limit the wait time for job execution to 30 seconds after the scheduled time:
SELECT schedule.create_job('{"commands": "SELECT pg_sleep(100)", "cron": "15 */2 * * *", "last_start_available": "30 seconds" }');
To start the job with the rights of the robot
user:
SELECT schedule.create_job('{"commands": "SELECT session_user", "cron": "5 */5 * * *", "run_as": "robot" }');
To define a fallback SQL command in case the main command fails:
SELECT schedule.create_job('{"commands": "SELECT ''zzz''", "cron": "55 */12 * * *", "onrollback": "SELECT ''Cannot select zzz''" }');
You can submit jobs for one-time execution using the schedule.submit_job() function. Such jobs use a separate pool of background workers defined by the schedule.max_parallel_workers variable, and can run in parallel with the scheduled jobs. By default, two one-time jobs can run concurrently. If you submit more jobs, they will wait in the queue for the next available background worker.
To execute a one-time job immediately, pass SQL commands in the query argument. For example:
schedule.submit_job(query := 'select 1');
Instead of passing SQL query parameters directly, you can define numbered placeholders in the query argument, such as $1 and $2, and pass an array of parameters in the params argument, with each array element corresponding to a placeholder. For brevity, you can omit the query and params names:
schedule.submit_job(query := 'select $1, $2', params := '{"text 1", "text 2"}')
To start a one-time job at the specified time, use the run_after argument:
schedule.submit_job('select ''flowers''', run_after := '2017-03-08 08:00:01');
Alternatively, you can delay the job start until the specified jobs are complete using the depends_on argument. For example, to run a job after completing the jobs with 23, 15, and 334 IDs, run:
schedule.submit_job('select ''well done''', depends_on := '{23, 15, 334}')
If required, you can repeat the job execution by passing the schedule.resubmit() function as part of the query argument. For example:
schedule.submit_job('select 1, schedule.resubmit(run_after := ''5'')');
The run_after argument specifies the time interval before the job is restarted, in seconds. By default, the interval is 1 second.
The resubmitted job cannot be executed more than the number of times set in the
resubmit_limit argument. If this limit is reached, the job receives the done status, with the corresponding error message.
If you want to cancel a resubmitted job, run:
schedule.cancel_job(job_idbigint);
To monitor one-time jobs, use the job_status and all_job_status pgpro_scheduler views.
For details on all the functions available for managing one-time jobs, see Section F.54.5.6.3.
When you create a new job with the
create_job() function, the job becomes active and
waits for execution based on the specified schedule. Using the
job ID returned by the create_job() function,
you can change the scheduling settings or remove the job from
the schedule. To change the specified schedule for
the jobs, use set_job_attribute() or
set_job_attributes() functions:
To modify a
single property of the job, run the
set_job_attribute() function with the job ID,
the property name to change, and the new value for this
property.
To modify more than one property of the job, run the
set_job_attributes() function instead. In
this case, you can specify all the job properties at once in a
jsonb object. For details on all the keys available for job
scheduling, see the create_job() function
description.
To temporarily exclude the job from scheduling, run the
deactivate_job() function:
schedule.deactivate_job(job_id integer)
You can re-activate the job later by running the
activate_job() function:
schedule.activate_job(job_id integer)
To permanently remove the job from the schedule, run the
drop_job() function:
schedule.drop_job(job_id integer)
You must have
superuser rights to monitor job execution for the whole
system. Otherwise, you can only monitor the jobs that you own.
To monitor scheduled jobs, pgpro_scheduler
provides multiple functions that return cron_rec or cron_job
records:
get_job() — retrieves information about
the job.
get_owned_cron() — retrieves the list of jobs
owned by user.
get_cron() — retrieves the list of jobs executed
by user.
get_active_jobs() — returns the list of
jobs executed at the moment of the function call.
get_log() — returns the list of all
completed jobs.
get_user_log() — returns list of the
completed jobs executed by the specified user.
clean_log() — deletes all records with
information about completed jobs.
To learn more about each function, see Section F.54.5.6.
pgpro_scheduler enables you to audit
job scheduling to rule out human error if you observe
unexpected changes in scheduled job execution.
By default, pgpro_scheduler does not store
information on schedule changes. To enable this feature, set the
schedule.enable_history parameter to true.
Once this parameter is enabled, pgpro_scheduler
stores schedule modifications in the schedule.cron__history
table, and logs all deleted jobs in the schedule.cron__deleted
table. Logged history is never deleted from these tables,
so a superuser can review schedule changes introduced by
all users at any time.
For details on logged information, see Section F.54.5.5.
Using pgpro_scheduler, you can manage
scheduled and one-time jobs on a cluster configured
with multimaster.
pgpro_scheduler can only manage jobs
on the node on which it is installed. Thus, you must
install and enable pgpro_scheduler
on all nodes on which you would like to schedule jobs.
pgpro_scheduler instances will manage
jobs on different nodes independently, but the executed jobs
will be replicated to other nodes.
Even if you are planning to schedule jobs on a single node only,
it is recommended to enable pgpro_scheduler
on several nodes. In this case, if a node with
scheduled jobs fails, another pgpro_scheduler
instance picks up these jobs.
If pgpro_scheduler is running on more than
one node, the node with the smallest node ID is selected.
The naming pattern of node IDs is defined by the
schedule.nodename GUC variable.
schedule.enabled (boolean)
#
Deprecated.
Specifies whether pgpro_scheduler is enabled on your system.
Default: false.
For pgpro_scheduler 2.5 or higher, you can
set the schedule.auto_enabled
parameter to control whether pgpro_scheduler
is enabled at the server start, or use
schedule.enable()/schedule.disable()
functions to enable/disable pgpro_scheduler on demand.
To check if pgpro_scheduler is currently running,
use the schedule.is_enabled() function.
schedule.auto_enabled (boolean)
#
Specifies whether to enable pgpro_scheduler at the server start.
Default: false.
schedule.database (text)
#
Specifies the databases for which pgpro_scheduler is enabled.
Database names must be separated by commas.
Default: empty string.
schedule.database_to_connect (text)
#
The database to which pgpro_scheduler gets connected
to receive Postgres Pro Enterprise cluster metadata.
The specified database cannot be dropped while pgpro_scheduler
is running. You can change this parameter only when restarting the server.
Default: postgres.
schedule.schema (text)
#Deprecated. Specifies the name of a schema where the scheduler stores its tables and functions. If you need to change the default schema, use ALTER EXTENSION.
Default: schedule.
schedule.nodename (text)
#
Specifies the name of the cluster node on which
pgpro_scheduler is running.
Do not change or use this variable if you run a single-server cluster configuration.
On a cluster configured with multimaster, the node name is
derived from the node ID provided by multimaster. For
example, if the node ID is 3, the schedule.nodename
variable is set to mtm-node-3. However, if you explicitly set
the schedule.nodename variable by editing the postgresql.conf
file or running the ALTER command, pgpro_scheduler
will ignore the node ID and use the provided value instead.
Default: primary.
schedule.max_workers (integer)
#Specifies the maximum number of simultaneously running scheduled jobs in one database.
Default: 2.
schedule.max_parallel_workers (integer)
#Specifies the maximum number of parallel threads that can be used for executing one-time jobs.
Default: 2.
schedule.transaction_state (text)
#
An internal variable containing the state of the
executed job. pgpro_scheduler uses this variable when
calculating the next job start time. Possible values are:
success — transaction has finished successfully.
failure — transaction has failed to finish.
running —
transaction is in progress.
undefined — transaction has
not started yet.
At the time of the
next_time_statement execution, the
schedule.transaction_state variable must
contain either success or
failure state values.
Other values may indicate an internal
pgpro_scheduler error.
schedule.enable_history (boolean)
#
Log all schedule changes, including the name of the
user who initiated the change and the time when this change occurred.
If a new job is added, or the schedule of an existing job is
modified, this information is stored in the schedule.cron__history
table. If a job is deleted, this information is stored in the
schedule.cron__deleted table.
If you later disable the schedule.enable_history parameter,
the history of the already recorded changes is preserved.
Default: false
To store its internal tables and functions,
pgpro_scheduler uses the
schedule SQL schema.
Direct access to tables is not recommended and should not be attempted.
To manage job scheduling, use the functions
defined by the pgpro_scheduler extension.
pgpro_scheduler defines the following types that
are used by some of the pgpro_scheduler functions.
This type contains information about the scheduled job.
CREATE TYPE schedule.cron_rec AS(
id integer, -- job ID
node text, -- name of the node
-- on which to execute the job
name text, -- job name
comments text, -- comments about the job
rule jsonb, -- scheduling rules
commands text[], -- SQL commands to be executed
run_as text, -- username of the job executor
owner text, -- username of the job owner
start_date timestamptz, -- lower bound of the execution window;
-- NULL if unbound
end_date timestamptz, -- upper bound of the execution window;
-- NULL if unbound
use_same_transaction boolean, -- true if an array of SQL
-- commands will be executed
-- in a single transaction
last_start_available interval, -- maximum wait time for
-- the scheduled job if all
-- allowed workers are busy
max_run_time interval, -- maximum execution time
onrollback text, -- SQL statement to execute
-- if the main transaction fails
max_instances int, -- maximum number of simultaneously
-- running job instances
next_time_statement text, -- SQL statement to calculate
-- the next start time
active boolean, -- true if job is scheduled
-- successfully
broken boolean -- true if job has errors in
-- configuration that prevented
-- its further execution
);
This type contains information about a particular job execution.
CREATE TYPE schedule.cron_job AS(
cron integer, -- job id
node text, -- name of the node
-- on which to execute the job
scheduled_at timestamptz, -- scheduled execution time
name text, -- job name
comments text, -- comments about the job
commands text[], -- SQL statement to be executed
run_as text, -- username of the job executor
owner text, -- username of the job owner
use_same_transaction boolean, -- true if an array of SQL
-- commands will be executed
-- in a single transaction
started timestamptz, -- timestamp of the job execution start
last_start_available timestamp, -- maximum wait time for
-- the scheduled job if all
-- allowed workers are busy
finished timestamptz, -- timestamp of the job
-- execution finish
max_run_time interval, -- maximum execution time
onrollback text, -- SQL statement to execute if the main
-- transaction fails
next_time_statement text, -- SQL statement to calculate
-- the next start time
max_instances int, -- the number of simultaneously
-- running job instances
status job_status_t, -- status of the task:
-- working, done, or error
message text -- error message
);
Enumerated type. Can take the following values:
working — the job is being executed.
done — job execution is complete.
error — job execution has failed.
Enumerated type. Can take the following values:
submitted — the job is submitted into the queue, but the execution has not started yet.
processing — the job is being executed.
done — job execution is complete.
Enumerated type. Can take the following values:
periodical — a scheduled job.
onetime — a one-time job.
Enumerated type. Can take the following values:
inprogress — the job is being executed.
done — job execution is complete.
error — job execution has failed.
submitted — the job is submitted into the
queue, but the execution has not started yet.
pgpro_scheduler provides several views for monitoring execution status of one-time jobs.
Shows the status of one-time jobs belonging to the current user.
Table F.36. job_status View
| Column Name | Column Type | Description |
|---|---|---|
id
|
bigint
| Job ID. |
node
|
text
| Name of the node on which the job is being executed. |
name
|
text
| Name of the job. |
comments
|
text
| Comments about the job. |
run_after
|
timestamp with time zone
| Timestamp after which the job execution must start. |
query
|
text
| SQL commands executed by the job. |
params
|
text[]
| An array of parameters for the SQL query. |
depends_on
|
bigint[]
| An array of job IDs on which the job execution depends. |
run_as
|
text
| User or role whose rights are used to execute the job. |
attempt
|
bigint
| The number of execution attempts. |
resubmit_limit
|
bigint
| The maximum number of allowed job resubmissions. |
max_wait_interval
|
interval
| The maximum time interval to postpone the job execution if all background workers are busy at the scheduled moment. |
max_duration
|
interval
| Time interval during which the job can be executed. |
submit_time
|
timestamp with time zone
| Time when the job was submitted to the execution queue. |
canceled
|
boolean
| Specifies whether the job was canceled by user. |
start_time
|
timestamp with time zone
| Job execution start time. |
is_success
|
boolean
|
|
error
|
text
| Error message. |
done_time
|
timestamp with time zone
| Time when the job execution completed. |
status
|
job_at_status_t
| Job status. See the Section F.54.5.3.4 for details. |
Shows the status of all one-time jobs. You must have superuser rights to access this view.
Table F.37. all_job_status View
| Column Name | Column Type | Description |
|---|---|---|
id
|
bigint
| Job ID. |
node
|
text
| Name of the node on which the job is being executed. |
name
|
text
| Name of the job. |
comments
|
text
| Comments about the job. |
run_after
|
timestamp with time zone
| Timestamp after which the job execution must start. |
query
|
text
| SQL commands executed by the job. |
params
|
text[]
| An array of parameters for the SQL query. |
depends_on
|
bigint[]
| An array of job IDs on which the job execution depends. |
run_as
|
text
| User or role whose rights are used to execute the job. |
owner
|
text
| The user who created the job. |
attempt
|
bigint
| The number of execution attempts. |
resubmit_limit
|
bigint
| The maximum number of allowed job resubmissions. |
max_wait_interval
|
interval
| The maximum time interval to postpone the job execution for if all background workers are busy at the scheduled moment. |
max_duration
|
interval
| Time interval during which the job can be executed. |
submit_time
|
timestamp with time zone
| Time when the job was submitted to the execution queue. |
canceled
|
boolean
| Specifies whether the job was canceled by user. |
start_time
|
timestamp with time zone
| Job execution start time. |
is_success
|
boolean
|
|
error
|
text
| Error message. |
done_time
|
timestamp with time zone
| Time when the job execution completed. |
status
|
job_at_status_t
| Job status. See the Section F.54.5.3.4 for details. |
The following tables store all schedule changes if the
schedule.enable_history parameter is set to true.
If you later disable the schedule.enable_history parameter,
the history of the already recorded changes is preserved.
Registers job scheduling changes. Whenever a new job is scheduled, or the schedule for an existing job is changed, a new row is inserted into this table to record the following information:
All details about the scheduled job, as defined by the
cron_rec data type. For details on the
cron_rec type, see Section F.54.5.3.
submitter — name of the user who updated the schedule.
version_id — a unique ID for each registered change
in the schedule.
submit_time — time when the schedule was updated.
Registers all jobs that were removed from the schedule:
cron — ID of the deleted job.
submitter — name of the user who deleted the job.
submit_time — time when the job was deleted.
pgpro_scheduler provides two separate
sets of functions for managing scheduled and one-time jobs,
as well as several common functions that can toggle
pgpro_scheduler on and off for your database
and show the current status of the extension:
With each job, you can only use the function specifically tailored for this job type.
These functions facilitate pgpro_scheduler
management.
schedule.enable()
#
Enables pgpro_scheduler for the current
Postgres Pro Enterprise instance.
Return values:
true if pgpro_scheduler
is enabled and ready to use.
false if the command has failed.
schedule.is_enabled()
#
Checks whether pgpro_scheduler is enabled.
Return values:
true if pgpro_scheduler
is enabled and ready to use.
false if pgpro_scheduler
is not currently running.
schedule.disable()
#
Disables pgpro_scheduler for the current
Postgres Pro Enterprise instance.
Return values:
true if pgpro_scheduler
is disabled.
false if the command has failed.
schedule.start()
#
Launches pgpro_scheduler for the currently
connected database.
Return values:
true — pgpro_scheduler
started successfully.
false — if the command has failed, or
pgpro_scheduler is already started.
schedule.stop()
#
Stops pgpro_scheduler for the currently
connected database.
Return values:
true — pgpro_scheduler
is stopped.
false — if the command has failed, or
pgpro_scheduler is not running.
schedule.status()
#
Returns the status of pgpro_scheduler
background workers:
pid — process ID of the background worker.
If the process ID is NULL,
the background worker is not running.
database — name of the database to
which the background worker is connected.
type — type of the background worker:
supervisor — distributes the
scheduled jobs between the databases.
database manager distributes the
scheduled jobs within the database.
cron job executor executes a scheduled job.
at job executor executes a one-time job.
schedule.version()
#
Returns pgpro_scheduler version.
schedule.create_job(options jsonb)
#Creates an active job and returns the job ID.
Alternative Syntax:
schedule.create_job(crontext,commandstext[,nodetext]) schedule.create_job(crontext,commandstext[] [,nodetext]) schedule.create_job(datestimestamp with time zone,commandstext[,nodetext]) schedule.create_job(datestimestamp with time zone, commandstext[][,nodetext]) schedule.create_job(datestimestamp with time zone[],commandstext[,nodetext]) schedule.create_job(datestimestamp with time zone[],commandstext[][,nodetext])
Arguments:
options — a jsonb object defining all the job properties. You do not need to define other parameters if the data is set. All the available jsonb keys are listed in Table F.38.
Type: jsonb
cron — a crontab-like string defining the job schedule.
Type: text
dates — the exact date or an array of dates for job execution.
Type: timestamp with time zone, timestamp with time zone[]
commands — SQL statement to execute. You can pass one or more SQL
commands separated by semicolons, or an array of SQL
commands. When passed as an array, SQL commands are
executed in separate transactions.
Type: text, text[]
node — the name of the node on which the scheduled jobs
run. Optional. You may need to specify this argument if you are
scheduling jobs on a multi-master cluster.
Type: text
Return values:
ID of the created job.
Table F.38. jsonb Keys for Job Scheduling
| Key | Type | Description |
|---|---|---|
cron
|
text
|
A crontab-like string defining the job schedule.
A traditional five-field or nonstandard six-field (seconds
in the first field)
|
dates
|
timestamp with time zone, timestamp with time zone[]
|
The exact date or an array of dates when the scheduled
job will be executed. You can combine
dates with rule
and cron keys, but at least one of
them is mandatory.
|
rule
|
jsonb
|
A
|
commands
|
text, text[]
|
SQL statements to execute. You can pass one or more SQL
statements separated by semicolons, or an array of SQL
statements. When passed as an array, SQL statements are
executed in separate transactions by default. You can
change this behavior by setting the
use_same_transaction key.
|
name
|
text
| Optional. Job name. |
node
|
text
| Optional. The name of the node on which the scheduled jobs run. You may need to specify this argument if you are scheduling jobs on a multi-master cluster. |
comments
|
text
| Optional. Comments about the scheduled job. |
run_as
|
text
| Optional. The user whose rights are used to execute the job. |
start_date
|
timestamp with time zone
|
Optional. The start of the timeframe when the scheduled
job can be executed. This key can be NULL.
|
end_date
|
timestamp with time zone
|
Optional. The end of the timeframe when the scheduled
job can be executed. This key can be NULL.
|
use_same_transaction
| boolean |
Optional. If set to true, forces an array of SQL
statements to be executed in a single transaction.
Default: false
|
last_start_available
|
interval
|
Optional. The maximum time interval to postpone the job
execution for if all background workers are
busy at the scheduled moment. For example, if this
key is set to '00:02:34', the job will wait for 2
minutes 34 seconds. If this key is NULL or not set, the
job can wait forever. Default: NULL.
|
max_instances
|
integer
| Optional. The maximum number of job instances that can be executed simultaneously. Default: 1. |
max_run_time
|
interval
|
Optional. The maximum time interval during which the
scheduled job can be executed. If this key is NULL or
not set, there are no time limits. Default: NULL.
|
onrollback
|
text
| Optional. SQL statement to be executed if the main transaction fails. |
next_time_statement
|
text
| Optional. SQL statement to calculate the start time for the next job execution. For details, see Section F.54.4.1.3. |
schedule.set_job_attributes(job_id integer,
data jsonb)
#Updates properties of the existing job.
Arguments:
job_id — identifier of the existing job.
data — a jsonb object with properties to be edited.
For the list of keys and their structure, see Table F.38.
Return values:
true — job properties were updated successfully.
false — job properties were not updated.
To update the job properties, you must be the owner of the job or have superuser rights.
schedule.set_job_attribute(job_id integer,
name text,
value text || anyarray)
#Updates a property of the existing job.
Arguments:
job_id — identifier of the existing job.
name — property name.
value — property value.
See Table F.38 for the list of job properties you can update. Some values are of array types. They should be passed as an array. If a value of a wrong type is passed, an exception is raised.
Return values:
true — job property was updated successfully.
false — job property was not updated.
To update the job properties, you must be the owner of the job or have superuser rights.
schedule.deactivate_job(job_id integer)
#Deactivates the job and suspends its further scheduling and execution.
Arguments:
job_id — identifier of the existing job.
Return values:
true — the job is deactivated successfully.
false — job deactivation failed.
schedule.activate_job(job_id integer)
#Activates a job and starts its scheduling and execution.
Arguments:
job_id — identifier of the existing job.
Return values:
true — the job was activated successfully.
false — job activation failed.
schedule.drop_job(job_id integer)
#Deletes a job.
Arguments:
job_id — identifier of the existing job.
Return values:
true — the job was deleted successfully.
false — job was not deleted.
schedule.get_job(job_id integer)
#Returns information about the specified job.
Arguments:
job_id — identifier of the existing job.
Return values:
An object of type cron_rec.
For details on the cron_rec type, see Section F.54.5.3.
schedule.get_owned_cron(username text)
#Retrieves the list of jobs owned by the specified user.
Arguments:
username — username, optional.
Return values:
A set of records of type cron_rec.
These records contain information about all jobs owned by the specified user.
If the username is omitted, the session username is used.
You must have superuser rights to retrieve jobs owned by another user.
For details on the cron_rec type, see Section F.54.5.3.
schedule.get_cron()
#Retrieves the list of jobs executed by the session user.
Return values:
A set of records of type cron_rec.
These records contain information about all jobs executed by the
session user. You must have superuser rights to retrieve the jobs.
For details on the cron_rec type, see Section F.54.5.3.
schedule.get_active_jobs(username text)
#Returns the list of jobs currently being executed by the specified user.
Arguments:
username — username, optional.
If username is omitted, the session username is used.
You must have superuser rights to retrieve jobs executed by another user.
Return values:
A set of records of type cron_job.
For details on the cron_job type, see Section F.54.5.3.
schedule.get_active_jobs()
#Returns the list of jobs being currently executed. You must have superuser rights to call this function.
Return values:
A set of records of type cron_job.
For details on the cron_job type, see Section F.54.5.3.
schedule.get_log()
#Returns the list of all completed jobs. You must have superuser rights to call this function.
Return values:
A set of records of type cron_job.
For details on the cron_job type, see Section F.54.5.3.
schedule.get_user_log(username text)
#Returns the list of completed jobs executed by the specified user.
Arguments:
username — username, optional.
If username is omitted, the session username is used.
You must have superuser rights to retrieve the list of jobs executed by another user.
Return values:
A set of records of type cron_job.
For details on the cron_job type, see Section F.54.5.3.
schedule.clean_log()
#Deletes all records with information about the completed jobs. You must have superuser rights to call this function.
Return values:
The number of records deleted.
schedule.nodename()
#Returns the current node name.
schedule.submit_job(query text [options...])
#
Submits a job for immediate or delayed one-time execution.
By default, the job is scheduled for immediate execution and can run in parallel with other scheduled jobs. To submit a job with a delayed start, you can set the execution start time using the run_after argument, or pass an array of job IDs in the depends_on argument to schedule job execution right after these jobs are complete.
Arguments:
query — SQL commands to execute.
Type: text
params — an array of parameters for the SQL query that can
substitute numbered placeholders in the query argument, such as $1, $2, etc. Default: NULL
Type: text[]
run_after — a timestamp after which the
job execution starts. If this argument is set to NULL, the job is scheduled for immediate execution. You can also use the depends_on argument to delay the job start. Default: NULL
Type: timestamp with time zone
node — the name of the node on which
to execute the job. Default: NULL
Type: text
max_duration — the maximum time interval during which the job can be executed. If this time is exceeded, the job is forced
to stop. If this argument is NULL or not set, there are no time limits. Default: NULL
Type: interval
max_wait_interval — the maximum time interval to postpone the job
execution for if all background workers are busy
at the scheduled moment. For example, if this
key is set to '00:02:34', the job will wait for 2
minutes 34 seconds. If this key is NULL or not set, the
job can wait forever.
Default: NULL
Type: interval
run_as — user or role
whose rights are used to execute the job. If run_as is set
to NULL, the job is executed with the rights of the current user. You must have superuser rights to set this argument.
Default: NULL
Type: text
depends_on — an array of job IDs. The created job
starts immediately after the specified jobs complete the execution.
This argument is an alternative to run_after. Default: NULL
Type: bigint[]
name — name of the job. Default: NULL
Type: text
comments — comments about the job.
Type: text
resubmit_limit — maximum number of times
the job can be resubmitted for execution. See the
schedule.resubmit() function for details. Default: 100
Type: bigint
Return values:
ID of the created job.
Type: bigint
schedule.get_self_id()
#
Returns the ID of the job, in the context of which it was called.
The returned ID is of the bigint type. This function must be called inside the query of the schedule.submit_job() function. Otherwise, an exception is raised.
Return values:
Job ID.
schedule.cancel_job(job_id bigint)
#Cancels all subsequent runs of the specified job. If the job is currently being executed, it will not be interrupted, but cannot be resubmitted. You must have superuser rights or be the owner of the job to call this function.
Arguments:
job_id — identifier of the job to cancel.
Return values:
true if the operation completed successfully.
false if the operation failed.
schedule.resubmit(run_after interval default NULL)
#
Sets the start time for the next execution of the job, without interrupting the current job run.
This function must be called inside the query argument of the schedule.submit_job() function. Otherwise, an exception is raised. If this function is called several times within a single job execution, only the last function call is taken into account.
Arguments:
run_after — time interval after which the
job will be resubmitted for execution. If the time interval is
less than a second but greater than zero, it is rounded to
1 second. Intervals longer than 1 second are rounded to integral
values. If 0 is passed, the job is resubmitted immediately after
execution. Default: 1 second
Type: interval
Return values:
The number of seconds after which the job will be resubmitted for execution.
When many one-time jobs are submitted, the schedule.at_jobs_done
table may grow rapidly. To clean up this table,
pgpro_scheduler provides the following functions
with the same name but different argument types:
schedule.clean_at_jobs_done(older_than
interval, delete_failed_tasks
boolean, default NULL)
schedule.clean_at_jobs_done(older_than
timestamp with time zone,
delete_failed_tasks
boolean, default NULL)
Delete all records in schedule.at_jobs_done
that are older than either a specified timestamp or a specified
interval relative to the current time.
Arguments:
older_than — an interval or a
timestamp for cleanup. Records older than this value will be deleted.
Type: interval, timestamp with time zone
delete_failed_tasks — specifies
whether to delete failed jobs. Default: false.
Type: boolean
Return values:
The number of deleted records.
These functions can be called manually or scheduled using cron jobs in the pgpro_scheduler interface.
schedule.timetable(start_time timestamp with time zone,
end_timetimestamp with time zone)
#Returns a table, which describes all the jobs, both repeated and one-time, that are scheduled to execute within the specified time interval.
Table F.39. schedule.timetable Columns
| Column Name | Column Type | Description |
|---|---|---|
id
|
bigint
| Job ID, which is unique for the jobs of this type. |
type
|
timetable_job_type_t
| Job type. See Section F.54.5.3.5 for details. |
node
|
text
| Name of the node on which the job is being executed. |
name
|
text
| Name of the job. |
comments
|
text
| Comments about the job. |
commands
|
text[]
| An array of SQL commands executed by the job. |
scheduled_at
|
timestamp with time zone
| Scheduled job execution time. |
start_time
|
timestamp with time zone
| Job execution start time. |
done_time
|
timestamp with time zone
| Time when the job execution completed. |
status
|
timetable_job_status_t
| Job status. See Section F.54.5.3.6 for details. |
error
|
text
| Error message. |
Postgres Professional, Moscow, Russia