NODE_ERROR State
You can change the cluster composition by adding or removing nodes.
To add a node, use the bihactl add
command with the relevant options. To remove a node, use the
biha.remove_node function. For more
information on how to set up a high-availability cluster, see
Section 27.2.
Some configuration parameters must have the same value on all cluster nodes and can only be set up by special functions. For example, use the biha.set_heartbeat_max_lost function to set the biha.heartbeat_max_lost parameter value:
SELECT biha.set_heartbeat_max_lost(7);
All available functions for setting up cluster configuration parameters are listed in Section 27.4.2.1.
The parameters that can vary on different cluster nodes can be set with the
ALTER SYSTEM command. For example, you can change the
biha.can_vote parameter with
ALTER SYSTEM:
ALTER SYSTEM SET biha.can_vote = true; SELECT pg_reload_conf();
For more information about parameters and the ways they can be set up, see biha.
In addition to the built-in failover capabilities, the high-availability cluster in Postgres Pro allows for the switchover. The difference between failover and switchover is that the former is performed automatically when the leader node fails and the latter is done manually by the system administrator. To switch over the leader node, use the biha.set_leader function. When you set the new leader, the following happens:
All attempts to perform elections are blocked and the timeout is set.
The current leader node becomes the follower node.
The newly selected node becomes the new leader.
If the switchover process does not end within the established timeout, the selected node becomes the follower and new elections are performed to choose the new cluster leader.
You can enable or disable SSL to secure service connections in the initialized BiHA cluster by means of the biha.use_ssl configuration parameter.
To enable SSL, on all cluster nodes, set biha.use_ssl
to true:
ALTER SYSTEM SET biha.use_ssl = true;
To disable SSL, set the value to false.
Restart the nodes using pg_ctl.
Since enabling SSL in a BiHA cluster involves using the standard TLS handshake process, it is recommended to minimize the time between restarts of cluster nodes.
When you initialize the high-availability cluster, the biha_db
database is created as well as the biha extension
is created in the biha scheme of the
biha_db database. Besides, the following roles are created
and used:
BIHA_CLUSTER_MANAGEMENT_ROLE allows execution of all
biha extension functions.
BIHA_REPLICATION_ROLE is used when running
pg_rewind and pg_probackup.
biha_replication_user is a member of the
BIHA_REPLICATION_ROLE and BIHA_CLUSTER_MANAGEMENT_ROLE roles
and can request both client and replication connections.
It is used by the bihactl utility as well as when the follower node
is connected to the leader node. This role owns
the biha_db database.
The password for the biha_replication_user
role in the password file must be
the same on all nodes of the BiHA cluster.
The password for the role is prompted upon the BiHA
cluster creation.
biha_callbacks_user is the default user for
callback execution.
This user can connect to a database but has no privileges.
The predefined pg_monitor
is used to monitor the state of the BiHA cluster.
NODE_ERROR State #Errors occurred in biha or server instance processes listed below may cause the node failure, i.e. it will not be able to restart and will damage the WAL:
A rewind automatically performed by biha using pg_rewind if node timelines diverge.
The walreceiver process in case of timeline divergence. The follower node WAL may be partially rewritten by the WAL received from the leader node.
If the NODE_ERROR state is caused by other issues,
for example, replication slot overflow, you can fix the error root cause
and call the biha.reset_node_error function to
reset the NODE_ERROR state.
When the node goes into the NODE_ERROR state, the WAL
recovery is paused and the walreceiver process is stopped. Reading
from the nodes in the NODE_ERROR state is prohibited. Besides, the
error details are saved to the biha.state file and
checked upon the node restart, so the node will go into the same state when
the biha-background-worker process is launched.
To restore the node from the NODE_ERROR state, take the
following steps:
Save the most recent files from the pg_wal
directory, since some of the files unique to this node will be
rewritten by pg_rewind.
To save biha configuration files,
run pg_rewind with the --biha
option.
The --biha option is essential for saving
biha configuration files.
Using pg_rewind in a BiHA
cluster without --biha option may cause cluster
configuration inconsistency.
If the rewind has been successful, information about the
NODE_ERROR state is deleted from the
biha.state file. Besides, when you specify the
connection string in the --source-server option of
pg_rewind, it also automatically saves this
string for the primary_conninfo configuration
parameter in the
postgresql.auto.conf
file. This is important for the node to continue restoring after the
restart and reach the consistency point, which is the number of the
last record in the source server WAL at the time of the rewind.
(Optional) If the node was offline for a long time, to prevent the risk
of data corruption and obsolete data reads, set the
biha.flw_ro parameter of the restored node to
off.
You can check the results of the rewind, i.e. the rewind state of cluster
nodes, in the rewind_state field of the
biha.state file. The field contains the
enum values, which are interpreted as follows:
Table 27.1. The Rewind State
| Value | Interpretation |
|---|---|
0 | The rewind is not required. |
1 | The server is stopped, the biha.autorewind configuration parameter was enabled, the rewind will be performed after the server restart. |
2 | The rewind failed. |
3 | The rewind was performed successfully. |
4 | The biha.autorewind configuration parameter was not enabled, and the rewind must be performed manually as described in Section 27.3.6. |
BiHA allows you to create a cluster with quorum-based synchronous replication.
To configure synchronous replication, use the
--sync-standbys option when executing
bihactl init. This will modify
the synchronous_standby_names parameter by specifying
the list of all nodes with the keyword ANY. For example,
when setting the --sync-standbys
value to 2 for a three-node cluster, synchronous_standby_names
looks as follows:
ANY 2 (biha_node_1,biha_node_2,biha_node_3)
Also, the synchronous_commit parameter is used with
the default value on. Note that you cannot select specific
nodes for synchronous replication, but only set the number of nodes that
must run synchronously by specifying this number in
the --sync-standbys option.
Other nodes in the cluster will be replicated asynchronously, since
streaming replication is asynchronous by default. For more information, see
Section 26.2.8.
You can relax synchronous replication restrictions to allow the leader node
to continue operation while some of the followers are temporary unavailable.
To enable relaxed synchronous replication, specify the minimum number of
synchronous follower nodes in the --sync-standbys-min option.
If set, the option also changes the MIN field value of the
synchronous_standby_names parameter accordingly. The
synchronous_standby_gap parameter remains unchanged and
keeps its default value — 0. For example,
when setting the --sync-standbys-min
value to 0 for a three-node cluster, synchronous_standby_names
looks as follows:
ANY 2 MIN 0 (biha_node_1,biha_node_2,biha_node_3)
You can set the --sync-standbys-min value when
initializing the BiHA cluster by means of the
bihactl init command, and modify
it later with the biha.set_sync_standbys_min function.
There are a few points that need to be taken into account with regard to
synchronous replication and the referee node.
If the --mode option is set to referee,
the referee does not participate in synchronous replication. When set to
referee_with_wal, the node can synchronously receive data.
This mode allows the cluster to continue to be available in the
2+1 configuration with --sync-standbys=1
if the follower node is down and the referee node starts confirming transactions
for the leader node.
The referee behavior depends on the synchronous_commit
parameter value.
Note that with this parameter set to remote_apply
the referee does not confirm transactions.
biha logs messages sent by its components, i.e.
the control channel and the node controller. The control channel is used to
exchange service information between the nodes and is marked as
BCP in the log. The node controller is the
biha core component, which is responsible for the
node operation logic and is marked as NC in the log.
You can determine the types of messages to be logged by setting the
appropriate logging level. biha supports both
standard Postgres Pro message severity levels and
the extension logging levels.
Postgres Pro message severity levels are used by biha in the following cases:
A biha process ends with an error
(ERROR and FATAL levels).
A biha component is not covered by any logging level of the extension.
A message should be logged when component logging is disabled. For
example, LOG level messages sent by the control
channel, which are displayed only when the component is successfully
initialized.
biha logging levels are mapped to the
Postgres Pro message severity levels. If you want
messages of the required level to appear in the log, the value set for this
level should correspond to the value in log_min_messages.
The extension logging levels can be configured by editing the
postgresql.biha.conf file directly or by setting
biha configuration parameters, which are
listed in Section 27.4.1.2.
For example, to display messages related to the control channel, except for
the debugging information, specify the following in the
postgresql.biha.conf configuration file or set the
relevant configuration parameters. In this case, it is assumed that
log_min_messages is set to LOG:
biha.BcpTransportLog_log_level = LOG biha.BcpTransportWarn_log_level = LOG biha.BcpTransportDetails_log_level = LOG
A callback is an SQL function that notifies users or external services about events in the BiHA cluster, for example, about election of a new leader or change of cluster configuration. As a user, you create an SQL function and register it as a callback. Under certain conditions, the biha extension calls this function.
Timely alerting helps external services to provide proper reaction to events in the BiHA cluster. For example, after receiving information about the leader change, the proxy server redirects traffic to the new leader.
The following callback types are available:
Table 27.2. Callback Types
| Name | Description |
|---|---|
CANDIDATE_TO_LEADER |
Called on the node elected as the new leader. Signature: my_callback(void) RETURNS void |
LEADER_TO_FOLLOWER |
Called on the old leader returned to the cluster after demotion. Signature: my_callback(void) RETURNS void |
LEADER_CHANGED |
Called on every node when the BiHA cluster leader changes. Signature: my_callback(id integer, host text, port integer) RETURNS void In this signature, biha passes the new leader details: ID, host name, and node port number. |
LEADER_STATE_IS_RW |
Called on the leader and other nodes when the leader
changes its state to Signature: my_callback(id integer, host text, port integer) RETURNS void In this signature, biha passes to the callback the leader details: ID, host name, and node port number. |
TERM_CHANGED |
Called on the node when its Signature: my_callback(old_term integer, new_term integer) RETURNS void
In this signature, biha passes the old
and the new |
OFFERED_TO_LEADER |
Called on the node manually set to leader when it is about to become the leader. Signature: my_callback(void) RETURNS void |
NODE_ADDED |
Called on every node when a new node is added to the BiHA cluster. Signature: my_callback(id integer, host text, port integer, mode integer) RETURNS void
In this signature, biha passes to the
callback the new node details, such as its host name,
node port number, and operation mode: |
NODE_REMOVED |
Called on every node when a node is removed from the BiHA cluster. Signature: my_callback(id integer) RETURNS void In this signature, biha passes to the callback the ID of the removed node. |
When an event happens, callbacks are executed in sequence. At this time, biha cannot change its state, for example, initiate elections.
If callback execution takes longer than the biha.callbacks_timeout value, biha stops callback execution and continues normal operation.
On clusters with asynchronous replication, the biha.register_callback function does not wait for all nodes to receive callbacks. This may lead to a situation where callbacks are present on the leader, but not present on the follower as it lags behind.
Normally, callbacks are not executed on the referee
in the referee mode. However, if you have registered
callbacks on the leader before adding the referee, callbacks may be
executed on the referee and cannot be removed from it.
Do not call biha functions inside of a callback as this can cause unexpected behavior.
To manage callbacks, you can perform the following actions:
register one or several callbacks for a single event
view the list of registered callbacks
unregister callbacks
Registering Callbacks
Write an SQL function and then register it as a callback using biha.register_callback.
In this example, you create several SQL functions in PL/pgSQL and register them as different callback types.
Ensure that the leader node of your BiHA cluster is
in the LEADER_RW state.
On the leader node, use psql and
connect to the biha_db database:
postgres=# \c biha_db
Create the following callback functions:
-- log the node term change
CREATE FUNCTION log_term_changed(old_term integer, new_term integer)
RETURNS void AS $$
BEGIN
RAISE LOG 'Callback: Term changed from % to %', old_term, new_term;
END;
$$ LANGUAGE plpgsql;
-- log the election of the new leader
CREATE FUNCTION log_leader_changed(id integer, host text, port integer)
RETURNS void AS $$
BEGIN
RAISE LOG 'Callback: New leader is % %:%', id, host, port;
END;
$$ LANGUAGE plpgsql;
-- log that the leader was demoted
CREATE FUNCTION log_leader_to_follower()
RETURNS void AS $$
BEGIN
RAISE LOG 'Callback: demote';
END;
$$ LANGUAGE plpgsql;
Register the created functions:
SELECT biha.register_callback('TERM_CHANGED', 'log_term_changed', 'biha_db');
SELECT biha.register_callback('LEADER_CHANGED', 'log_leader_changed', 'biha_db');
SELECT biha.register_callback('LEADER_TO_FOLLOWER', 'log_leader_to_follower', 'biha_db');
You can also specify the user on which behalf the callbacks are executed or determine the order of callbacks execution. For more information, see the description of the biha.register_callback function.
Viewing Callbacks
Registered callbacks are added to the biha.callbacks
table located in the biha_db database.
To view all registered callbacks:
On the leader node, use psql and
connect to the biha_db database:
postgres=# \c biha_db
Display the content of the biha.callbacks table:
SELECT * FROM biha.callbacks; 1 | log_term_changed 2 | log_leader_changed 3 | log_leader_to_follower (3 rows)
Unregistering Callbacks
An unregistered callback is deleted from
the biha.callbacks table.
Ensure that the leader node of your BiHA cluster is
in the LEADER_RW state.
On the leader node, use psql and
connect to the biha_db database:
postgres=# \c biha_db
Get an ID of the callback that you want to unregister, for example,
log_leader_changed:
SELECT id FROM biha.callbacks WHERE func = 'log_leader_changed';
The callback ID is returned, for example, 2.
Unregister the callback:
SELECT biha.unregister_callback(2);
The callback is now unregistered.
If your database instance was restored from one of the nodes of the BiHA cluster to a separate node and/or using Point-in-Time Recovery (PITR), there must be no connection between the restored node and the operating BiHA cluster nodes. To prevent the connection, take the following steps on the restored node before you start it:
Remove the include 'postgresql.biha.conf' include directive
from the postgresql.conf configuration file.
Ensure that biha is not present in the shared_preload_libraries of the postgresql.conf file and, if applicable, of the postgresql.auto.conf file.
If you want to add the restored node to the cluster, take the following steps:
On the restored node, manually configure streaming replication from the leader.
Synchronize the restored node with the leader.
Stop the restored node using pg_ctl:
pg_ctl stop -D restored_node_PGDATA_directory
Add the restored node to the cluster using the bihactl
add command with the --convert-standby option.
Start the restored node using pg_ctl:
pg_ctl start -D restored_node_PGDATA_directory
Depending on the current and target Postgres Pro Enterprise
versions, you can either upgrade your BiHA cluster in the
LEADER_RO, or LEADER_RW state.
For more details, see the sections below.
The procedure of migration to version 16.8 differs depending on your current Postgres Pro Enterprise version:
to upgrade from version 16.4 and 16.6 to version 16.8, use the procedure described in Migrating the BiHA Cluster in the LEADER_RW State
to upgrade from version 16.3 and earlier to version 16.8, use the procedure described in Migrating the BiHA Cluster in the LEADER_RO State
The procedure of migration to version 16.6 differs depending on your current Postgres Pro Enterprise version:
to upgrade from version 16.4 to version 16.6, use the procedure described in Migrating the BiHA Cluster in the LEADER_RW State
to upgrade from version 16.3 and earlier to version 16.6, use the procedure described in Migrating the BiHA Cluster in the LEADER_RO State
To upgrade your BiHA cluster from Postgres Pro Enterprise 16.3 or earlier to 16.4, use the procedure described in Migrating the BiHA Cluster in the LEADER_RO State.
To upgrade your BiHA cluster from Postgres Pro Enterprise 16.1 and 16.2 to 16.3, use the procedure described in Migrating the BiHA Cluster in the LEADER_RW State.
LEADER_RW State #Use this procedure for the following migrations:
from version 16.4 and 16.6 to version 16.8
from version 16.4 to version 16.6
from versions 16.1 and 16.2 to version 16.3
Take the following steps:
Stop one of the followers using the pg_ctl command.
Upgrade the follower.
Start the follower using the pg_ctl command.
Promote the upgraded follower using the biha.set_leader function.
Stop, upgrade, and start the remaining followers and the old leader.
Promote the old leader node using the biha.set_leader function.
Note that if a node with the Postgres Pro Enterprise
16.1 version goes into the NODE_ERROR state, nodes with
newer versions may determine its state incorrectly, for example, as
REFEREE. In this case, it is recommended to stop the
node, upgrade its version, synchronize it using
pg_rewind, and start it once again.
LEADER_RO State #Use this procedure for the following migrations:
from versions 16.1, 16.2, and 16.3 to version 16.8
from versions 16.1, 16.2, and 16.3 to version 16.6
from versions 16.1, 16.2, and 16.3 to version 16.4
Use the biha.set_nquorum_and_minnodes function
to set the nquorum and minnodes
parameters to the value greater than the number of nodes in the cluster.
For example, if your cluster has 3 nodes, set the above mentioned
parameters to 4.
This is required to avoid unexpected leader elections and to
change the leader state from LEADER_RW to
LEADER_RO.
Wait for the followers to catch up with the leader and ensure that the
replay_lag column in the
pg_stat_replication
view is NULL.
Stop one of the followers using the pg_ctl command.
Upgrade the follower.
Start the follower using the pg_ctl command.
Promote the upgraded follower using the biha.set_leader function.
Stop, upgrade, and start the remaining followers and the old leader.
Promote the old leader using the biha.set_leader function.
Use the biha.set_nquorum_and_minnodes function
to set nquorum and minnodes to
the values, which were used before starting the
Postgres Pro Enterprise upgrade.
Note that if a node with the Postgres Pro Enterprise
16.1 version goes into the NODE_ERROR state, nodes with
newer versions may determine its state incorrectly, for example, as
REFEREE. In this case, it is recommended to stop the
node, upgrade its version, synchronize it using
pg_rewind, and start it once again.
You can either temporarily turn off, or completely remove the biha extension.
Turning off biha
To turn off the extension:
Remove the include 'postgresql.biha.conf' include directive
from the postgresql.conf configuration file.
Ensure that biha is not present in the shared_preload_libraries of the postgresql.conf file and, if applicable, of the postgresql.auto.conf file.
Removing biha completely
To remove the extension completely, do the following:
Execute the DROP EXTENSION
command. It must be executed on the leader in the LEADER_RW state and
from the biha_db database:
biha_db=# DROP EXTENSION biha;