This section describes the steps required to manage metrics.
postgrespro Receiver #
To collect metrics from the database instance, add the
postgrespro receiver to the
receivers section and specify its configuration.
Required configuration:
Specify the database instance connection parameters.
Specify the list of the plugins for data collection.
Additional configuration:
Collection parameters: parallelism, delay, interval.
receivers:
  postgrespro:
    max_threads: 3
    collection_interval: 60s
    initial_delay: 1s
    transport: tcp
    endpoint: localhost:5432
    database: postgres
    username: postgres
    password: ${env:POSTGRESQL_PASSWORD}
    metrics: null
    plugins:
      activity:
        enabled: true
      bgwriter:
        enabled: true
      locks:
        enabled: true
      version:
        enabled: true
      wal:
        enabled: true
      cache:
        enabled: true
Some plugins have additional configuration parameters. For
example, the plugins that collect metrics from DBMS objects
(tablespaces, databases, tables, indexes) can be configured so
that collection takes place only for a specified number of
objects. This lets you control the load on the database instance
and the amount of data sent through the pipeline to an exporter.
A detailed description of the configuration parameters for each
plugin can be found in the
/usr/share/doc/pgpro-otel-collector/examples
directory.
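As an illustration only, an object-limiting option for such a plugin might look like the fragment below. The parameter name max_objects is hypothetical; check the example files in that directory for the actual parameter names.

```yaml
receivers:
  postgrespro:
    plugins:
      databases:
        enabled: true
        # Hypothetical parameter: collect metrics for at most this many objects
        max_objects: 100
```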
The receiver can also use Unix sockets when the endpoint is defined as shown below.
receivers:
  postgrespro:
    ...
    transport: unix
    endpoint: /tmp:5432 # Or 'tmp:5432'
    ...
hostmetrics Receiver #
The hostmetrics receiver is an open-source
component of the OpenTelemetry Collector and is used for collecting
metrics from the operating system. For detailed information about this
receiver, refer to
the OpenTelemetry documentation.
To configure the hostmetrics receiver, it is
sufficient to list the plugins (scrapers) for data collection.
Collection parameters are also available: delay and interval.
Some plugins also have additional configuration parameters.
receivers:
  hostmetrics:
    collection_interval: 60s
    initial_delay: 1s
    scrapers:
      cpu:
        metrics:
          system.cpu.utilization:
            enabled: true
      disk: null
      load: null
      memory: null
      network: null
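For example, the upstream filesystem scraper supports mount-point filters; the fragment below is a sketch based on the upstream hostmetrics component (the mount points are placeholders):

```yaml
receivers:
  hostmetrics:
    scrapers:
      filesystem:
        # Collect metrics only from the listed mount points
        include_mount_points:
          mount_points: ["/", "/var"]
          match_type: strict
```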
sqlquery Receiver #
The sqlquery receiver is an open-source component of the
OpenTelemetry Collector for gathering metrics and/or logs from
custom SQL queries.
The sqlquery receiver is currently
experimental and is not recommended for production use.
To set up the sqlquery receiver for collecting
metrics, follow the example procedure below.
Create the sqlquery.yml configuration file.
In the created configuration file, specify database connection
parameters in the receivers.sqlquery section:
receivers:
  sqlquery:
    driver: postgres
    host: localhost
    port: 5432
    database: postgres
    username: postgres
    password: ${env:POSTGRESQL_PASSWORD}
    # Additional driver-specific connection parameters
    additional_params:
      application_name: pgpro-otel-collector
      sslmode: disable
    # The time interval between query executions. Default: 10s
    collection_interval: 60s
    # Defines setup for the component's own telemetry
    telemetry:
      logs:
        # If true, each executed query is logged at debug level
        query: false
    # The maximum number of open connections to the Postgres Pro server. Default: 0 (unlimited)
    max_open_conn: 5
The password parameter supports environment
variable substitution, as shown in the example. Special
characters in the credentials are automatically URL-encoded to ensure proper
connection string formatting.
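For example, the variable can be exported in the environment of the collector process before startup (the value is a placeholder):

```shell
# Export the password; the collector substitutes it for
# ${env:POSTGRESQL_PASSWORD} when reading the configuration file.
export POSTGRESQL_PASSWORD='p@ss w0rd!'
printf '%s\n' "$POSTGRESQL_PASSWORD"
```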
Alternatively, use the datasource parameter to
provide a complete connection string:
datasource: "host=localhost port=5432 user=postgres password=postgres application_name=pgpro-otel-collector sslmode=disable"
List the queries used to collect metrics. Each query consists of
an SQL statement and a metrics section. The
metrics section may describe several metrics, but at least one is
required. Each metric in the configuration produces one
OpenTelemetry metric per row returned by the SQL query.
receivers:
  sqlquery:
    ...
    queries:
      - sql: >-
          SELECT p.id, p.name, p.price, sum(s.units) AS sold
          FROM products p LEFT JOIN sales s ON (p.id = s.product_id)
          GROUP BY p.id, p.name, p.price
        metrics:
          # Gauge metric example
          # The name assigned to the OpenTelemetry metric
          - metric_name: postgresql.products.price
            value_column: price
            data_type: gauge
            value_type: double
            description: Price of the product
            unit: rub
            static_attributes:
              database: postgres
            attribute_columns: ["name"]
          # Sum metric example
          - metric_name: postgresql.products.sold
            value_column: sold
            data_type: sum
            value_type: int
            description: Total count of sold products
            # Whether a cumulative sum value is monotonically increasing (i.e. never rolls over or resets)
            # Default = false
            monotonic: true
            aggregation: cumulative
            static_attributes:
              database: postgres
            attribute_columns: ["name"]
This example assumes the following database schema:
CREATE TABLE products (id INTEGER, name TEXT, price NUMERIC);
INSERT INTO products VALUES (1, 'Cheese', 9.99), (2, 'Bread', 1.99), (3, 'Milk', 2.99);
CREATE TABLE sales (id BIGINT, product_id BIGINT, units INT, sold_at TIMESTAMP DEFAULT now());
INSERT INTO sales VALUES (1, 1, 10), (2, 2, 20), (2, 2, 50), (3, 3, 30);
The query creates two metrics: a gauge for current product prices and a cumulative sum for total units sold per product.
Use data_type to specify the metric type. Set it to
gauge for metrics representing a current value, or
sum for metrics representing an aggregated total. When
using sum, set the aggregation
parameter to either cumulative (default) or
delta.
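For instance, a sum metric with delta temporality could be declared as in the sketch below; note that with delta, the value column is expected to hold the per-interval amount rather than a running total (the metric name is illustrative):

```yaml
metrics:
  - metric_name: postgresql.products.sold_delta
    value_column: sold
    data_type: sum
    value_type: int
    aggregation: delta
```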
Avoid queries that produce NULL values. If a query returns NULL in a column referenced in the configuration, errors will be logged, but the receiver will continue operating.
Configure exporters, processors, and the service pipeline:
...
exporters:
  prometheus:
    endpoint: :8889
    send_timestamps: true
  otlphttp/sqlquery/metrics:
    compression: ''
    endpoint: https://metrics.example.org:8080
    headers:
      X-Ppem-Source-Agent-Name: local
      X-Ppem-Source-Instance-Port: '5432'
processors:
  batch/sqlquery:
    send_batch_size: 2048
    timeout: 10s
service:
  # Telemetry for the collector itself
  telemetry:
    logs:
      # Sets the minimum enabled logging level
      # Values: debug, info, warn, error
      # Default = info
      level: info
  pipelines:
    metrics/sqlquery:
      receivers: [ sqlquery ]
      processors: [ batch/sqlquery ]
      exporters: [ prometheus, otlphttp/sqlquery/metrics ]
Start pgpro-otel-collector with the created configuration file:
build/pgpro-otel-collector/pgpro-otel-collector --config configs/sqlquery.yml
For the full list of the sqlquery configuration
parameters, refer to the
OpenTelemetry documentation.
metricstransform Processor #
The metricstransform processor is an open-source
component of the OpenTelemetry Collector for renaming
metrics and managing labels, including scaling and aggregation operations.
For more details, refer to the
OpenTelemetry documentation.
The example setup of the metricstransform processor
is shown in the procedure below.
Create the metrics_transform.yml configuration
file.
Add the configuration sections that match your needs from the examples below:
Rename a metric — for example,
postgresql.databases.size_bytes to
postgresql.db.size_bytes:
processors:
  metricstransform/rename:
    transforms:
      - include: postgresql.databases.size_bytes
        # Default = strict
        match_type: strict
        action: update
        new_name: postgresql.db.size_bytes
The action parameter controls the
transformation type: update modifies existing
metrics, insert creates cloned metrics, and
combine merges multiple metrics into a new one.
Use regular expressions to rename multiple metrics at once:
metricstransform/rename_by_regexp:
  transforms:
    - include: ^postgresql\.databases\.(.*)$$
      match_type: regexp
      action: update
      new_name: postgresql.db.$${1}
Create a duplicate of the metric with a new name:
metricstransform/create_new:
  transforms:
    - include: postgresql.databases.size_bytes
      match_type: strict
      action: insert
      new_name: postgresql.db.size_bytes
Add a new label to a metric:
metricstransform/add_label:
  transforms:
    - include: postgresql.databases.size_bytes
      action: update
      operations:
        - action: add_label
          new_label: new_label
          new_value: "value 1"
Rename an existing label:
metricstransform/rename_label:
  transforms:
    - include: postgresql.databases.size_bytes
      action: update
      operations:
        - action: update_label
          label: database
          new_label: db
Rename multiple labels using a regular expression:
metricstransform/rename_label_multiple:
  transforms:
    - include: ^postgresql\.databases\.(.*)$$
      match_type: regexp
      action: update
      operations:
        - action: update_label
          label: database
          new_label: db
Rename a specific label value:
metricstransform/rename_label_value:
  transforms:
    - include: postgresql.databases.size_bytes
      action: update
      operations:
        - action: update_label
          label: database
          value_actions:
            - value: postgres
              new_value: default
Delete data points that have a certain label value:
metricstransform/delete_by_label_value:
  transforms:
    - include: postgresql.databases.size_bytes
      action: update
      operations:
        - action: delete_label_value
          label: database
          # Specifies the label value whose data points will be removed
          label_value: db11
Convert data type from double to int and
vice versa:
metricstransform/convert_data_type:
  transforms:
    - include: postgresql.databases.size_bytes
      action: update
      operations:
        - action: toggle_scalar_data_type
Aggregate data points across the labels that are not listed in
label_set (only the listed labels are preserved):
metricstransform/aggregate_labels:
  transforms:
    - include: postgresql.activity.connections
      action: update
      operations:
        - action: aggregate_labels
          # Contains a list of labels that will remain after aggregation
          label_set:
            #- database
            - user
            #- backend_process_state
          # Defines how combined data points will be aggregated
          # Possible values: sum, mean, min, max, count, median
          aggregation_type: sum
Note that only the sum aggregation function is supported for histogram and exponential histogram data types.
Aggregate data points that have a specific label value:
metricstransform/aggregate_label_values:
  transforms:
    - include: postgresql.activity.connections
      action: update
      operations:
        - action: aggregate_label_values
          label: database
          # Contains a list of label values that will be aggregated
          aggregated_values: [ db11, db12, db13 ]
          new_value: all_db1
          aggregation_type: sum
Note that only the sum aggregation function is supported for histogram and exponential histogram data types.
Combine several related metrics into one. For example, combine multiple
tuples_* metrics into
tuples_total with a type label:
metricstransform/combine_metrics:
  transforms:
    - include: ^postgresql\.databases\.tuples_(?P<type>.*)$$
      match_type: regexp
      action: combine
      new_name: postgresql.databases.tuples_total
      submatch_case: lower
The submatch_case parameter controls
the case of label values extracted from regular expression
submatches during combine operations. Leave empty to preserve the
original case. Possible values: lower or
upper.
Group metrics from a single resource and report them as multiple resource metrics:
metricstransform/group_metrics:
  transforms:
    - include: ^postgresql.databases.size_bytes$$
      match_type: regexp
      action: group
      group_resource_labels: {"resource.type": "default", "source": "default"}
    - include: ^postgresql.databases.orphaned_files_size_bytes$$
      match_type: regexp
      action: group
      group_resource_labels: {"resource.type": "orphaned", "source": "orphaned"}
Create a new metric filtered by a label value (experimental feature):
metricstransform/experimental_create_by_label:
  transforms:
    - include: postgresql.databases.size_bytes
      match_type: regexp
      experimental_match_labels: {"database": "db1.*"}
      action: insert
      new_name: postgresql.db1.size
If experimental_match_labels is specified,
transformations apply only to metrics that match the specified
label values. This works with both
strict and regexp
match_type.
Scale metric values from bytes to bits (experimental feature):
metricstransform/experimental_scale_value:
  transforms:
    - include: postgresql.databases.size_bytes
      match_type: strict
      action: insert
      new_name: postgresql.databases.size_bits
      operations:
        - action: experimental_scale_value
          experimental_scale: 8
The experimental_scale parameter defines
the scalar to apply to metric values. Note that scaling
exponential histograms inherently involves some loss of accuracy.
Configure exporters and the service pipeline in accordance with the chosen sections:
exporters:
  prometheus:
    endpoint: :8889
    send_timestamps: true
    # translation_strategy: UnderscoreEscapingWithoutSuffixes
    # If true, all the resource attributes will be converted to metric labels by default
    # resource_to_telemetry_conversion:
    #   enabled: true
service:
  pipelines:
    metrics:
      receivers:
        - postgrespro
      processors:
        - metricstransform/rename
        # - metricstransform/rename_by_regexp
        # - metricstransform/create_new
        # - metricstransform/add_label
        # - metricstransform/rename_label
        # - metricstransform/rename_label_multiple
        # - metricstransform/rename_label_value
        # - metricstransform/delete_by_label_value
        # - metricstransform/convert_data_type
        # - metricstransform/aggregate_labels
        # - metricstransform/aggregate_label_values
        # - metricstransform/combine_metrics
        # - metricstransform/group_metrics
        # - metricstransform/experimental_create_by_label
        # - metricstransform/experimental_scale_value
      exporters:
        - prometheus
Start pgpro-otel-collector with both the
main configuration file and metrics_transform.yml:
build/pgpro-otel-collector/pgpro-otel-collector --config configs/basic.yml --config configs/metrics_transform.yml
For a complete list of configuration parameters, refer to the OpenTelemetry documentation.
prometheus Exporter #
The prometheus exporter is an open-source
component of the OpenTelemetry Collector. For detailed information, refer to
the OpenTelemetry documentation.
The prometheus exporter is the easiest one to use: it does not
require configuring an external component and can be enabled by
default. To set it up, it is sufficient to specify the address
on which to listen for incoming requests:
exporters:
  prometheus:
    endpoint: "1.2.3.4:8889"
    send_timestamps: true
otlphttp Exporter #
The otlphttp exporter is an open-source component of
the OpenTelemetry Collector and is used for exporting collected
telemetry to an OTLP-compatible storage or monitoring system,
which must be deployed in advance and accessible. For more
details, refer to
the OpenTelemetry documentation.
To configure the otlphttp exporter, it is
sufficient to specify the address of the target system where
data should be sent:
exporters:
  otlphttp:
    endpoint: https://otlp.example.org
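Further upstream options are available if needed; the fragment below sketches a few commonly used ones (the token value is a placeholder, and certificate verification should only be disabled in test environments):

```yaml
exporters:
  otlphttp:
    endpoint: https://otlp.example.org
    # gzip is the upstream default compression
    compression: gzip
    # Extra HTTP headers sent with every request
    headers:
      Authorization: "Bearer <token>"
    tls:
      # Disable server certificate verification (test environments only)
      insecure_skip_verify: true
```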
kafka Exporter #
The kafka exporter is an open-source component of
the OpenTelemetry Collector for sending metrics and logs to
Apache Kafka. For more details, refer to
the OpenTelemetry documentation.
Below is an example of setting it up to send metrics.
receivers:
  postgrespro:
    max_threads: 3
    collection_interval: 60s
    initial_delay: 1s
    transport: tcp
    endpoint: localhost:5432
    database: postgres
    username: postgres
    password: ${env:POSTGRESQL_PASSWORD}
    metrics: null
    plugins:
      activity:
        enabled: true
      bgwriter:
        enabled: true
      locks:
        enabled: true
      version:
        enabled: true
      wal:
        enabled: true
      cache:
        enabled: true
exporters:
  kafka:
    brokers:
      - localhost:9092
    protocol_version: 2.1.0
    client_id: pgpro-otel-collector
    metrics:
      topic: otlp_metrics
      encoding: otlp_json # proto supported
    include_metadata_keys:
      - service.name
      - service.instance.id
    tls:
      insecure: true
    timeout: 30s
    producer:
      max_message_bytes: 1000000
      required_acks: 1
      compression: none # gzip, snappy, lz4, and zstd
processors:
  batch/kafka:
    send_batch_size: 1024
    timeout: 1s
  resource:
    attributes:
      - key: service.name
        action: upsert
        value: postgresql
      - key: service.instance.id
        action: upsert
        value: address-of-postgres-instance:5432
service:
  pipelines:
    metrics/kafka:
      receivers: [ postgrespro ]
      processors: [ batch/kafka, resource ]
      exporters: [ kafka ]
Once receivers and exporters are added and configured, they need
to be combined into a pipeline. The pipeline is configured in
the service section. The pipeline contents
depend entirely on the previously added components (there is
no default configuration).
The example below shows how to set up a pipeline for metric
management. The data is collected by the
postgrespro and
hostmetrics receivers, processed by the
batch processor and exported by the
prometheus and otlphttp
exporters.
All the components used in the pipeline must also be added to the configuration file and configured.
service:
  extensions: []
  pipelines:
    metrics:
      receivers:
        - postgrespro
        - hostmetrics
      processors:
        - batch
      exporters:
        - prometheus
        - otlphttp
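Putting it together, the configuration file for such a pipeline has the following skeleton (component settings are abbreviated with `...`):

```yaml
receivers:
  postgrespro:
    ...
  hostmetrics:
    ...
processors:
  batch:
    ...
exporters:
  prometheus:
    ...
  otlphttp:
    ...
service:
  pipelines:
    metrics:
      receivers: [ postgrespro, hostmetrics ]
      processors: [ batch ]
      exporters: [ prometheus, otlphttp ]
```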
pgpro-otel-collector can be configured to collect metrics from BiHA clusters using the biha plugin. The example procedure for such setup is as follows:
Create a configuration file called biha.yml with
the following content:
receivers:
  postgrespro/biha:
    transport: tcp
    endpoint: &endpoint localhost:5432
    database: biha_db
    username: biha_replication_user
    password: ${env:POSTGRESQL_PASSWORD}
    collection_interval: 60s
    initial_delay: 1s
    max_threads: 3
    plugins:
      biha:
        enabled: true
        cluster_name: cluster_name
exporters:
  prometheus/biha:
    endpoint: :8889
    send_timestamps: true
    # Defines how long metrics remain exposed without updates
    # Since some BiHA metrics include 'biha_state' in their labels,
    # it is best to set this parameter equal to the collection_interval,
    # so that outdated states do not appear on the Prometheus page.
    metric_expiration: 60s
processors:
  batch/metrics:
    send_batch_size: 8192
    timeout: 10s
  memory_limiter/metrics:
    check_interval: 1s
    limit_mib: 2048
service:
  pipelines:
    metrics/biha:
      receivers:
        - postgrespro/biha
      processors:
        - memory_limiter/metrics
        - batch/metrics
      exporters:
        - prometheus/biha
Start pgpro-otel-collector with the created configuration file:
build/pgpro-otel-collector/pgpro-otel-collector --config configs/biha.yml
To check which metrics are included in the biha plugin, refer to Section 8.18.
Plugins with special access privileges must be configured separately.
If you need to include plugins that require special access privileges,
add all of them in a separate postgrespro section
with their own usernames. For example:
receivers:
  postgrespro/biha:
    username: biha_replication_user
    ...
    plugins:
      biha:
        enabled: true
        ...
  postgrespro/otel:
    username: otel
    ...
    plugins:
      databases:
        enabled: true
For more details on the access privileges, refer to Section 3.1.
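Each such receiver must also be listed in a pipeline. For example, both receivers can feed a single metrics pipeline; the sketch below reuses the processor and exporter names from the biha.yml example, so they are illustrative:

```yaml
service:
  pipelines:
    metrics:
      receivers: [ postgrespro/biha, postgrespro/otel ]
      processors: [ batch/metrics ]
      exporters: [ prometheus/biha ]
```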
By default, the postgres database is not copied to a
node in the referee or referee_with_wal
mode (unless the --referee-with-postgres-db option is
used).
When collecting database-level, table-level, index-level, or
function-level statistics from these nodes, you must explicitly exclude
the postgres database using the acl.databases.deny
section, as shown below. This exclusion is not required when collecting
per-instance statistics. For more information about allowlists and
denylists, refer to
Section 6.6.5.
acl:
  databases:
    deny:
      - name: postgres
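A sketch of this exclusion in context, assuming the acl section belongs under the postgrespro receiver configuration (the placement is an assumption; refer to Section 6.6.5 for the authoritative description):

```yaml
receivers:
  postgrespro:
    ...
    acl:
      databases:
        deny:
          - name: postgres
```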