2.3. Data Rebalancing

2.3.1. Automatic Data Rebalancing
2.3.2. Manual Data Rebalancing

2.3.1. Automatic Data Rebalancing

Automatic rebalancing is the default mode. The rebalancing process starts automatically after nodes are added (by default, unless the --no-rebalance option is specified) or before a node is removed. Rebalancing can also be started manually. The goal of the rebalancing process is to distribute the partitions of each sharded table evenly across replication groups.

For each sharded table, the rebalancing process iteratively finds the replication groups with the maximum and minimum numbers of partitions and creates a task to move one partition to the replication group with the minimum number of partitions. This is repeated while the condition max - min > 1 holds. Partitions are moved using logical replication. Partitions of colocated tables are moved together with the partitions of the sharded tables they reference.
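The greedy loop described above can be sketched as follows. This is an illustrative model only, not Shardman's actual implementation:

```python
# Illustrative sketch of the greedy rebalancing loop: repeatedly move one
# partition from the fullest replication group to the emptiest one until
# partition counts differ by at most 1.
def plan_moves(parts_per_rg):
    """parts_per_rg: dict mapping replication group id -> partition count."""
    moves = []
    while True:
        src = max(parts_per_rg, key=parts_per_rg.get)
        dst = min(parts_per_rg, key=parts_per_rg.get)
        if parts_per_rg[src] - parts_per_rg[dst] <= 1:
            break  # the max - min > 1 condition no longer holds
        parts_per_rg[src] -= 1
        parts_per_rg[dst] += 1
        moves.append((src, dst))  # one "move partition" task
    return moves
```

For example, starting from partition counts {1: 10, 2: 2, 3: 3}, the loop produces five move tasks and ends with five partitions in each replication group.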

Keep in mind that max_logical_replication_workers must be set large enough, since the rebalancing process uses up to max(max_replication_slots, max_logical_replication_workers, max_worker_processes, max_wal_senders)/3 parallel threads. In practice you can use max_logical_replication_workers = Repfactor + 3 * task_num, where task_num is the number of parallel rebalancing tasks.
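As a quick sanity check of this sizing rule, here is the arithmetic with hypothetical settings (a replication factor of 2 and 3 parallel tasks; the function names are illustrative, not part of any API):

```python
# Upper bound on parallel threads used by rebalancing, per the formula above.
def rebalance_threads(max_replication_slots, max_logical_replication_workers,
                      max_worker_processes, max_wal_senders):
    return max(max_replication_slots, max_logical_replication_workers,
               max_worker_processes, max_wal_senders) // 3

# Practical sizing rule from the text: Repfactor + 3 * task_num.
def recommended_logical_workers(repfactor, task_num):
    return repfactor + 3 * task_num

print(recommended_logical_workers(2, 3))   # 11
print(rebalance_threads(10, 11, 12, 10))   # 4
```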

To rebalance sharded tables in the cluster0 cluster manually, run the following command (where etcd1, etcd2, and etcd3 are the nodes of the etcd cluster):

                    $ shardmanctl --store-endpoints http://etcd1:2379,http://etcd2:2379,http://etcd3:2379 rebalance
                

If the process fails, run the shardmanctl cleanup command with the --after-rebalance option.

2.3.2. Manual Data Rebalancing

Sometimes you need to place the partitions of sharded tables on cluster nodes in a specific way. For this purpose, Shardman supports a manual data rebalancing mode.

How it works:

  1. Get the list of sharded tables using the shardmanctl tables sharded list command. The output will look something like this:

    $ shardmanctl tables sharded list
    
    Sharded tables:
    
    public.doc
    public.resolution
    public.users
    
                                
  2. Request information about the selected sharded tables. Example:

    $ shardmanctl tables sharded info -t public.users
    
    Table public.users
    
    Partitions:
                                        
    Partition    RgID     Shard                            Master
    0            1        clover-1-shrn1                   shrn1:5432
    1            2        clover-2-shrn2                   shrn2:5432
    2            3        clover-3-shrn3                   shrn3:5432
    3            1        clover-1-shrn1                   shrn1:5432
    4            2        clover-2-shrn2                   shrn2:5432
    5            3        clover-3-shrn3                   shrn3:5432
    6            1        clover-1-shrn1                   shrn1:5432
    7            2        clover-2-shrn2                   shrn2:5432
    8            3        clover-3-shrn3                   shrn3:5432
    9            1        clover-1-shrn1                   shrn1:5432
    10           2        clover-2-shrn2                   shrn2:5432
    11           3        clover-3-shrn3                   shrn3:5432
    12           1        clover-1-shrn1                   shrn1:5432
    13           2        clover-2-shrn2                   shrn2:5432
    14           3        clover-3-shrn3                   shrn3:5432
    15           1        clover-1-shrn1                   shrn1:5432
    16           2        clover-2-shrn2                   shrn2:5432
    17           3        clover-3-shrn3                   shrn3:5432
    18           1        clover-1-shrn1                   shrn1:5432
    19           2        clover-2-shrn2                   shrn2:5432
    20           3        clover-3-shrn3                   shrn3:5432
    21           1        clover-1-shrn1                   shrn1:5432
    22           2        clover-2-shrn2                   shrn2:5432
    23           3        clover-3-shrn3                   shrn3:5432
    
                                
  3. Move a partition to a new shard, as shown below:

    $ shardmanctl --log-level debug tables sharded partmove -t public.users --partnum 1 --shard clover-1-shrn1
    
    2023-07-26T06:00:36.900Z        DEBUG   cmd/common.go:105       Waiting for metadata lock...
    2023-07-26T06:00:36.936Z        DEBUG   rebalance/service.go:256        take extension lock
    2023-07-26T06:00:36.938Z        DEBUG   broadcaster/worker.go:33        start broadcaster worker for repgroup id=3
    2023-07-26T06:00:36.938Z        DEBUG   broadcaster/worker.go:33        start broadcaster worker for repgroup id=2
    2023-07-26T06:00:36.938Z        DEBUG   broadcaster/worker.go:33        start broadcaster worker for repgroup id=1
    2023-07-26T06:00:36.951Z        DEBUG   broadcaster/worker.go:51        repgroup 3 connect established
    2023-07-26T06:00:36.951Z        DEBUG   broadcaster/worker.go:51        repgroup 2 connect established
    2023-07-26T06:00:36.952Z        DEBUG   broadcaster/worker.go:51        repgroup 1 connect established
    2023-07-26T06:00:36.952Z        DEBUG   extension/lock.go:35    Waiting for extension lock...
    2023-07-26T06:00:36.976Z        INFO    rebalance/service.go:276        Performing move partition...
    2023-07-26T06:00:36.977Z        DEBUG   broadcaster/worker.go:33        start broadcaster worker for repgroup id=3
    2023-07-26T06:00:36.978Z        DEBUG   broadcaster/worker.go:33        start broadcaster worker for repgroup id=2
    2023-07-26T06:00:36.978Z        DEBUG   broadcaster/worker.go:33        start broadcaster worker for repgroup id=1
    2023-07-26T06:00:36.987Z        DEBUG   broadcaster/worker.go:51        repgroup 1 connect established
    2023-07-26T06:00:36.989Z        DEBUG   broadcaster/worker.go:51        repgroup 2 connect established
    2023-07-26T06:00:36.992Z        DEBUG   broadcaster/worker.go:51        repgroup 3 connect established
    2023-07-26T06:00:36.992Z        DEBUG   rebalance/service.go:71 Performing cleanup after possible rebalance operation failure
    2023-07-26T06:00:37.077Z        DEBUG   broadcaster/worker.go:75        finish broadcaster worker for repgroup id=3
    2023-07-26T06:00:37.077Z        DEBUG   broadcaster/worker.go:75        finish broadcaster worker for repgroup id=1
    2023-07-26T06:00:37.077Z        DEBUG   broadcaster/worker.go:75        finish broadcaster worker for repgroup id=2
    2023-07-26T06:00:37.082Z        DEBUG   rebalance/service.go:422        Rebalance will run 1 tasks
    2023-07-26T06:00:37.095Z        DEBUG   rebalance/service.go:452        Guessing that rebalance() can use 3 workers
    2023-07-26T06:00:37.096Z        DEBUG   rebalance/job.go:352    state: Idle     {"worker_id": 1, "table": "users", "partition num": 1, "source rgid": 2, "dest rgid": 1, "kind": "move"}
    2023-07-26T06:00:37.111Z        DEBUG   rebalance/job.go:352    state: ConnsEstablished {"worker_id": 1, "table": "users", "partition num": 1, "source rgid": 2, "dest rgid": 1, "kind": "move"}
    2023-07-26T06:00:37.171Z        DEBUG   rebalance/job.go:352    state: WaitInitCopy     {"worker_id": 1, "table": "users", "partition num": 1, "source rgid": 2, "dest rgid": 1, "kind": "move"}
    2023-07-26T06:00:38.073Z        DEBUG   rebalance/job.go:347    current state   {"worker_id": 1, "table": "users", "partition num": 1, "source rgid": 2, "dest rgid": 1, "kind": "move", "state": "WaitInitialCatchup"}
    2023-07-26T06:00:38.073Z        DEBUG   rebalance/job.go:352    state: WaitInitialCatchup       {"worker_id": 1, "table": "users", "partition num": 1, "source rgid": 2, "dest rgid": 1, "kind": "move"}
    2023-07-26T06:00:38.084Z        DEBUG   rebalance/job.go:347    current state   {"worker_id": 1, "table": "users", "partition num": 1, "source rgid": 2, "dest rgid": 1, "kind": "move", "state": "WaitFullSync"}
    2023-07-26T06:00:38.084Z        DEBUG   rebalance/job.go:352    state: WaitFullSync     {"worker_id": 1, "table": "users", "partition num": 1, "source rgid": 2, "dest rgid": 1, "kind": "move"}
    2023-07-26T06:00:38.108Z        DEBUG   rebalance/job.go:347    current state   {"worker_id": 1, "table": "users", "partition num": 1, "source rgid": 2, "dest rgid": 1, "kind": "move", "state": "Committing"}
    2023-07-26T06:00:38.108Z        DEBUG   rebalance/job.go:352    state: Committing       {"worker_id": 1, "table": "users", "partition num": 1, "source rgid": 2, "dest rgid": 1, "kind": "move"}
    2023-07-26T06:00:38.254Z        DEBUG   rebalance/job.go:352    state: Complete {"worker_id": 1, "table": "users", "partition num": 1, "source rgid": 2, "dest rgid": 1, "kind": "move"}
    2023-07-26T06:00:38.258Z        DEBUG   rebalance/service.go:583        Produce and process tasks on destination replication groups...
    2023-07-26T06:00:38.258Z        DEBUG   rebalance/service.go:594        Produce and process tasks on source replication groups...
    2023-07-26T06:00:38.258Z        DEBUG   rebalance/service.go:606        wait all tasks finish
    2023-07-26T06:00:38.258Z        DEBUG   rebalance/service.go:531        Analyzing table public.users in rg 1    {"table": "public.users", "rgid": 1, "action": "analyze"}
    2023-07-26T06:00:38.573Z        DEBUG   rebalance/service.go:531        Analyzing table public.users in rg 2    {"table": "public.users", "rgid": 2, "action": "analyze"}
    2023-07-26T06:00:38.833Z        DEBUG   broadcaster/worker.go:75        finish broadcaster worker for repgroup id=1
    2023-07-26T06:00:38.833Z        DEBUG   broadcaster/worker.go:75        finish broadcaster worker for repgroup id=2
    2023-07-26T06:00:38.833Z        DEBUG   broadcaster/worker.go:75        finish broadcaster worker for repgroup id=3
    
                                

    In this example, partition number 1 of the public.users table is moved to the clover-1-shrn1 shard.

    After a partition of a sharded table is moved manually, automatic data rebalancing is disabled for that table and for all tables colocated with it.
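When planning manual moves, it helps to know how many partitions each replication group currently holds. The tally can be scripted from the `tables sharded info` listing shown in step 2. A minimal sketch, assuming the four-column layout from the example output (the helper name is hypothetical):

```python
from collections import Counter

# Count partitions per replication group in "tables sharded info" output.
# Data rows are assumed to look like: <partnum> <rgid> <shard> <master>
def partitions_per_rg(info_text):
    counts = Counter()
    for line in info_text.splitlines():
        fields = line.split()
        if len(fields) == 4 and fields[0].isdigit() and fields[1].isdigit():
            counts[int(fields[1])] += 1  # second column is RgID
    return counts
```

In the example above, where 24 partitions alternate across three replication groups, this would report 8 partitions per group.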

To get the list of tables for which automatic rebalancing is disabled, run the shardmanctl tables sharded norebalance command. Example:

$ shardmanctl tables sharded norebalance

public.users

                

To re-enable automatic data rebalancing for a selected sharded table, run the shardmanctl tables sharded rebalance command, as shown in the example below:

$ shardmanctl tables sharded rebalance -t public.users

2023-07-26T07:07:00.657Z        DEBUG   cmd/common.go:105       Waiting for metadata lock...
2023-07-26T07:07:00.687Z        DEBUG   broadcaster/worker.go:33        start broadcaster worker for repgroup id=1
2023-07-26T07:07:00.687Z        DEBUG   broadcaster/worker.go:33        start broadcaster worker for repgroup id=2
2023-07-26T07:07:00.687Z        DEBUG   broadcaster/worker.go:33        start broadcaster worker for repgroup id=3
2023-07-26T07:07:00.697Z        DEBUG   broadcaster/worker.go:51        repgroup 1 connect established
2023-07-26T07:07:00.698Z        DEBUG   broadcaster/worker.go:51        repgroup 2 connect established
2023-07-26T07:07:00.698Z        DEBUG   broadcaster/worker.go:51        repgroup 3 connect established
2023-07-26T07:07:00.698Z        DEBUG   extension/lock.go:35    Waiting for extension lock...
2023-07-26T07:07:00.719Z        DEBUG   rebalance/service.go:381        Planned moving pnum 21 for table users from rg 1 to rg 2
2023-07-26T07:07:00.719Z        INFO    rebalance/service.go:244        Performing rebalance...
2023-07-26T07:07:00.720Z        DEBUG   broadcaster/worker.go:33        start broadcaster worker for repgroup id=1
2023-07-26T07:07:00.720Z        DEBUG   broadcaster/worker.go:33        start broadcaster worker for repgroup id=2
2023-07-26T07:07:00.720Z        DEBUG   broadcaster/worker.go:33        start broadcaster worker for repgroup id=3
2023-07-26T07:07:00.732Z        DEBUG   broadcaster/worker.go:51        repgroup 3 connect established
2023-07-26T07:07:00.732Z        DEBUG   broadcaster/worker.go:51        repgroup 1 connect established
2023-07-26T07:07:00.734Z        DEBUG   broadcaster/worker.go:51        repgroup 2 connect established
2023-07-26T07:07:00.734Z        DEBUG   rebalance/service.go:71 Performing cleanup after possible rebalance operation failure
2023-07-26T07:07:00.791Z        DEBUG   broadcaster/worker.go:75        finish broadcaster worker for repgroup id=1
2023-07-26T07:07:00.791Z        DEBUG   broadcaster/worker.go:75        finish broadcaster worker for repgroup id=2
2023-07-26T07:07:00.791Z        DEBUG   broadcaster/worker.go:75        finish broadcaster worker for repgroup id=3
2023-07-26T07:07:00.795Z        DEBUG   rebalance/service.go:422        Rebalance will run 1 tasks
2023-07-26T07:07:00.809Z        DEBUG   rebalance/service.go:452        Guessing that rebalance() can use 3 workers
2023-07-26T07:07:00.809Z        DEBUG   rebalance/job.go:352    state: Idle     {"worker_id": 1, "table": "users", "partition num": 21, "source rgid": 1, "dest rgid": 2, "kind": "move"}
2023-07-26T07:07:00.823Z        DEBUG   rebalance/job.go:352    state: ConnsEstablished {"worker_id": 1, "table": "users", "partition num": 21, "source rgid": 1, "dest rgid": 2, "kind": "move"}
2023-07-26T07:07:00.880Z        DEBUG   rebalance/job.go:352    state: WaitInitCopy     {"worker_id": 1, "table": "users", "partition num": 21, "source rgid": 1, "dest rgid": 2, "kind": "move"}
2023-07-26T07:07:01.886Z        DEBUG   rebalance/job.go:347    current state   {"worker_id": 1, "table": "users", "partition num": 21, "source rgid": 1, "dest rgid": 2, "kind": "move", "state": "WaitInitialCatchup"}
2023-07-26T07:07:01.886Z        DEBUG   rebalance/job.go:352    state: WaitInitialCatchup       {"worker_id": 1, "table": "users", "partition num": 21, "source rgid": 1, "dest rgid": 2, "kind": "move"}
2023-07-26T07:07:01.904Z        DEBUG   rebalance/job.go:347    current state   {"worker_id": 1, "table": "users", "partition num": 21, "source rgid": 1, "dest rgid": 2, "kind": "move", "state": "WaitFullSync"}
2023-07-26T07:07:01.905Z        DEBUG   rebalance/job.go:352    state: WaitFullSync     {"worker_id": 1, "table": "users", "partition num": 21, "source rgid": 1, "dest rgid": 2, "kind": "move"}
2023-07-26T07:07:01.932Z        DEBUG   rebalance/job.go:347    current state   {"worker_id": 1, "table": "users", "partition num": 21, "source rgid": 1, "dest rgid": 2, "kind": "move", "state": "Committing"}
2023-07-26T07:07:01.932Z        DEBUG   rebalance/job.go:352    state: Committing       {"worker_id": 1, "table": "users", "partition num": 21, "source rgid": 1, "dest rgid": 2, "kind": "move"}
2023-07-26T07:07:02.057Z        DEBUG   rebalance/job.go:352    state: Complete {"worker_id": 1, "table": "users", "partition num": 21, "source rgid": 1, "dest rgid": 2, "kind": "move"}
2023-07-26T07:07:02.060Z        DEBUG   rebalance/service.go:583        Produce and process tasks on destination replication groups...
2023-07-26T07:07:02.060Z        DEBUG   rebalance/service.go:594        Produce and process tasks on source replication groups...
2023-07-26T07:07:02.060Z        DEBUG   rebalance/service.go:531        Analyzing table public.users in rg 2    {"table": "public.users", "rgid": 2, "action": "analyze"}
2023-07-26T07:07:02.060Z        DEBUG   rebalance/service.go:606        wait all tasks finish
2023-07-26T07:07:02.321Z        DEBUG   rebalance/service.go:531        Analyzing table public.users in rg 1    {"table": "public.users", "rgid": 1, "action": "analyze"}
2023-07-26T07:07:02.587Z        DEBUG   broadcaster/worker.go:75        finish broadcaster worker for repgroup id=3
2023-07-26T07:07:02.587Z        DEBUG   broadcaster/worker.go:75        finish broadcaster worker for repgroup id=2
2023-07-26T07:07:02.587Z        DEBUG   broadcaster/worker.go:75        finish broadcaster worker for repgroup id=1

                

To enable automatic data rebalancing for all sharded tables, run the shardmanctl rebalance command with the --force option.

$ shardmanctl rebalance --force