외부스크립트를 이용해서 HAProxy node 체크하기

HAProxy + MariaDB Galera Cluster 구성을 하고 테스트를 진행하니 문제가 발생해서 글을 남겨본다.

HAProxy

haproxy.cfg 설정파일에서 아래 옵션을 사용

option mysql-check user haproxy

이 옵션은 TCP 포트로의 접속가능 여부만 판단해서 health check를 한다.
Galera Cluster의 경우 노드가 복구될 때 쿼리를 하면 아래 메시지가 출력된다.

WSREP has not yet prepared node for application use

따라서 이 경우는 HAProxy에서 노드를 살려줘서는 안된다.

MariaDB Galera Cluster

MariaDB [(none)] > show global status where variable_name='wsrep_local_state';
+-------------------+-------+
| Variable_name     | Value |
+-------------------+-------+
| wsrep_local_state | 4     |
+-------------------+-------+

동기화가 완료되서 사용이 가능한 상태값이다.
다른 상태값은 아래 링크에서 확인한다.
https://mariadb.com/docs/ent/ref/mdb/status-variables/wsrep_local_state/

HAProxy external-check

기존에 mysql-check를 외부 스크립트를 이용하도록 변경한다.
외부 스크립트는 wsrep_local_state 값을 읽어서 원하는 값이면 exit 0 을 리턴하고 다른값이면 exit 255를 리턴시켜주면 된다.

/etc/haproxy/haproxy.cfg

global
        log /dev/log    local0
        log /dev/log    local1 notice	
        stats socket /run/haproxy/admin.sock mode 660 level admin expose-fd listeners
        stats timeout 30s
        user haproxy
        group haproxy
        daemon
        external-check
        insecure-setuid-wanted

frontend mysql-front
        bind *:3306
        mode tcp
        default_backend mysql-back

backend mysql-back
        mode tcp        
	#option mysql-check user haproxy
	option external-check
        external-check path "/usr/bin:/bin"
        external-check command /etc/haproxy/script/check_node.sh
	balance roundrobin
        server db1 172.16.0.20:3306 check
        server db2 172.16.0.21:3306 check backup weight 2
        server db3 172.16.0.22:3306 check backup weight 1
  1. global 항목에 external-check 값이 추가되었다.
  2. external-check를 사용하기위해서 insecure-setuid-wanted 값을 추가해야한다
    https://www.haproxy.com/blog/announcing-haproxy-2-2/#security-hardening
  3. backend 항목에 기존 mysql-check 항목을 external-check로 변경
  4. external-check path 추가
  5. external-check command 추가

/etc/haproxy/script/check_node.sh

#!/bin/bash

MYSQL_HOST="$3"
MYSQL_PORT="$4"
MYSQL_USERNAME="haproxy"
MYSQL_PASSWORD=""
MYSQL_BIN="/bin/mysql"
MYSQL_OPTS="--no-defaults --connect-timeout=10 -snNE"
CMDLINE="$MYSQL_BIN $MYSQL_OPTS --host=$MYSQL_HOST --port=$MYSQL_PORT --user=$MYSQL_USERNAME -e"
WSREP_LOCAL_STATE_CHK=`$CMDLINE "SHOW GLOBAL STATUS WHERE VARIABLE_NAME='wsrep_local_state'" | tail -1`

if [ "$WSREP_LOCAL_STATE_CHK" != "4" ]; then
    exit 255
fi

exit 0

파라메터 정보

$1 = Virtual Service IP (VIP)
$2 = Virtual Service Port (VPT)
$3 = Real Server IP (RIP)
$4 = Real Server Port (RPT) 
$5 = Check Source IP 

$3, $4를 이용해서 노드의 상태를 읽는다.

HAProxy + Galera Cluster (MariaDB) 2/2

HAProxy

우분투 20.04 에서 진행한다. 패키지관리자로 설치가 가능하다. SSL을 지원하는 2.x 버전이 설치된다.

apt install -y haproxy
root@haproxy1:~# haproxy --version
HA-Proxy version 2.0.29-0ubuntu1 2022/08/26 - https://haproxy.org/
Usage : haproxy [-f <cfgfile|cfgdir>]* [ -vdVD ] [ -n <maxconn> ] [ -N <maxpconn> ]
        [ -p <pidfile> ] [ -m <max megs> ] [ -C <dir> ] [-- <cfgfile>*]
        -v displays version ; -vv shows known build options.
        -d enters debug mode ; -db only disables background mode.
        -dM[<byte>] poisons memory with <byte> (defaults to 0x50)
        -V enters verbose mode (disables quiet mode)
        -D goes daemon ; -C changes to <dir> before loading files.
        -W master-worker mode.
        -Ws master-worker mode with systemd notify support.
        -q quiet mode : don't display messages
        -c check mode : only check config files and exit
        -n sets the maximum total # of connections (uses ulimit -n)
        -m limits the usable amount of memory (in MB)
        -N sets the default, per-proxy maximum # of connections (0)
        -L set local peer name (default to hostname)
        -p writes pids of all children to this file
        -de disables epoll() usage even when available
        -dp disables poll() usage even when available
        -dS disables splice usage (broken on old kernels)
        -dG disables getaddrinfo() usage
        -dR disables SO_REUSEPORT usage
        -dr ignores server address resolution failures
        -dV disables SSL verify on servers side
        -sf/-st [pid ]* finishes/terminates old pids.
        -x <unix_socket> get listening sockets from a unix socket
        -S <bind>[,<bind options>...] new master CLI

haproxy 설정

파일을 생성하거나 수정해야 한다. 각 HAProxy 노드별로 설정파일이 같다.

/etc/haproxy/haproxy.cfg

global
        log /dev/log    local0
        log /dev/log    local1 notice
        chroot /var/lib/haproxy
        stats socket /run/haproxy/admin.sock mode 660 level admin expose-fd listeners
        stats timeout 30s
        user haproxy
        group haproxy
        daemon

        # Default SSL material locations
        ca-base /etc/ssl/certs
        crt-base /etc/ssl/private

        # See: https://ssl-config.mozilla.org/#server=haproxy&server-version=2.0.3&config=intermediate
        ssl-default-bind-ciphers ECDHE-ECDSA-AES128-GCM-SHA256:ECDHE-RSA-AES128-GCM-SHA256:ECDHE-ECDSA-AES256-GCM-SHA384:ECDHE-RSA-AES256-GCM-SHA384:ECDHE-ECDSA-CHACHA20-POLY1305:ECDHE-RSA-CHACHA20-POLY1305:DHE-RSA-AES128-GCM-SHA256:DHE-RSA-AES256-GCM-SHA384
        ssl-default-bind-ciphersuites TLS_AES_128_GCM_SHA256:TLS_AES_256_GCM_SHA384:TLS_CHACHA20_POLY1305_SHA256
        ssl-default-bind-options ssl-min-ver TLSv1.2 no-tls-tickets

defaults
        log     global
        mode    http
        option  httplog
        option  dontlognull
        timeout connect 5000
        timeout client  50000
        timeout server  50000
        errorfile 400 /etc/haproxy/errors/400.http
        errorfile 403 /etc/haproxy/errors/403.http
        errorfile 408 /etc/haproxy/errors/408.http
        errorfile 500 /etc/haproxy/errors/500.http
        errorfile 502 /etc/haproxy/errors/502.http
        errorfile 503 /etc/haproxy/errors/503.http
        errorfile 504 /etc/haproxy/errors/504.http

listen stats
        bind *:8080
        stats enable
        stats uri /
        stats realm HAProxy\ Statistics
        stats auth admin:1234
        stats refresh 10s

frontend mysql-front
        bind *:3306
        mode tcp
        default_backend mysql-back

backend mysql-back
        mode tcp
        option mysql-check user haproxy
        balance roundrobin
        server db1 172.16.0.20:3306 check
        server db2 172.16.0.21:3306 check backup weight 2
        server db3 172.16.0.22:3306 check backup weight 1

이 설정은 db1 노드를 Active 상태로 두고 나머지는 stand-by한다. db1 노드에 장애가 발생하면 다음번 노드가 활성화된다.

mysql-check 옵션

백엔드에 입력된 노드상태 확인을 위해서 mysql 접근이 가능해야한다. 따라서 각 데이터베이스 노드에 haproxy 유저를 추가해준다.

MariaDB [(none)] > create user 'haproxy'@'172.16.0.%';

Keepalived

HAProxy를 이중화하기 위해서 keepalived가 필요하다. VIP를 가지고 있다가 마스터 노드에 장애가 발생하면 슬래이브 노드에 VIP를 바인딩해준다.

apt install -y keepalived

로컬 호스트주소외 다른 가상아이피를 바인딩할 수 있도록 설정을 변경해야 한다.
아래파일에 추가하거나 수정한다.

/etc/sysctl.conf

net.ipv4.ip_nonlocal_bind = 1
root@haproxy1:/etc/keepalived# sysctl -p
net.ipv4.ip_nonlocal_bind = 1

HAProxy 1번노드

/etc/keepalived/keepalived.conf

global_defs {
        router_id HAProxy1
}

vrrp_script check_haproxy {
        script "killall -0 haproxy"
        interval 2
        weight 2
}

vrrp_instance VIS_TEST {
        interface         eth0
        state             MASTER
        priority          101
        virtual_router_id 51
        advert_int        1

        virtual_ipaddress {
                172.16.0.5
        }

        track_script {
                check_haproxy
        }
}

HAProxy 2번노드

/etc/keepalived/keepalived.conf

global_defs {
        router_id HAProxy2
}

vrrp_script check_haproxy {
        script "killall -0 haproxy"
        interval 2
        weight 2
}

vrrp_instance VIS_TEST {
        interface         eth0
        state             MASTER
        priority          100
        virtual_router_id 51
        advert_int        1

        virtual_ipaddress {
                172.16.0.5
        }

        track_script {
                check_haproxy
        }
}

router_id 값은 각 노드마다 다르다.
priority 값은 마스터가 값이 더 커야 한다.
virtual_ipaddress 값은 VIP로 마스터가 이 VIP를 바인딩하다 장애가 발생하면 다른 노드가 이 VIP를 바인딩해서 이중화가 가능하다.

서비스 재시작

systemctl restart keepalived.service

네트워크 확인

1번 노드에 VIP가 바인딩된 것이 보인다. 1번노드에서 haproxy 서비스를 중지시키면 2번 노드에 VIP가 바인딩되는 것을 확인할 수 있다.

root@haproxy1:/etc/haproxy# ip a
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
    link/ether xx:xx:xx:xx:xx:xx brd ff:ff:ff:ff:ff:ff
    inet 172.16.0.10/24 brd 172.16.0.255 scope global eth0
       valid_lft forever preferred_lft forever
    inet 172.16.0.5/32 scope global eth0
       valid_lft forever preferred_lft forever
root@haproxy2:/etc/haproxy# ip a
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
    link/ether xx:xx:xx:xx:xx:xx brd ff:ff:ff:ff:ff:ff
    inet 172.16.0.11/24 brd 172.16.0.255 scope global eth0
       valid_lft forever preferred_lft forever

HAProxy + Galera Cluster (MariaDB) 1/2

Galera Cluster

우분투 20.04 에서 진행한다. 패키지관리자에서 제공하는 mariadb-server를 설치하면 galera3이 같이 설치된다.

$ apt update && apt upgrade -y
$ apt install -y mariadb-server

Galera 클러스터 설정하기

  1. wsrep_cluster_address : 전체 노드의 아이피를 쉼표로 구분해서 모두 입력
  2. wsrep_node_address : 현재 노드의 아이피를 입력
  3. wsrep_node_name : 현재 노드를 구분할 이름을 입력

Database1

/etc/mysql/mariadb.conf.d/60-galera.cnf

[galera]
binlog_format=ROW
default-storage-engine=innodb
innodb_autoinc_lock_mode=2
bind-address=0.0.0.0

# Galera Provider Configuration
wsrep_on=ON
wsrep_provider=/usr/lib/galera/libgalera_smm.so

# Galera Cluster Configuration
wsrep_cluster_name="galera_cluster"
wsrep_cluster_address="gcomm://172.16.0.20,172.16.0.21,172.16.0.22"

# Galera Synchronization Configuration
wsrep_sst_method=rsync

# Galera Node Configuration
wsrep_node_address="172.16.0.20"
wsrep_node_name="node1"

Database2

/etc/mysql/mariadb.conf.d/60-galera.cnf

[galera]
binlog_format=ROW
default-storage-engine=innodb
innodb_autoinc_lock_mode=2
bind-address=0.0.0.0

# Galera Provider Configuration
wsrep_on=ON
wsrep_provider=/usr/lib/galera/libgalera_smm.so

# Galera Cluster Configuration
wsrep_cluster_name="galera_cluster"
wsrep_cluster_address="gcomm://172.16.0.20,172.16.0.21,172.16.0.22"

# Galera Synchronization Configuration
wsrep_sst_method=rsync

# Galera Node Configuration
wsrep_node_address="172.16.0.21"
wsrep_node_name="node2"

Database3

/etc/mysql/mariadb.conf.d/60-galera.cnf

[galera]
binlog_format=ROW
default-storage-engine=innodb
innodb_autoinc_lock_mode=2
bind-address=0.0.0.0

# Galera Provider Configuration
wsrep_on=ON
wsrep_provider=/usr/lib/galera/libgalera_smm.so

# Galera Cluster Configuration
wsrep_cluster_name="galera_cluster"
wsrep_cluster_address="gcomm://172.16.0.20,172.16.0.21,172.16.0.22"

# Galera Synchronization Configuration
wsrep_sst_method=rsync

# Galera Node Configuration
wsrep_node_address="172.16.0.22"
wsrep_node_name="node3"

클러스터 초기화

첫번째 노드에서 실행한다

$ galera_new_cluster

서비스 재시작

각 노드별로 실행해준다

$ systemctl restart mariadb.service

클러스터 상태 확인

$ mysql -uroot -p
MariaDB [(none)] > show status like 'wsreq%';
+-------------------------------+----------------------------------------------------------+
| Variable_name                 | Value                                                    |
+-------------------------------+----------------------------------------------------------+
| wsrep_applier_thread_count    | 1                                                        |
| wsrep_apply_oooe              | 0.000000                                                 |
| wsrep_apply_oool              | 0.000000                                                 |
| wsrep_apply_window            | 1.000000                                                 |
| wsrep_causal_reads            | 0                                                        |
| wsrep_cert_deps_distance      | 95.038207                                                |
| wsrep_cert_index_size         | 106                                                      |
| wsrep_cert_interval           | 0.000000                                                 |
| wsrep_cluster_conf_id         | 22                                                       |
| wsrep_cluster_size            | 3                                                        |
| wsrep_cluster_state_uuid      | 2f9b1216-670b-11ed-a410-02c6026a7d9b                     |
| wsrep_cluster_status          | Primary                                                  |
| wsrep_cluster_weight          | 3                                                        |
| wsrep_commit_oooe             | 0.000000                                                 |
| wsrep_commit_oool             | 0.000000                                                 |
| wsrep_commit_window           | 1.000000                                                 |
| wsrep_connected               | ON                                                       |
| wsrep_desync_count            | 0                                                        |
| wsrep_evs_delayed             |                                                          |
| wsrep_evs_evict_list          |                                                          |
| wsrep_evs_repl_latency        | 0.00183486/0.00210722/0.00237959/0.000272366/2           |
| wsrep_evs_state               | OPERATIONAL                                              |
| wsrep_flow_control_paused     | 0.000068                                                 |
| wsrep_flow_control_paused_ns  | 395662483                                                |
| wsrep_flow_control_recv       | 15                                                       |
| wsrep_flow_control_sent       | 1                                                        |
| wsrep_gcomm_uuid              | 9ca4e10d-6a23-11ed-9e58-135c9b8d1ba7                     |
| wsrep_incoming_addresses      | 172.16.0.20:3306,172.16.0.21:3306,172.16.0.22:3306 |
| wsrep_last_committed          | 1941702                                                  |
| wsrep_local_bf_aborts         | 0                                                        |
| wsrep_local_cached_downto     | 1885300                                                  |
| wsrep_local_cert_failures     | 0                                                        |
| wsrep_local_commits           | 0                                                        |
| wsrep_local_index             | 1                                                        |
| wsrep_local_recv_queue        | 0                                                        |
| wsrep_local_recv_queue_avg    | 0.401237                                                 |
| wsrep_local_recv_queue_max    | 110                                                      |
| wsrep_local_recv_queue_min    | 0                                                        |
| wsrep_local_replays           | 0                                                        |
| wsrep_local_send_queue        | 0                                                        |
| wsrep_local_send_queue_avg    | 0.000000                                                 |
| wsrep_local_send_queue_max    | 1                                                        |
| wsrep_local_send_queue_min    | 0                                                        |
| wsrep_local_state             | 4                                                        |
| wsrep_local_state_comment     | Synced                                                   |
| wsrep_local_state_uuid        | 2f9b1216-670b-11ed-a410-02c6026a7d9b                     |
| wsrep_open_connections        | 0                                                        |
| wsrep_open_transactions       | 0                                                        |
| wsrep_protocol_version        | 9                                                        |
| wsrep_provider_name           | Galera                                                   |
| wsrep_provider_vendor         | Codership Oy <info@codership.com>                        |
| wsrep_provider_version        | 3.29(ra60e019)                                           |
| wsrep_ready                   | ON                                                       |
| wsrep_received                | 57714                                                    |
| wsrep_received_bytes          | 17158682                                                 |
| wsrep_repl_data_bytes         | 0                                                        |
| wsrep_repl_keys               | 0                                                        |
| wsrep_repl_keys_bytes         | 0                                                        |
| wsrep_repl_other_bytes        | 0                                                        |
| wsrep_replicated              | 0                                                        |
| wsrep_replicated_bytes        | 0                                                        |
| wsrep_rollbacker_thread_count | 1                                                        |
| wsrep_thread_count            | 2                                                        |
+-------------------------------+----------------------------------------------------------+

wsrep_cluster_size : 클러스터에 가입된 있는 노드수
wsrep_incoming_addresses : 클러스터에 가입된 아이피
wsrep_local_state_comment : 현재 노드 상태 (다른 노드들과 데이터가 동기화되어 있는 상태면 Synced)

Ubuntu 20.04LTS + mariadb 10.5 (replication)

문자메시지 발송서비스 서버 구축을 위해서 mariadb (replication) 를 사용
1,000,000건/일 + @ 처리가 가능해야 한다

서버스펙

Xeon (16core)
samsung 860 pro 1Tb * 2 (RAID0)
32Gb memory

구성

L4 : Active – Standby
mariadb : Master – Slave

서버설치

http://mirror.kakao.com/ubuntu-releases/focal/

mariadb 설치

sudo apt-get install software-properties-common dirmngr apt-transport-https
sudo apt-key adv --fetch-keys 'https://mariadb.org/mariadb_release_signing_key.asc'
sudo add-apt-repository 'deb [arch=amd64,arm64,ppc64el] https://mirror.yongbok.net/mariadb/repo/10.5/ubuntu focal main'

sudo apt update
sudo apt install mariadb-server

mysql 사용자 설정

mysql -uroot -p
# create user username@localhost identified by 'password';
create user my@'%' identified by '1234';
# grant select on database.table to username@localhost;
grant select on *.* to my@'%';
# grant all privileges on database.table to username@localhost;
grant all privileges on *.* to my@'%';
flush privileges;

replication

master

[mysqld]
server-id = 1
log_bin = /var/log/mysql/mysql-bin.log

vim /etc/mysql/mariadb.conf.d/50-server.cnf

systemctl restart mariadb

Welcome to the MariaDB monitor.  Commands end with ; or \g.
 Your MariaDB connection id is 37
 Server version: 10.5.9-MariaDB-1:10.5.9+maria~focal-log mariadb.org binary distribution
 Copyright (c) 2000, 2018, Oracle, MariaDB Corporation Ab and others.
 Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
 MariaDB [(none)]> show master status;
 +------------------+----------+--------------+------------------+
 | File             | Position | Binlog_Do_DB | Binlog_Ignore_DB |
 +------------------+----------+--------------+------------------+
 | mysql-bin.000001 |     2098 |              |                  |
 +------------------+----------+--------------+------------------+
 1 row in set (0.000 sec)
 MariaDB [(none)]>
show master status;

slave

[mysqld]
server-id = 2
relay_log=mysql-relay-bin
log_slave_updates = 1
read_only = 1

vim /etc/mysql/mariadb.conf.d/50-server.cnf

systemctl restart mariadb

Welcome to the MariaDB monitor.  Commands end with ; or \g.
 Your MariaDB connection id is 36
 Server version: 10.5.9-MariaDB-1:10.5.9+maria~focal mariadb.org binary distribution
 Copyright (c) 2000, 2018, Oracle, MariaDB Corporation Ab and others.
 Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

 MariaDB [(none)]> stop slave;
 MariaDB [(none)]> CHANGE MASTER TO
     -> MASTER_HOST='###.###.###.###',
     -> MASTER_PORT=3306,
     -> MASTER_USER='repl',
     -> MASTER_PASSWORD='1234',
     -> MASTER_LOG_FILE='mysql-bin.000001',
     -> MASTER_LOG_POS=2098;
 MariaDB [(none)]> start slave;
 MariaDB [(none)]> show slave status \G;
 * 1. row *
                 Slave_IO_State: Waiting for master to send event
                    Master_Host: ###.###.###.###
                    Master_User: repl
                    Master_Port: 3306
                  Connect_Retry: 60
                Master_Log_File: mysql-bin.000001
            Read_Master_Log_Pos: 2098
                 Relay_Log_File: mysql-relay-bin.000002
                  Relay_Log_Pos: 906
          Relay_Master_Log_File: mysql-bin.000001
               Slave_IO_Running: Yes
              Slave_SQL_Running: Yes
                Replicate_Do_DB:
            Replicate_Ignore_DB:
             Replicate_Do_Table:
         Replicate_Ignore_Table:
        Replicate_Wild_Do_Table:
    Replicate_Wild_Ignore_Table:
                     Last_Errno: 0
                     Last_Error:
                   Skip_Counter: 0
            Exec_Master_Log_Pos: 2098
                Relay_Log_Space: 1215
                Until_Condition: None
                 Until_Log_File:
                  Until_Log_Pos: 0
             Master_SSL_Allowed: No
             Master_SSL_CA_File:
             Master_SSL_CA_Path:
                Master_SSL_Cert:
              Master_SSL_Cipher:
                 Master_SSL_Key:
          Seconds_Behind_Master: 0
  Master_SSL_Verify_Server_Cert: No
                  Last_IO_Errno: 0
                  Last_IO_Error:
                 Last_SQL_Errno: 0
                 Last_SQL_Error:
    Replicate_Ignore_Server_Ids:
               Master_Server_Id: 1
                 Master_SSL_Crl:
             Master_SSL_Crlpath:
                     Using_Gtid: No
                    Gtid_IO_Pos:
        Replicate_Do_Domain_Ids:
    Replicate_Ignore_Domain_Ids:
                  Parallel_Mode: optimistic
                      SQL_Delay: 0
            SQL_Remaining_Delay: NULL
        Slave_SQL_Running_State: Slave has read all relay log; waiting for more updates
               Slave_DDL_Groups: 3
 Slave_Non_Transactional_Groups: 0
     Slave_Transactional_Groups: 0
 1 row in set (0.000 sec)
 ERROR: No query specified
 MariaDB [(none)]>
show slave status \G;

mysql_native_password

auth_socket 인증을 변경

ALTER user 'root'@'localhost' IDENTIFIED WITH mysql_native_password BY '변경할비밀번호';

권한정보 업데이트

flush privileges;

오류처리1

ERROR 1819 (HY000): Your password does not satisfy the current policy requirements

비밀번호 처리방식 변경

SET GLOBAL validate_password_policy = LOW
SET GLOBAL validate_password_length = 4

설정을 변경하기(mysql.conf)

[mysqld]
validate_password_policy=LOW
validate_password_length=4