An in-depth exploration of the PostgreSQL master-slave asynchronous stream replication scheme


Preface

Database backups are extremely important in daily production. If you ask a DBA how to design a highly available backup and recovery plan, many will offer plenty of architectural advice on disaster tolerance. But at the end of the day, when a business database lives in a distributed environment, I think a good solution needs to meet three major requirements:

Multiple replicas, persistence, and consistency

In daily architecture design, we must not only guarantee that data is backed up successfully, but also ensure the backup can be restored quickly. Among the many backup and recovery schemes, master-slave replication is arguably the most common implementation. This article mainly walks through building and operating an asynchronous primary/standby replication environment for PostgreSQL; applying these basics in daily database operations can also deepen your understanding of PostgreSQL internals.

Since version 9.0, PostgreSQL has offered a master-slave stream replication mechanism. Stream replication means the standby server synchronizes data from the primary over a TCP stream, so if the primary's data is lost you still have a copy on the standby.

In contrast to file-based log shipping, stream replication keeps the standby more up to date: the standby connects to the primary, and generated WAL records are streamed to it without waiting for the primary to finish writing a WAL file.

PostgreSQL stream replication is asynchronous by default: there is a small delay between committing a transaction on the primary and the changes becoming visible on the standby. This delay is much smaller than with file-based log shipping, typically under one second. If the primary crashes suddenly, a small amount of data may be lost.

Synchronous replication, by contrast, does not report a transaction as committed until both the primary and the standby have written the WAL, which somewhat increases transaction response time.

The learning and practice below focuses on asynchronous stream replication in PostgreSQL (this article does not cover synchronous replication, logical replication, etc.; for other backup schemes, consult the official documentation or other material).

The central idea of asynchronous stream replication is that a transaction committed on the primary does not need to wait for the standby to receive the WAL stream and write it to the standby's WAL file before returning success, so asynchronous stream replication has higher TPS and lower commit latency than synchronous stream replication.
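Once the cluster built below is running, this default can be verified from the primary: synchronous_standby_names stays empty for asynchronous replication, and the pg_stat_replication view reports each standby's sync_state as async. A quick sketch:

postgres=# show synchronous_standby_names;
postgres=# select application_name, state, sync_state from pg_stat_replication;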

Environment preparation

OS          Server IP   Node name  Role
centos 7.2  172.17.0.2  pghost1    primary
centos 7.2  172.17.0.5  pghost2    standby

Main directory layout:

Data directory: /data/pg10/pg_root
Tablespace directory: /data/pg10/pg_tbs
Application directory: /apps/svr/pgsql

Note: we use the root account to compile and install PostgreSQL, but in general database deployment operations should be done with a PostgreSQL superuser account other than root, so we create the relevant user and directories in advance and set the appropriate permissions:


$ groupadd postgres

$ useradd postgres -g postgres

$ passwd postgres

$ mkdir -p /data/pg10/pg_root

$ mkdir -p /data/pg10/pg_tbs

$ chown -R postgres:postgres /data/pg10

The PostgreSQL used in this experiment is version 10.0.

On pghost1 and pghost2 respectively, download the source package:

wget https://ftp.postgresql.org/pub/source/v10.0/postgresql-10.0.tar.gz

After downloading, unpack it:

tar -zxvf postgresql-10.0.tar.gz

Installing dependencies

Since the configure process relies on operating system packages such as zlib and readline, I pre-installed them with yum:


yum groupinstall "Development Tools"

yum install -y bison flex readline readline-devel zlib zlib-devel

Installing on the primary and the standby

Before installing, we first set up the PostgreSQL environment variables on pghost1 and pghost2 respectively:

vi /etc/profile.d/pgsql.sh

Add the following:


export PGPORT=1921
export PGUSER=postgres
export PGDATA=/data/pg10/pg_root
export LANG=en_US.utf8
export PGHOME=/apps/svr/pgsql
export LD_LIBRARY_PATH=$PGHOME/lib:/lib64:/usr/lib64:/usr/local/lib64:/lib:/usr/lib:/usr/local/lib
export PATH=$PGHOME/bin:$PATH:.
export MANPATH=$PGHOME/share/man:$MANPATH

alias rm='rm -i'
alias ll='ls -lh'

Save the file and make the environment variables take effect:


source /etc/profile.d/pgsql.sh

Go to the postgresql-10.0 directory you just unzipped and execute the following command:


./configure --prefix=/apps/svr/pgsql_10.0/ --with-pgport=1921

Then compile and install:


gmake

gmake install

Once the installation is complete, link the versioned directory to $PGHOME so the binaries are on the PATH (ln -s /apps/svr/pgsql_10.0 /apps/svr/pgsql), after which we can confirm the installation succeeded with the following command:


$ postgres --version

postgres (PostgreSQL) 10.0

Deploying stream replication

Before starting the database services to set up the master-slave structure, there are a few important configuration files we need to create and set up:

postgresql.conf, pg_hba.conf, recovery.conf, .pgpass

Below we explain how to configure these files in practice.

In the last section we compiled and installed PostgreSQL as root; now we switch to the postgres user:


$ su - postgres

Then use the initdb tool to initialize the database:


echo "123456" >> /data/pg10/pgpass

initdb -D /data/pg10/pg_root -E UTF8 --locale=C -U postgres --pwfile=/data/pg10/pgpass

After the above command executes, the system files below are generated in the /data/pg10/pg_root data directory:


PG_VERSION pg_dynshmem pg_multixact pg_snapshots pg_tblspc postgresql.auto.conf
base  pg_hba.conf pg_notify pg_stat pg_twophase postgresql.conf
global pg_ident.conf pg_replslot pg_stat_tmp pg_wal
pg_commit_ts pg_logical pg_serial pg_subtrans pg_xact

Next we configure /data/pg10/pg_root/postgresql.conf, modifying the key items for streaming replication. The exact values depend on your environment; a typical primary configuration looks like this:


listen_addresses = '*'
wal_level = replica
max_wal_senders = 10
wal_keep_segments = 256
hot_standby = on

Note: the /data/pg10/pg_root/postgresql.conf configuration should be kept completely consistent between the primary and the standby.

Next, we configure /data/pg10/pg_root/pg_hba.conf on the standby, allowing repuser to make replication connections (a typical entry; tighten the address range to suit your network):


host  replication  repuser  172.17.0.0/16  md5

In fact, it is best for the primary to keep an identical copy of this file, because the primary and standby roles are not static: after a manual switchover or a primary failure, their roles will swap.

After that, we start the primary pghost1 (remember to switch to the postgres user):


$ pg_ctl start

Using the PostgreSQL superuser postgres, log in and create the stream replication user repuser, which needs the REPLICATION and LOGIN privileges (the password matches the ~/.pgpass entries configured below):


postgres=# create role repuser replication login encrypted password 'domac123';

The above commands essentially complete the configuration on the primary. Next we need to generate the standby via a hot backup; while the standby is being made, the primary can still serve reads and writes without affecting the business.


postgres=# select pg_start_backup('domacli_bak');

 pg_start_backup
-----------------
 0/2000060
(1 row)

The pg_start_backup() function initiates an online backup on the primary. After the command executes, compress the data files and copy them to the standby node:


$ tar czvf pg_root.tar.gz pg_root --exclude=pg_root/pg_wal

$ scp pg_root.tar.gz postgres@172.17.0.5:/data/pg10

The pg_wal directory does not need to be copied, so we exclude it to save space. Then, back on the standby in /data/pg10, unpack the primary's backup (recreating the excluded pg_wal directory):


$ cd /data/pg10
$ tar xzvf pg_root.tar.gz
$ mkdir -p pg_root/pg_wal

After unpacking, we return to the primary node and execute the stop-backup command to end the backup:


postgres=# select pg_stop_backup();

NOTICE: pg_stop_backup complete, all required WAL segments have been archived
 pg_stop_backup
----------------
 0/2000168
(1 row)

The commands above complete the online backup, but the standby still needs some configuration. Back on the standby, edit the /data/pg10/pg_root/recovery.conf file; if the file does not exist, copy the sample from the software directory:


cp $PGHOME/share/recovery.conf.sample /data/pg10/pg_root/recovery.conf

recovery.conf on the standby is configured with the following parameters:


recovery_target_timeline = 'latest'
standby_mode = on
primary_conninfo = 'host=172.17.0.2 port=1921 user=repuser'

Note user=repuser in primary_conninfo; remember the stream replication user repuser we created earlier on the primary? Primary/standby data synchronization runs as this user, and since we set a password on repuser for security, the standby must supply it, yet we did not put a plaintext password in recovery.conf.

You could set primary_conninfo = 'host=172.17.0.2 port=1921 user=repuser password=domac123' directly in recovery.conf, but we recommend keeping the password in ~/.pgpass instead:


$ cd ~
$ touch .pgpass
$ chmod 0600 .pgpass

Fill in the following:


172.17.0.2:1921:replication:repuser:domac123
172.17.0.5:1921:replication:repuser:domac123

Once these pieces are in place, we can start our standby:


$ pg_ctl start

...

database system is ready to accept read only connections
 done
server started

If the standby starts normally, we can check that the WAL sender and WAL receiver processes are running on the primary and the standby respectively, confirming that asynchronous stream replication is working properly.

On the primary:


ps -ef | grep wal

postgres 6939 6935 0 23:16 ? 00:00:00 postgres: wal writer process
postgres 6983 6935 0 23:42 ? 00:00:00 postgres: wal sender process repuser 172.17.0.5(45910) streaming 0/3000140

On standby:


ps -ef | grep wal

postgres 26481 26479 0 23:42 ? 00:00:00 postgres: wal receiver process streaming 0/3000140
postgres 26486 26448 0 23:42 ? 00:00:00 grep --color=auto wal
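Besides checking the processes, the primary's pg_stat_replication view (standard in PG 10) shows one row per connected standby, including how far it has been sent and has replayed; a quick sketch:

postgres=# select pid, usename, client_addr, state, sent_lsn, replay_lsn, sync_state from pg_stat_replication;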

Deploying stream replication with pg_basebackup

Next, we introduce a relatively simple way to do this. The manual deployment above boils down to three main steps:

1. pg_start_backup
2. copy the data between the two servers
3. pg_stop_backup

These three steps can be collapsed into one. PostgreSQL provides the built-in pg_basebackup command-line tool, which initiates an online backup against the primary, automatically enters backup mode to take a base backup, and automatically exits backup mode when the backup completes; there is no need to explicitly run pg_start_backup and pg_stop_backup to enter and leave backup mode. pg_basebackup takes a physical backup at the database instance level, so it is commonly used as the base backup tool.

The pg_basebackup tool requires either superuser or REPLICATION privileges to initiate a backup. Mind the max_wal_senders parameter, because pg_basebackup consumes at least one WAL sender process. This section demonstrates deploying asynchronous stream replication with the pg_basebackup tool: we have already deployed a standby on pghost2, so we first delete that standby and then rebuild it with pg_basebackup.
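Before tearing anything down, it is worth confirming the primary has sender capacity to spare; a quick sketch, run on the primary (the count of rows in pg_stat_replication should be below max_wal_senders):

postgres=# show max_wal_senders;
postgres=# select count(*) from pg_stat_replication;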

Enter pghost2 server (172.17.0.5)


$ pg_ctl stop -m fast
waiting for server to shut down.... done
server stopped

$ rm -rf $PGDATA
$ rm -rf /data/pg10/pg_tbs

Next, on pghost2, trigger the base backup using pg_basebackup:

pg_basebackup -D $PGDATA -Fp -Xs -v -P -h 172.17.0.2 -p 1921 -U repuser -W

After execution, you will see log output like the following:


pg_basebackup: initiating base backup, waiting for checkpoint to complete
pg_basebackup: checkpoint completed
pg_basebackup: write-ahead log start point: 0/20007A8 on timeline 1
pg_basebackup: starting background WAL receiver
22655/22655 kB (100%), 1/1 tablespace
pg_basebackup: write-ahead log end point: 0/2000888
pg_basebackup: waiting for background process to finish streaming ...
pg_basebackup: base backup completed

From the log above, we can see that pg_basebackup first performs a checkpoint on the database, then takes a full base backup from that point. During the full backup it copies the $PGDATA data files and tablespace files to the corresponding directories on the standby node.

Finally, just as with the pg_start_backup approach, remember to configure recovery.conf on the standby:


recovery_target_timeline = 'latest'
standby_mode = on
primary_conninfo = 'host=172.17.0.2 port=1921 user=repuser password=domac123'

If the .pgpass file is also configured, you can use the following configuration instead:


recovery_target_timeline = 'latest'
standby_mode = on
primary_conninfo = 'host=172.17.0.2 port=1921 user=repuser'
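As a convenience, pg_basebackup also accepts the -R (--write-recovery-conf) option, which writes a minimal recovery.conf containing standby_mode and primary_conninfo into the target data directory for you; a sketch of the earlier command with it added:

pg_basebackup -D $PGDATA -Fp -Xs -v -P -R -h 172.17.0.2 -p 1921 -U repuser -W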

At this point, the primary/standby configuration is essentially complete, but to be on the safe side we'd better get our hands dirty: create a table and insert data on the primary, and see whether these actions are synchronized to the standby. First, create a table on the primary:


postgres=# create table test_ms(id int4);

CREATE TABLE

postgres=# insert into test_ms values(6);

INSERT 0 1

On the primary we created the test_ms table and inserted a row of data, so now we can query the standby to see whether synchronization succeeded:


postgres=# select * from test_ms;
 id
----
 6
(1 row)

Next, let's go back to the primary and run:


postgres=# insert into test_ms values(9);
INSERT 0 1

postgres=# delete from test_ms where id=6;
DELETE 1

At this point, we find that the standby's data is also properly synchronized:


postgres=# select * from test_ms;
 id
----
 9
(1 row)
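To quantify the small delay involved, the standby exposes its receive/replay positions and the timestamp of the last replayed transaction through standard PG 10 functions; a quick sketch, run on the standby:

postgres=# select pg_last_wal_receive_lsn(), pg_last_wal_replay_lsn();
postgres=# select now() - pg_last_xact_replay_timestamp() as replay_delay;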

So what happens if we perform data manipulation on the standby? Let's try an insert there:

postgres=# insert into test_ms values(6);

ERROR: cannot execute INSERT in a read-only transaction
STATEMENT: insert into test_ms values(6);
ERROR: cannot execute INSERT in a read-only transaction

Looking at these error logs, we can see that in the asynchronous streaming master-slave structure, the standby, which acts as the slave node, is in a read-only state and cannot perform any write operations.

Primary/standby switchover

The previous sections introduced deploying stream replication, but note that the primary and standby roles are not static: they can swap during maintenance. For example, when the primary goes down, the standby needs to be promoted to primary quickly and the original primary demoted to standby. This switchover is the foundation of PostgreSQL primary/standby high availability; the following introduces the related operations.

In PostgreSQL 9.0, stream replication could only switch roles between primary and standby by creating a trigger file. Since 9.1, the pg_ctl promote trigger mode is supported, which is more convenient than the file trigger mode.
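For reference, the file-trigger approach still works in PG 10: point trigger_file in the standby's recovery.conf at a path of your choosing (the path below is just an example), and promotion fires when that file appears:

# in recovery.conf on the standby
trigger_file = '/tmp/pg_promote.trigger'

# later, to promote the standby:
$ touch /tmp/pg_promote.trigger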

Before doing this, we introduce the system function pg_is_in_recovery(), which is used to determine the primary and standby roles:

postgres=# select pg_is_in_recovery();
 pg_is_in_recovery
-------------------
 f
(1 row)

A return of f indicates the primary; a return of t indicates the standby.

Switchover via pg_ctl promote

We switch the primary and the standby with the following steps:

1. Stop the primary; shutting down in -m fast mode is recommended:


$ pg_ctl stop -m fast

2. Execute the pg_ctl promote command on the standby to activate it:


$ pg_ctl promote

After the command executes, if the original recovery.conf has been renamed to recovery.done, the standby has been successfully promoted to primary.

3. If you need to turn the old primary into a standby, create recovery.conf in the old primary's $PGDATA directory (the creation method is the same as introduced earlier; the content can be the same as on the original standby pghost2, except that primary_conninfo points at pghost2's IP instead).

For example, recovery.conf on the old primary is set to:


recovery_target_timeline = 'latest'
standby_mode = on
primary_conninfo = 'host=172.17.0.5 port=1921 user=repuser password=domac123'

For more security, use the following configuration instead, which omits the plaintext password:


recovery_target_timeline = 'latest'
standby_mode = on
primary_conninfo = 'host=172.17.0.5 port=1921 user=repuser'

Meanwhile, just as on pghost2, we recommend putting repuser's password into ~/.pgpass on pghost1:


$ cd ~
$ touch .pgpass
$ chmod 0600 .pgpass

Fill in the following:


172.17.0.2:1921:replication:repuser:domac123
172.17.0.5:1921:replication:repuser:domac123

4. Start the old primary pghost1, then observe whether the primary and standby processes are normal. To be rigorous, you can operate on the test_ms table from earlier on the new primary and observe whether the data synchronizes successfully.


$ pg_ctl start

We execute the role check on the new primary (pghost2):


postgres=# select pg_is_in_recovery();
 pg_is_in_recovery
-------------------
 f
(1 row)

Its current role is already the primary. Continue on the new standby (pghost1):


postgres=# select pg_is_in_recovery();
 pg_is_in_recovery
-------------------
 t
(1 row)

Its current role has likewise been switched to standby.

We perform a data insert operation on pghost2 (any test value will do; here we insert 10):


postgres=# insert into test_ms values(10);
INSERT 0 1

The data is then observed on pghost1 as well, confirming successful synchronization:


postgres=# select * from test_ms;
 id
----
  9
 10
(2 rows)

At this point, the primary/standby walkthrough is essentially complete.

Summary

In the asynchronous stream replication model, a transaction committed on the primary does not wait for the standby to confirm receipt of the WAL stream, so there is some processing delay between the data versions on the primary and the standby. The length of that delay mainly depends on the pressure on the primary, the performance of the standby host, and the network bandwidth. Under normal circumstances, the primary/standby delay is usually in the millisecond range; when the primary goes down, the delay is dominated by the time needed to detect the failure and switch over. Even so, these delay problems can be continuously optimized through architecture and related automated operations.
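In practice, that delay can be watched from the primary with a bit of LSN arithmetic over pg_stat_replication (pg_current_wal_lsn and pg_wal_lsn_diff are standard PG 10 functions); a monitoring sketch:

postgres=# select client_addr, state, pg_wal_lsn_diff(pg_current_wal_lsn(), replay_lsn) as replay_lag_bytes from pg_stat_replication;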

