Getting Started with PostgreSQL Streaming Replication

糖果味的夏天 2021-09-24 17:48:19

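In this post, we dig into the details of setting up streaming replication (SR) in PostgreSQL. Streaming replication is the basic building block for achieving high availability in PostgreSQL hosting, and it is set up by running a master-slave configuration.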

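Master-Slave Terminology

Master/Primary Server

The server that can accept writes.
Also called the read/write server.

Slave/Standby Server

A server whose data is kept continuously in sync with the master.
Also called a backup server or replica.
A warm standby server is one that cannot be connected to until it is promoted to master.
In contrast, a hot standby server can accept connections and serve read-only queries. In the rest of this discussion, we focus only on hot standby servers.

Data is written to the master and propagated to the slave servers. If the current master fails, one of the slaves takes over and continues accepting writes, keeping the system available.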

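WAL Shipping-Based Replication

What Is WAL?

WAL stands for Write-Ahead Logging.
It is a log to which every modification to the database is written before it is applied to the data files.
WAL is used for recovery after a database crash, ensuring data integrity.
Database systems use WAL to achieve atomicity and durability.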

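How Is WAL Used for Replication?

Write-ahead log records are used to keep data in sync between database servers. This is done in two ways:

File-Based Log Shipping

WAL files are shipped from the master to the standby server to keep the data in sync.
The master can copy the logs directly to the standby's storage, or share storage with the standby.
A WAL file holds up to 16 MB of data by default.
A WAL file is shipped only after it reaches that threshold.
This causes replication delay, and it increases the chance of losing data if the master crashes before the logs are archived.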
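
To make the delay concrete, here is a rough sketch of how long WAL can sit on the master before a full segment is ready to ship. It assumes the 16 MB segment size mentioned above and a steady write rate; the two rates are made-up examples, not measurements.

```python
# Rough estimate of how long WAL waits before a full segment can be shipped.
# Assumes 16 MB segments; the write rates below are illustrative only.
SEGMENT_BYTES = 16 * 1024 * 1024


def seconds_to_fill_segment(wal_bytes_per_second: float) -> float:
    """Return the time needed to fill one WAL segment at a steady write rate."""
    if wal_bytes_per_second <= 0:
        raise ValueError("write rate must be positive")
    return SEGMENT_BYTES / wal_bytes_per_second


busy = seconds_to_fill_segment(1024 * 1024)  # 1 MB/s of WAL
quiet = seconds_to_fill_segment(16 * 1024)   # 16 KB/s of WAL
print(f"busy: {busy:.0f}s, quiet: {quiet:.0f}s")
```

On a quiet system the window is much longer, which is where file-based shipping is most exposed to data loss.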

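Streaming WAL Records

Chunks of WAL records are streamed between database servers to keep the data in sync.
The standby connects to the master to receive the WAL chunks.
WAL records are streamed as they are generated.
Streaming does not need to wait for a WAL file to fill.
This keeps the standby more up to date than file-based log shipping can.
By default, streaming replication is asynchronous, although synchronous replication is also supported.

Both methods have pros and cons. File-based shipping enables point-in-time recovery and continuous archiving, while streaming makes data available on the standby almost immediately. However, you can configure PostgreSQL to use both at the same time and get the benefits of each. In this post, we focus mainly on streaming replication for PostgreSQL high availability.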

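How to Set Up Streaming Replication

Setting up streaming replication in PostgreSQL is straightforward. Assuming PostgreSQL is already installed on all servers, follow these steps:

Configuration on the Master Node

Initialize the database on the master node using the initdb utility.
Create a role/user with replication privileges by running the command below. Afterwards, you can verify it by running \du in psql to list the roles.
CREATE USER <user_name> REPLICATION LOGIN ENCRYPTED PASSWORD '<password>';
Configure the streaming replication properties in the master's PostgreSQL configuration file (postgresql.conf):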

# Possible values are replica|minimal|logical
wal_level = replica
# required for pg_rewind capability when standby goes out of sync with master
wal_log_hints = on
# sets the maximum number of concurrent connections from the standby servers.
max_wal_senders = 3
# The below parameter is used to tell the master to keep the minimum number of
# segments of WAL logs so that they are not deleted before standby consumes them.
# each segment is 16MB
# (PostgreSQL 13 and later replace this setting with wal_keep_size, given in megabytes.)
wal_keep_segments = 8
# The below parameter enables read only connection on the node when it is in
# standby role. This is ignored when the server is running as master.
hot_standby = on
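
As a sanity check on the retention setting, the arithmetic can be sketched as follows. This assumes 16 MB segments as stated in the comments; the WAL rate and outage length are hypothetical inputs you would replace with your own numbers.

```python
import math

SEGMENT_MB = 16  # WAL segment size, per the config comments above


def retained_wal_mb(wal_keep_segments: int) -> int:
    """Return how many megabytes of WAL the master keeps for standbys."""
    if wal_keep_segments < 0:
        raise ValueError("wal_keep_segments cannot be negative")
    return wal_keep_segments * SEGMENT_MB


def segments_for_outage(wal_mb_per_minute: float, outage_minutes: float) -> int:
    """Return the segment count needed to cover a standby outage (rounded up)."""
    return math.ceil(wal_mb_per_minute * outage_minutes / SEGMENT_MB)


print(retained_wal_mb(8))           # retention for the value used above
print(segments_for_outage(20, 30))  # hypothetical: 20 MB/min for 30 minutes
```

With the value of 8 used above, the master keeps 128 MB of WAL, which may be too little for a busy system if a standby falls behind.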

Add a replication entry to the pg_hba.conf file to allow replication connections between the servers:

# Allow replication connections from the standby's address,
# by a user with the replication privilege.
# TYPE DATABASE USER ADDRESS METHOD
host replication repl_user IPaddress(CIDR) md5
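
Before restarting, it can help to confirm that the CIDR range in the entry really covers the standby. The following is a small sketch using Python's standard ipaddress module; the addresses are placeholders and the helper name is hypothetical.

```python
import ipaddress


def address_allowed(standby_ip: str, hba_cidr: str) -> bool:
    """Return True if standby_ip falls inside the CIDR range of a pg_hba.conf entry."""
    # strict=False accepts ranges written with host bits set, e.g. 10.0.0.5/24
    network = ipaddress.ip_network(hba_cidr, strict=False)
    return ipaddress.ip_address(standby_ip) in network


print(address_allowed("10.0.0.12", "10.0.0.0/24"))  # inside the range
print(address_allowed("10.0.1.12", "10.0.0.0/24"))  # outside the range
```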

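Restart the PostgreSQL service on the master node for the changes to take effect.

Configuration on the Standby Node

Use the pg_basebackup utility to create a base backup of the master node and use it as the starting point for the standby.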

# Explaining a few options used for pg_basebackup utility
# -X option is used to include the required transaction log files (WAL files) in the
# backup. When you specify stream, this will open a second connection to the server
# and start streaming the transaction log at the same time as running the backup.
# -c is the checkpoint option. Setting it to fast will force the checkpoint to be
# created soon.
# -W forces pg_basebackup to prompt for a password before connecting
# to a database.
pg_basebackup -D <data_directory> -h <master_host> -X stream -c fast -U repl_user -W
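
If you script the standby setup, the command above could be assembled along these lines. This is only a sketch: the helper name, the data directory, and the host address are hypothetical, and the function builds the argument list without running it. The optional -R flag asks pg_basebackup to write the replication configuration for you.

```python
import shlex


def basebackup_command(data_directory: str, master_host: str,
                       user: str = "repl_user",
                       write_recovery_conf: bool = False) -> list:
    """Build the pg_basebackup argument list shown above (does not execute it)."""
    args = [
        "pg_basebackup",
        "-D", data_directory,
        "-h", master_host,
        "-X", "stream",  # stream WAL while the backup runs
        "-c", "fast",    # request a fast checkpoint
        "-U", user,
        "-W",            # prompt for the password
    ]
    if write_recovery_conf:
        args.append("-R")  # have pg_basebackup write the replication settings
    return args


# Placeholder path and address, for illustration only.
cmd = basebackup_command("/var/lib/postgresql/data", "10.0.0.10",
                         write_recovery_conf=True)
print(" ".join(shlex.quote(part) for part in cmd))
```

Passing the list to subprocess.run would execute it; keeping the arguments as a list avoids shell quoting problems with unusual paths.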

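If it does not exist, create the replication configuration file, recovery.conf (pg_basebackup creates it automatically when given the -R option). This file applies to PostgreSQL 11 and earlier; from PostgreSQL 12 onward, the standby_mode parameter no longer exists, the remaining settings go in postgresql.conf, and an empty standby.signal file in the data directory marks the server as a standby. The recovery.conf form: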

# Specifies whether to start the server as a standby. In streaming replication,
# this parameter must be set to on.
standby_mode = 'on'
# Specifies a connection string which is used for the standby server to connect
# with the primary/master.
primary_conninfo = 'host=<master_host> port=<postgres_port> user=<replication_user> password=<password> application_name=<host_name>'
# Specifies recovering to a particular timeline. The default is to recover along the
# same timeline that was current when the base backup was taken.
# Setting this to latest recovers to the latest timeline found
# in the archive, which is useful in a standby server.
recovery_target_timeline = 'latest'
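
The connection string is easy to get wrong by hand, so a generator can be sketched as below. The function name and sample values are hypothetical. To stay simple, the sketch rejects values containing whitespace, single quotes, or backslashes instead of trying to escape them.

```python
def primary_conninfo_line(host: str, port: int, user: str,
                          password: str, application_name: str) -> str:
    """Render a primary_conninfo setting from simple, quote-free values."""
    fields = {
        "host": host,
        "port": str(port),
        "user": user,
        "password": password,
        "application_name": application_name,
    }
    for key, value in fields.items():
        # Values needing quoting are out of scope for this sketch.
        if not value or any(ch.isspace() or ch in "'\\" for ch in value):
            raise ValueError(f"{key} needs quoting, which this sketch does not handle")
    conninfo = " ".join(f"{key}={value}" for key, value in fields.items())
    return f"primary_conninfo = '{conninfo}'"


# Placeholder values, for illustration only.
print(primary_conninfo_line("10.0.0.10", 5432, "repl_user", "secret", "standby1"))
```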

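Start the standby.

The standby configuration must be done on every standby server. Once configured and started, the standby connects to the master and begins streaming logs. This sets up replication, which you can verify by running the SQL statement SELECT * FROM pg_stat_replication; on the master.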
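
The view reports WAL positions as LSNs such as 0/3000060. As a sketch of how to read them, the helper below converts an LSN to a byte offset and computes how far a standby is behind. It assumes the sent_lsn and replay_lsn columns found in PostgreSQL 10 and later; the sample values are made up.

```python
def lsn_to_bytes(lsn: str) -> int:
    """Convert an LSN like '0/3000060' (two hex halves) to an absolute byte offset."""
    high, low = lsn.split("/")
    return (int(high, 16) << 32) + int(low, 16)


def replay_lag_bytes(sent_lsn: str, replay_lsn: str) -> int:
    """Return how many bytes of sent WAL the standby has not yet replayed."""
    return lsn_to_bytes(sent_lsn) - lsn_to_bytes(replay_lsn)


# Made-up values of the kind pg_stat_replication reports.
print(replay_lag_bytes("0/3000060", "0/3000000"))
```

A lag that stays near zero suggests the standby is keeping up; a steadily growing value suggests it is falling behind.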

默認(rèn)情況下，流式復(fù)制是異步的。如果您希望使其同步，則可以使用以下參數(shù)對其進(jìn)行配置：

# num_sync is the number of synchronous standbys from which transactions
# need to wait for replies.
# standby_name is same as application_name value in recovery.conf
# If all standby servers have to be considered for synchronous then set value '*'
# If only specific standby servers needs to be considered, then specify them as
# comma-separated list of standby_name.
# The name of a standby server for this purpose is the application_name setting of the
# standby, as set in the primary_conninfo of the
# standby's WAL receiver.
synchronous_standby_names = 'num_sync ( standby_name [, ...] )'
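
To show what a filled-in value looks like, here is a sketch that renders the setting from a count and a list of names. The helper and the standby names are hypothetical, and the range check on num_sync is a sanity check added for this example, not a rule taken from the article.

```python
def sync_standby_names(num_sync: int, standby_names: list) -> str:
    """Render a synchronous_standby_names value in the 'num_sync ( names )' form."""
    if not standby_names:
        raise ValueError("at least one standby name is required")
    # Sanity check for this sketch: do not wait for more standbys than are listed.
    if not 1 <= num_sync <= len(standby_names):
        raise ValueError("num_sync must be between 1 and the number of standbys")
    return f"{num_sync} ({', '.join(standby_names)})"


value = sync_standby_names(1, ["standby1", "standby2"])
print(f"synchronous_standby_names = '{value}'")
```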

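synchronous_commit must be set to on for synchronous replication, which is the default. PostgreSQL offers very flexible options for synchronous commit, and it can be configured at the user or database level. The valid values are:

off - The commit is acknowledged to the client even before the transaction record is flushed to the WAL file on that node.
local - The commit is acknowledged to the client only after the transaction record is flushed to the WAL file on that node.
remote_write - The commit is acknowledged to the client only after the servers named in synchronous_standby_names confirm that the transaction record has been written to the disk cache, though not necessarily flushed to the WAL file.
on - The commit is acknowledged to the client only after the servers named in synchronous_standby_names confirm that the transaction record has been flushed to the WAL file.
remote_apply - The commit is acknowledged to the client only after the servers named in synchronous_standby_names confirm that the transaction record has been flushed to the WAL file and applied to the database.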
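
The levels above form an ordering from weakest to strongest guarantee, which can be sketched as a small lookup. The structure and helper names are illustrative, and the sketch assumes synchronous_standby_names is set, since otherwise no standby is waited for.

```python
# synchronous_commit levels, ordered from weakest to strongest guarantee.
LEVELS = ["off", "local", "remote_write", "on", "remote_apply"]

# What each level waits for before acknowledging a commit to the client.
WAITS_FOR = {
    "off": "nothing (acknowledged before the local WAL flush)",
    "local": "local WAL flush",
    "remote_write": "standby has written the record to its disk cache",
    "on": "standby has flushed the record to its WAL file",
    "remote_apply": "standby has flushed and applied the record",
}


def waits_for_standby(level: str) -> bool:
    """Return True if a commit at this level waits on a synchronous standby."""
    return LEVELS.index(level) >= LEVELS.index("remote_write")


def at_least_as_strict(level: str, baseline: str) -> bool:
    """Return True if level gives a guarantee at least as strong as baseline."""
    return LEVELS.index(level) >= LEVELS.index(baseline)


for name in LEVELS:
    print(f"{name}: {WAITS_FOR[name]}")
```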

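Setting synchronous_commit to off or local in synchronous replication mode makes it behave like asynchronous replication and can give better write performance. However, it increases the risk of data loss and of stale reads on the standby. Setting it to remote_apply ensures that data is immediately available on the standby, but write performance may drop because each commit must be applied on all of the listed standby servers.

If you plan to use continuous archiving and point-in-time recovery, you can enable archive mode. Although it is not mandatory for streaming replication, enabling archive mode has additional benefits. If archive mode is not on, you need to use replication slots or make sure wal_keep_segments is set high enough for your load.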
