Inside logical replication in PostgreSQL: How it works


I was honored to talk about the ins and outs of logical replication with an audience of PostgreSQL enthusiasts at PGConf India 2023. If you missed the presentation, let’s review it all together.

Logical replication allows fine-grained control over both data replication and security. In this post, I’ll review the fundamentals of logical replication and some of its use cases.

My proposal on the internals of logical replication was one of the 27 talks selected from 120 submissions. In the talk, I covered the following topics:

Contents

Introduction

Use cases

Architecture

Publication

Subscription

Processes

Replicating incremental changes

Handling worker failures

Changes to an existing subscription

How is synchronous_commit achieved?

Replication slot

Row filters

Column lists

Benefits of row filters and column lists

Replicating tables in schema

Additional reading

Introduction

Logical replication is a method of replicating data changes from a publisher to a subscriber. The node where a publication is defined is referred to as the publisher. The node where a subscription is defined is referred to as the subscriber. Logical replication allows fine-grained control over both data replication and security.

Logical replication uses a publish and subscribe model, with one or more subscribers subscribing to one or more publications on a publisher node. Subscribers pull data from the publications they subscribe to and can then re-publish that data to allow cascading replication or more complex configurations.

Use cases

  • Sending incremental changes in a single database, or a subset of a database, to subscribers as they occur.
  • Firing triggers for individual changes as they arrive on the subscriber.
  • Consolidating multiple databases into a single one (e.g., for analytical purposes).
  • Replicating between different major versions of PostgreSQL.
  • Replicating between PostgreSQL instances on different platforms (e.g., Linux to Windows).
  • Giving access to replicated data to different groups of users.
  • Sharing a subset of the database between multiple databases.

Architecture

The diagram below shows how logical replication works in PostgreSQL 15. I’ll refer back to it throughout this post.

Publication

A publication is defined on the primary node whose changes are to be replicated. A publication is a set of changes generated from a table or a group of tables. It can also be described as a change set or replication set. Each publication exists in only one database.

Each table can be added to multiple publications if needed. Publications can currently contain only tables and all tables in a schema.

Publications can limit the changes they produce to any combination of INSERT, UPDATE, DELETE, and TRUNCATE. This is similar to how triggers are fired by particular event types. By default, all operation types are replicated.

When a publication is created, information about it is added to the pg_publication catalog table.
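To make the rules above concrete, here is a minimal sketch of the two kinds of publication membership. The publication names (pub_accounts, pub_roles, pub_hr) and the hr schema are hypothetical; accounts and roles are borrowed from the table list later in this post, and the schema form assumes PostgreSQL 15 or later.

```sql
-- Hypothetical publication names; adjust table and schema names to your database.

-- Publish two specific tables.
CREATE PUBLICATION pub_accounts FOR TABLE accounts, roles;

-- The same table can belong to a second publication.
CREATE PUBLICATION pub_roles FOR TABLE roles;

-- Publish every table in a schema (PostgreSQL 15 and later).
-- "hr" is an assumed schema name used only for illustration.
CREATE PUBLICATION pub_hr FOR TABLES IN SCHEMA hr;
```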
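As an illustration of restricting the operations a publication emits, the sketch below uses the publish parameter of CREATE PUBLICATION. The publication name pub_insert_only is hypothetical, and the accounts table is assumed to exist.

```sql
-- Hypothetical publication that replicates only INSERTs on accounts.
CREATE PUBLICATION pub_insert_only FOR TABLE accounts
    WITH (publish = 'insert');

-- Later, widen it to also replicate UPDATEs.
ALTER PUBLICATION pub_insert_only SET (publish = 'insert, update');
```

Leaving out the publish parameter gives the default behavior described above, where all four operation types are replicated.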
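A minimal sketch of what this looks like in practice, assuming the pub_alltables publication that the subscription example in this post refers to was created with FOR ALL TABLES (which requires superuser privileges). The output is omitted because it depends on your installation.

```sql
-- Assumed definition of the publication used later in this post.
CREATE PUBLICATION pub_alltables FOR ALL TABLES;

-- Inspect the catalog entry: one row per publication, with flags
-- showing which operations it replicates.
SELECT oid, pubname, puballtables,
       pubinsert, pubupdate, pubdelete, pubtruncate
FROM pg_publication;
```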

Subscription

A subscription is the downstream side of logical replication. It defines the connection to another database and the set of publications (one or more) to which it wants to subscribe.

The subscriber database behaves like any other PostgreSQL instance and can act as a publisher for other databases by defining its own publications. A subscriber node can have several subscriptions. It is possible to define multiple subscriptions between the same publisher and subscriber pair, in which case you must make sure that the subscribed publication objects don’t overlap.

Each subscription receives changes through one replication slot. Additional replication slots may be required for the initial synchronization of existing table data. These are dropped once the data sync is complete.

When a subscription is created, information about it is added to the pg_subscription catalog table:

postgres=# CREATE SUBSCRIPTION sub_alltables
           CONNECTION 'dbname=postgres host=localhost port=5432'
           PUBLICATION pub_alltables;
NOTICE:  created replication slot "sub_alltables" on publisher
CREATE SUBSCRIPTION

postgres=# SELECT oid, subdbid, subname, subconninfo, subpublications FROM pg_subscription;
  oid  | subdbid |    subname    |               subconninfo                | subpublications
-------+---------+---------------+------------------------------------------+-----------------
 16393 |       5 | sub_alltables | dbname=postgres host=localhost port=5432 | {pub_alltables}
(1 row)

The subscriber connects to the publisher and fetches the list of tables that the publisher is publishing. In the example above, the subscription uses pub_alltables, which publishes all tables. The published relations are added to the pg_subscription_rel catalog table:

postgres=# SELECT srsubid, srrelid::regclass FROM pg_subscription_rel;
 srsubid |    srrelid
---------+----------------
   16399 | accounts
   16399 | accounts_roles
   16399 | roles
   16399 | department
   16399 | employee
(5 rows)
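The same catalog also records how far the initial synchronization of each table has progressed. The query below is a small sketch for checking that on the subscriber; the column aliases are my own, and the state codes in the comment are the ones documented for pg_subscription_rel.

```sql
-- Run on the subscriber.
-- srsubstate codes: i = initialize, d = data is being copied,
-- f = finished table copy, s = synchronized, r = ready (normal replication).
SELECT srrelid::regclass AS table_name,
       srsubstate        AS sync_state,
       srsublsn          AS state_lsn
FROM pg_subscription_rel
ORDER BY table_name;
```

Once every table shows r, the initial copy has finished and changes are flowing through the subscription’s main replication slot.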

The subscriber connects to the publisher and creates the replication slot. The slot’s details are available in the pg_replication_slots view on the publisher.
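A minimal sketch of how to look at that slot, run on the publisher. The column selection is my own choice. For the subscription created above you would expect a slot named sub_alltables with the pgoutput plugin and slot type logical; the remaining values depend on your system, so no output is shown.

```sql
-- Run on the publisher: one row per replication slot.
SELECT slot_name, plugin, slot_type, database,
       active, confirmed_flush_lsn
FROM pg_replication_slots;
```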

 
