> Every time an on-disk database page (4KB) needs to be modified by a write operation, even just a single byte, a copy of the entire page, edited with the requested changes, is written to the write-ahead log (WAL). Physical streaming replication leverages this existing WAL infrastructure as a log of changes it streams to replicas.
First, the PostgreSQL page size is 8KB by default, and has been since the beginning.
As for the rest: according to the PostgreSQL documentation[1] (on full_page_writes, the setting that decides whether those copies are made), a copy of the entire page is written to the WAL only on the first modification of that page after a checkpoint. Subsequent modifications do not result in full page writes to the WAL. So if you update a counter 3 times in sequence you won't get 3*8KB written to the WAL; you get a single page dump, and the remaining two updates only log the row-level change, which is much smaller[2]. This is further reduced by WAL compression[3] (reducing the segment usage) and by increasing the checkpointing interval, which reduces the number of full page copies[4].
This irked me because it sounded like whatever you touch produces an 8KB copy of data and it seems to not be the case.
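To put rough numbers on the counter example, here is a minimal sketch of that rule in Python. It is a toy model, not PostgreSQL code: `estimate_wal_bytes`, the `CHECKPOINT` marker and the 100-byte row-level record size are all made up for illustration, and record headers, WAL compression and hole removal are ignored.

```python
PAGE_SIZE = 8192          # default PostgreSQL page size in bytes
CHECKPOINT = "checkpoint"  # hypothetical marker for a checkpoint event


def estimate_wal_bytes(events, record_size=100, page_size=PAGE_SIZE,
                       full_page_writes=True):
    """Estimate WAL volume for a sequence of page modifications.

    `events` is a list of page ids (ints), optionally interleaved with
    CHECKPOINT markers. `record_size` is an assumed size for a
    row-level WAL record; real records vary.
    """
    touched_since_checkpoint = set()
    total = 0
    for event in events:
        if event == CHECKPOINT:
            # After a checkpoint, the next change to any page logs a full image again.
            touched_since_checkpoint.clear()
            continue
        if full_page_writes and event not in touched_since_checkpoint:
            total += page_size      # first touch since checkpoint: full page image
            touched_since_checkpoint.add(event)
        else:
            total += record_size    # later touches: only the row-level change
    return total


# Three updates to the same page, no checkpoint in between.
print(estimate_wal_bytes([7, 7, 7]))              # 8392, not 3 * 8192 = 24576
# A checkpoint after the first update forces a second full page image.
print(estimate_wal_bytes([7, CHECKPOINT, 7, 7]))  # 16484
```

The second call also shows why a longer checkpoint interval helps: fewer checkpoints means fewer "first touches".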
That is correct. A full page write is also not logged if the page is initialized from scratch. And even without WAL compression, the "hole" in the middle of a page that is not full is "compressed" out.
That's not to say that FPWs are not a problem. The increase in WAL volume they can cause can be seriously problematic.
One interesting thing is that they can often very significantly improve streaming replication / crash recovery performance. When replaying the incremental records, the page needs to be read from the OS/disk if it is not in postgres' page cache. But with FPWs we can seed the page cache contents with the page image. For the pretty common case where the number of pages written between two checkpoints fits into the cache, that can be a very serious performance advantage.
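A rough way to see the replay effect is to count the reads a cold cache would need. This is a toy Python model under loud assumptions: `replay_disk_reads` is a made-up helper, the cache is a plain set with no eviction, and a record carrying a full page image is assumed to populate the cache without touching disk.

```python
def replay_disk_reads(records, cache_capacity):
    """Count page reads needed to replay WAL records into a cold cache.

    `records` is a list of (page_id, has_full_page_image) tuples.
    Simplification: pages are cached only while there is room, and
    nothing is ever evicted.
    """
    cache = set()
    reads = 0
    for page_id, has_full_page_image in records:
        if page_id in cache:
            continue  # page already in memory, apply the record directly
        if not has_full_page_image:
            reads += 1  # incremental record: the page must be read first
        # A full page image seeds the cache without a read.
        if len(cache) < cache_capacity:
            cache.add(page_id)
    return reads


with_fpw = [(1, True), (1, False), (2, True), (2, False)]
without_fpw = [(1, False), (1, False), (2, False), (2, False)]

print(replay_disk_reads(with_fpw, cache_capacity=10))     # 0
print(replay_disk_reads(without_fpw, cache_capacity=10))  # 2
```

In this model, when the pages touched between two checkpoints fit in the cache, the run with full page images never reads from disk at all.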
[1] - https://www.postgresql.org/docs/11/runtime-config-wal.html#G...
[2] - http://www.interdb.jp/pg/pgsql09.html
[3] - https://www.postgresql.org/docs/11/runtime-config-wal.html#G...
[4] - https://www.postgresql.org/docs/11/runtime-config-wal.html#G...