Doubt regarding Neon Internals (Creating a NeonWal record)

Narendra_Pradeep · September 21, 2023, 9:32am

Recently, I was going through the repository and had some doubts about the code or its purpose. Not sure if my understanding is right. I hope to get some help from the community.

My doubt is about how the pageserver ingests a WAL record.

    for blk in decoded.blocks.iter() {
        self.ingest_decoded_block(modification, lsn, decoded, blk, ctx)
            .await?;
    }

In the ingest_decoded_block function we create and add a NeonWalRecord for each blk. So the NeonWalRecord for each block contains the same XLogRecord in its rec (bytes) field. The same data is written into the layer file during datamodification.commit.

This seems like bloat to me if the XLogRecord references numerous blocks. If I’m wrong and this serves a purpose, please correct me.

Matthias · September 21, 2023, 9:43am

Correct, the storage layer stores each record as many times as it has block references, for those records that include block references.

However, as you can see, we iterate over the blocks and ingest the WAL record once for each block, with the block identity included. The storage layer stores all records that modify a block near each other. This makes recovering a block’s state (for point-in-time recovery and branching) very cheap: all changes to a block can be read sequentially. If we instead kept only one copy of the WAL record around, the access pattern would be much more random, and performance would suffer.
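To make the locality argument concrete, here is a minimal sketch of the idea, not the actual pageserver code. ToyStore, BlockId, and the (block, LSN) key are hypothetical stand-ins (the real key and layer file format carry more information). It only shows that when records are keyed by block first and LSN second, the full history of one block is a single contiguous range.

```rust
use std::collections::BTreeMap;

/// Hypothetical block identity; the real key carries more fields.
type BlockId = u32;
/// Log sequence number of the WAL record.
type Lsn = u64;

/// Toy stand-in for the storage layer: records kept sorted by (block, LSN).
#[derive(Default)]
struct ToyStore {
    records: BTreeMap<(BlockId, Lsn), Vec<u8>>,
}

impl ToyStore {
    /// Store one copy of `rec` under every block the record references.
    /// This is the duplication discussed above.
    fn ingest(&mut self, lsn: Lsn, blocks: &[BlockId], rec: &[u8]) {
        for &blk in blocks {
            self.records.insert((blk, lsn), rec.to_vec());
        }
    }

    /// All records touching `blk` with LSN <= `at`, in LSN order.
    /// Because the map is sorted by (block, LSN), this is one sequential scan.
    fn history(&self, blk: BlockId, at: Lsn) -> Vec<(Lsn, &[u8])> {
        self.records
            .range((blk, 0)..=(blk, at))
            .map(|(&(_, lsn), rec)| (lsn, rec.as_slice()))
            .collect()
    }
}
```

With a single shared copy per record, the same lookup would instead have to chase one pointer per record into the original WAL, which is the random access pattern mentioned above.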

We rely on the (usually quite accurate) assumption that records generally modify only one (or two) pages to limit the duplication factor.
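As a rough illustration of that duplication factor, the sketch below compares the bytes stored (one copy per block reference) with the bytes in the original WAL. RecordInfo and duplication_factor are hypothetical names made up for this example, and it only considers records that have at least one block reference.

```rust
/// Hypothetical summary of one WAL record: its size in bytes and
/// the number of blocks it references (assumed to be at least 1).
struct RecordInfo {
    len: usize,
    block_refs: usize,
}

/// Ratio of bytes stored (one copy per block reference)
/// to bytes in the original WAL stream.
fn duplication_factor(records: &[RecordInfo]) -> f64 {
    let wal_bytes: usize = records.iter().map(|r| r.len).sum();
    let stored_bytes: usize = records.iter().map(|r| r.len * r.block_refs).sum();
    if wal_bytes == 0 {
        // No input: nothing is duplicated.
        return 1.0;
    }
    stored_bytes as f64 / wal_bytes as f64
}
```

If most records touch one block and a few touch two, the factor stays close to 1. It only grows large when many records reference many blocks each.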