diff --git a/docs/dcp/documentation/concepts.md b/docs/dcp/documentation/concepts.md index b0943d47a0..bee2e2f1a9 100644 --- a/docs/dcp/documentation/concepts.md +++ b/docs/dcp/documentation/concepts.md @@ -29,7 +29,7 @@ In Couchbase server, snapshot is an immutable copy of the key-value pairs in a v There are two types of snapshots that are formed when streaming items from the Couchbase vbuckets. When a client connects to the source, it initially gets a 'disk snapshot', and later when the client catches up with the source and hence has reached the steady state, it starts getting 'point-in-time' snapshots from memory. ### Disk Snapshots -Disk snapshots are immutable persistent snapshots on the disk. They are formed after a batch of items are written onto disk. These snapshots persist on the disk until multiple such snapshots are compacted into a single snapshot. The replication clients then pick up these immutable snapshots and hence get a view of the vbucket that is consistent with the source vbucket. This is called as disk backill. When a disk backfill starts, all the snapshots are logically merged as and sent over as a single snapshot to the client. For example, say the disk has 3 snapshots snp1 from 1 to 20, snp2 from 21 to 30 and snp3 from 31 to 60. A backfill request will merge all 3 snapshots and send over a single logical snapshot from 1 to 60. And for request from the middle of the snapshot, that if the request is from sequence number 15, then the logical snapshot sent over is 15 to 60. Note that, to get a consistent view of the vbucket, the client should read till the end of the snapshot as some of the keys might have been de-duplicated. +Disk snapshots are immutable persistent snapshots on the disk. They are formed after a batch of items are written onto disk. These snapshots persist on the disk until multiple such snapshots are compacted into a single snapshot. 
Replication clients then pick up these immutable snapshots and so get a view of the vbucket that is consistent with the source vbucket. This is called disk backfill. When a disk backfill starts, all the snapshots are logically merged and sent to the client as a single snapshot. For example, say the disk has 3 snapshots: snp1 from 1 to 20, snp2 from 21 to 30, and snp3 from 31 to 60. A backfill request will merge all 3 snapshots and send over a single logical snapshot from 1 to 60. If the request starts in the middle of a snapshot, say at sequence number 15, the logical snapshot sent over is 15 to 60. Note that, to get a consistent view of the vbucket, the client should read to the end of the snapshot, because some of the keys might have been de-duplicated. A drawback of disk snapshots is that keys cannot be replicated until the snapshot is formed. They work well when the snapshots are fairly large, that is, when replication clients pick up batches of items and can tolerate higher latency in syncing with the source.
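
The snp1/snp2/snp3 example above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not how Couchbase implements backfill. The names `logical_backfill_range` and `deduplicate`, and the tuple layouts, are hypothetical and chosen for this sketch. It assumes snapshots are contiguous, inclusive sequence-number ranges.

```python
from typing import List, Optional, Tuple

# Hypothetical representation: (start_seqno, end_seqno), both inclusive.
Snapshot = Tuple[int, int]
# Hypothetical representation: (seqno, key, value).
Mutation = Tuple[int, str, str]


def logical_backfill_range(
    snapshots: List[Snapshot], start_seqno: int = 1
) -> Optional[Snapshot]:
    """Merge contiguous on-disk snapshots into the single logical range
    that a backfill starting at start_seqno would cover."""
    if not snapshots:
        return None
    first = min(start for start, _ in snapshots)
    last = max(end for _, end in snapshots)
    if start_seqno > last:
        return None  # nothing on disk at or after the requested seqno
    return (max(first, start_seqno), last)


def deduplicate(mutations: List[Mutation]) -> List[Mutation]:
    """Keep only the latest mutation per key, ordered by seqno."""
    latest = {}
    for seqno, key, value in mutations:
        if key not in latest or seqno > latest[key][0]:
            latest[key] = (seqno, key, value)
    return sorted(latest.values())


disk = [(1, 20), (21, 30), (31, 60)]            # snp1, snp2, snp3
full = logical_backfill_range(disk)             # request from the start
partial = logical_backfill_range(disk, 15)      # request from seqno 15

# Key "a" was written at seqno 1 and again at seqno 3; only seqno 3 survives.
on_disk = deduplicate([(1, "a", "v1"), (2, "b", "v1"), (3, "a", "v2")])
```

Here `full` is `(1, 60)` and `partial` is `(15, 60)`, matching the ranges in the text. In the de-duplication example, a client that stopped reading at seqno 2 would never see key "a", even though it existed by then, which is why the client should read to the end of the snapshot.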