changefeedccl: use ScanRequest instead of ExportRequest during backfills #44663
Conversation
if the roachtests are happy, i'm happy. cdc/initial-scan is designed to stress exactly this, i'd be curious how long it takes before and after this change
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @ajwerner)
pkg/ccl/changefeedccl/poller.go, line 453 at r1 (raw file):
```go
	ctx context.Context, span roachpb.Span, ts hlc.Timestamp, withDiff bool,
) error {
	txn := p.db.NewTxn(ctx, "poller backfill")
```
nit: i'd think "changefeed backfill" would be more evocative
pkg/ccl/changefeedccl/poller.go, line 478 at r1 (raw file):
```go
	afterScan := timeutil.Now()
	res := b.RawResponse().Responses[0].GetScan()
	for _, br := range res.BatchResponses {
```
i'd still keep the decode+AddKV part as a separate method for readability but ymmv
pkg/ccl/changefeedccl/poller.go, line 488 at r1 (raw file):
```go
	var prevVal roachpb.Value
	if withDiff {
		prevVal = kv.Value
```
looks like you lost a (somewhat important) comment here
Force-pushed from cd5f1c0 to 7c594dc
I ran it with the 8M row limit and got roughly the same as last night. Running it again. I'm 99% sure the sink is the bottleneck. For one run I accidentally ran it without calling slurp, and the whole scan took 6s. I think it's all sort of in the noise. It's possible that the scan at 1<<18 rows is a hair slower, but it doesn't seem statistically significant. Anyway, I intend to make it use something like 16MB when we get the size limit, at which point it should be better.

I think I'd like to merge this given it doesn't seem egregiously worse.
export: 8m12.375818885s
export: 9m43.823838146s
export: 9m17.891210167s
export: 8m34.513046008s
8<<20: 8m37.686091157s
1<<20: 9m3.046564286s
1<<18: 8m28.071550859s
1<<18: 9m49.689186991s
1<<18: 11m20.302953236s
Reviewable status: complete! 0 of 0 LGTMs obtained (and 1 stale)
pkg/ccl/changefeedccl/poller.go, line 453 at r1 (raw file):
Previously, danhhz (Daniel Harrison) wrote…
nit: i'd think "changefeed backfill" would be more evocative
Done.
pkg/ccl/changefeedccl/poller.go, line 478 at r1 (raw file):
Previously, danhhz (Daniel Harrison) wrote…
i'd still keep the decode+AddKV part as a separate method for readability but ymmv
Done.
pkg/ccl/changefeedccl/poller.go, line 488 at r1 (raw file):
Previously, danhhz (Daniel Harrison) wrote…
looks like you lost a (somewhat important) comment here
Done.
This PR is motivated by the desire to get the memory usage of CDC under control in the presence of much larger ranges. Currently, when a changefeed decides it needs to do a backfill, it breaks the spans up along range boundaries and then fetches the data (with some parallelism) for the backfill. The memory overhead was somewhat bounded by the range size. If we want to make the range size dramatically larger, the memory usage would become a function of that new, much larger range size. Fortunately, we don't have much need for these `ExportRequest`s any more.

Another fascinating revelation of late is that the `ScanResponse` does indeed include MVCC timestamps (not that we necessarily needed them, but it's a good idea to keep them for compatibility).

The `ScanRequest` currently permits a limit on `NumRows`, which this commit utilizes. I wanted to get this change typed in anticipation of cockroachdb#44341, which will provide a limit on `NumBytes`. I retained the existing parallelism, as ScanRequests with limits are not parallelized.

I would like to do some benchmarking, but I feel pretty okay about the testing we have in place already. @danhhz what do you want to see here?

Relates to cockroachdb#39717.

Release note: None.
Force-pushed from 7c594dc to 78a3ee8
As long as the performance is in the same ballpark, it's fine by me. Those numbers look okay, merge away!

Flaked on #41941

TFTR! bors r=danhhz
44663: changefeedccl: use ScanRequest instead of ExportRequest during backfills r=danhhz a=ajwerner

This PR is motivated by the desire to get the memory usage of CDC under control in the presence of much larger ranges. Currently, when a changefeed decides it needs to do a backfill, it breaks the spans up along range boundaries and then fetches the data (with some parallelism) for the backfill. The memory overhead was somewhat bounded by the range size. If we want to make the range size dramatically larger, the memory usage would become a function of that new, much larger range size. Fortunately, we don't have much need for these `ExportRequest`s any more. Another fascinating revelation of late is that the `ScanResponse` does indeed include MVCC timestamps (not that we necessarily needed them, but it's a good idea to keep them for compatibility). The `ScanRequest` currently permits a limit on `NumRows`, which this commit utilizes. I wanted to get this change typed in anticipation of #44341, which will provide a limit on `NumBytes`. I retained the existing parallelism, as ScanRequests with limits are not parallelized. I would like to do some benchmarking, but I feel pretty okay about the testing we have in place already. @danhhz what do you want to see here? Relates to #39717. Release note: None.

44719: sql: add telemetry for uses of alter primary key r=otan a=rohany

Fixes #44716. This PR adds a telemetry counter for uses of the alter primary key command. Release note (sql change): This PR adds collected telemetry from clusters upon using the alter primary key command.

44721: vendor: bump pebble to 89adc50375ffd11c8e62f46f1a5c320012cffafe r=petermattis a=petermattis

* db: additional tweak to the sstable boundary generation
* db: add memTable.logSeqNum
* db: force flushing of overlapping queued memtables during ingestion
* tool: lsm visualization tool
* db: consistently handle key decoding failure
* cmd/pebble: fix lint for maxOpsPerSec's type inference
* tool: add "find" command
* cmd/pebble: fix --rate flag
* internal/metamorphic: use the strict MemFS and add an operation to reset the DB
* db: make DB.Close() wait for any ongoing deletion of obsolete files
* sstable: encode varints directly into buf in blockWriter.store
* sstable: micro-optimize Writer.addPoint()
* sstable: minor cleanup of Writer/blockWriter
* sstable: micro-optimize blockWriter.store

Fixes #44631

Release note: None

Co-authored-by: Andrew Werner <ajwerner@cockroachlabs.com>
Co-authored-by: Rohan Yadav <rohany@alumni.cmu.edu>
Co-authored-by: Peter Mattis <petermattis@gmail.com>
Build succeeded