refactor(bigquery/storage/managedwriter): change AppendRows behavior #4729
Previously, AppendRows() on a managed stream returned one
AppendResult per data row. This change switches the behavior
to return a single AppendResult that tracks the outcome of the
whole set of rows.
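
To make the contract change concrete, here is a minimal, self-contained sketch. It does not use the managedwriter package: `appendResult`, `appendPerRow`, and `appendBatch` are hypothetical stand-ins invented for illustration, and the sketch assumes every row in a call shares the same outcome.

```go
package main

import "fmt"

// appendResult is a hypothetical stand-in for the library's AppendResult:
// a handle whose ready channel is closed once the outcome is known.
type appendResult struct {
	ready chan struct{}
	err   error
}

func newAppendResult() *appendResult {
	return &appendResult{ready: make(chan struct{})}
}

// resolve records the outcome and unblocks any waiters.
func (r *appendResult) resolve(err error) {
	r.err = err
	close(r.ready)
}

// wait blocks until the result has been resolved, then returns its error.
func (r *appendResult) wait() error {
	<-r.ready
	return r.err
}

// appendPerRow models the old contract: one result (and one channel) per row.
func appendPerRow(rows [][]byte) []*appendResult {
	results := make([]*appendResult, len(rows))
	for i := range rows {
		results[i] = newAppendResult()
	}
	// Assumption: rows sent in one call succeed or fail together,
	// so every per-row result is resolved with the same outcome.
	for _, r := range results {
		r.resolve(nil)
	}
	return results
}

// appendBatch models the new contract: one result for the whole call,
// regardless of how many rows it carries.
func appendBatch(rows [][]byte) *appendResult {
	r := newAppendResult()
	r.resolve(nil)
	return r
}

func main() {
	rows := [][]byte{[]byte("a"), []byte("b"), []byte("c")}
	old := appendPerRow(rows)
	fmt.Println("results under the old contract:", len(old))
	fmt.Println("error under the new contract:", appendBatch(rows).wait())
}
```

Under the old shape a caller had to hold and wait on a slice of results; under the new shape there is a single handle to wait on per call.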
The original per-row contract was designed in the expectation that we'd
consider making batching decisions at a very granular level. However,
at this point it seems reasonable to consider only batching multiple
appends together, not dividing individual batches more granularly.
From stress testing, we know we can currently push 50k RPS on a single stream,
issuing roughly 1.5 batches per sec. With this change, rather than creating
50k AppendResults per sec (and associated channels, etc.), we create only one
AppendResult per batch. Early numbers indicate that this change improves
stress test throughput by 5-15% depending on the metric you're using
(rows/batches/bytes), in addition to compute/memory overhead savings.
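
As a rough back-of-the-envelope check on the allocation savings, the sketch below compares how many result objects each contract creates per second. It takes the stress-test figures quoted above (50k rows per sec, roughly 1.5 batches per sec) as inputs; the function name `resultsPerSecond` is hypothetical, and the numbers are illustrative rather than measured here.

```go
package main

import "fmt"

// resultsPerSecond returns how many result objects are created per second
// under the old per-row contract and the new per-batch contract.
func resultsPerSecond(rowsPerSec, batchesPerSec float64) (perRow, perBatch float64) {
	// Old contract: one result per row. New contract: one result per batch.
	return rowsPerSec, batchesPerSec
}

func main() {
	perRow, perBatch := resultsPerSecond(50000, 1.5)
	fmt.Printf("old contract: %.0f results/sec\n", perRow)
	fmt.Printf("new contract: %.1f results/sec\n", perBatch)
	fmt.Printf("rows covered by each result under the new contract: about %.0f\n", perRow/perBatch)
}
```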
BREAKING CHANGE: managedwriter AppendRows now returns a single AppendResult for the whole append rather than one per row.