refactor(bigquery/storage/managedwriter): change AppendRows behavior #4729

shollyman · 2021-09-07T17:32:50Z

Previously, AppendRows() on a managed stream returned one
AppendResult per data row. This change switches the behavior
to return a single AppendResult that tracks the outcome of the
whole set of rows.

The original per-row contract anticipated that we might make
batching decisions at a very granular level. At this point, however,
it seems reasonable to consider only batching multiple appends
together, not dividing individual batches more granularly.

From stress testing, we know we can currently push 50k RPS on a single stream,
issuing roughly 1.5 batches per sec. This change means that rather than creating
50k AppendResults per sec (and the associated channels, etc.), we instead create
only one AppendResult per batch. Early numbers indicate that this change improves
stress test throughput by 5-15%, depending on the metric you're using
(rows/batches/bytes), in addition to saving compute/memory overhead.

BREAKING CHANGE: managedwriter AppendRows now returns a single AppendResult for the whole append rather than one per row.
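
To illustrate the shape of the contract change, here is a minimal sketch. It does not use the real managedwriter API: `appendResult`, `appendPerRow`, and `appendPerBatch` are hypothetical stand-ins, and the "backend" resolves results immediately. It only shows that the per-row contract allocates one result (and one channel) per row, while the per-batch contract allocates one for the whole call.

```go
package main

import "fmt"

// appendResult is a hypothetical stand-in for an AppendResult: a handle
// whose ready channel is closed once the outcome of the append is known.
type appendResult struct {
	ready  chan struct{}
	offset int64
	err    error
}

func newAppendResult() *appendResult {
	return &appendResult{ready: make(chan struct{})}
}

// resolve records the outcome and unblocks any waiters.
func (r *appendResult) resolve(offset int64, err error) {
	r.offset, r.err = offset, err
	close(r.ready)
}

// wait blocks until the result has been resolved.
func (r *appendResult) wait() (int64, error) {
	<-r.ready
	return r.offset, r.err
}

// appendPerRow models the old contract: one result per data row.
func appendPerRow(rows [][]byte, startOffset int64) []*appendResult {
	results := make([]*appendResult, len(rows))
	for i := range rows {
		results[i] = newAppendResult()
		results[i].resolve(startOffset+int64(i), nil) // pretend the write succeeded
	}
	return results
}

// appendPerBatch models the new contract: one result for the whole set of rows.
func appendPerBatch(rows [][]byte, startOffset int64) *appendResult {
	res := newAppendResult()
	res.resolve(startOffset, nil) // pretend the write succeeded
	return res
}

func main() {
	rows := [][]byte{[]byte("a"), []byte("b"), []byte("c")}

	old := appendPerRow(rows, 0)
	fmt.Println("per-row results allocated:", len(old))

	off, err := appendPerBatch(rows, 0).wait()
	fmt.Println("per-batch start offset:", off, "err:", err)
}
```

Under the old shape, callers had to wait on every element of the returned slice; under the new shape, a single wait covers the whole append.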

shollyman added 2 commits September 7, 2021 17:02

Merge branch 'master' into single-appendresult

aa33765

shollyman requested a review from a team September 7, 2021 17:32

shollyman requested a review from a team as a code owner September 7, 2021 17:32

shollyman requested a review from tswast September 7, 2021 17:32

google-cla bot added the cla: yes This human has signed the Contributor License Agreement. label Sep 7, 2021

product-auto-label bot added the api: bigquery Issues related to the BigQuery API. label Sep 7, 2021

shollyman requested a review from codyoss September 7, 2021 17:33

tswast approved these changes Sep 7, 2021

shollyman changed the title ~~BREAKING CHANGE(bigquery/storage/managedwriter): change AppendRows behavior~~ refactor(bigquery/storage/managedwriter): change AppendRows behavior Sep 7, 2021

shollyman merged commit 9c9fbb2 into googleapis:master Sep 8, 2021

shollyman deleted the single-appendresult branch September 8, 2021 16:13

shollyman mentioned this pull request Sep 9, 2021

bigquery: build veneer for bigquery write client #4366

Closed

shollyman mentioned this pull request Sep 23, 2021

chore: release bigquery 1.23.0 #4738

Merged
