
Handle partial write update from GCS #781

Open · wants to merge 1 commit into master
Conversation

@mayanks (Contributor) commented May 12, 2022

The connector supports resuming an upload only from a MAX_BYTES_PER_MESSAGE
boundary. If GCS reports a committed offset that is not a multiple of this
value, we cannot resume.
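The boundary condition described above can be sketched as a simple modulus check. This is an illustrative standalone example, not the connector's actual API; the class and method names are invented here, and MAX_BYTES_PER_MESSAGE is assumed to be 2 MiB, the value mentioned later in this thread.

```java
// Sketch of the alignment check described above. Names are illustrative,
// not taken from GoogleCloudStorageGrpcWriteChannel.
public class ResumeCheck {
    // Assumed chunk size: 2 MiB, per the discussion in this PR.
    static final long MAX_BYTES_PER_MESSAGE = 2L * 1024 * 1024;

    // An interrupted upload can be resumed only if the committed offset
    // reported by GCS lands exactly on a chunk boundary.
    static boolean canResumeFrom(long committedOffset) {
        return committedOffset % MAX_BYTES_PER_MESSAGE == 0;
    }

    public static void main(String[] args) {
        System.out.println(canResumeFrom(4L * 1024 * 1024)); // aligned -> true
        System.out.println(canResumeFrom(5_000_000L));       // not aligned -> false
    }
}
```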

@mayanks (Contributor, Author) commented May 12, 2022

/gcbrun

codecov bot commented May 12, 2022

Codecov Report

Merging #781 (71ecf70) into master (d465885) will decrease coverage by 18.33%.
The diff coverage is 0.00%.

@@              Coverage Diff              @@
##             master     #781       +/-   ##
=============================================
- Coverage     81.23%   62.89%   -18.34%     
+ Complexity     1978     1446      -532     
=============================================
  Files           133      133               
  Lines          8818     8822        +4     
  Branches       1025     1026        +1     
=============================================
- Hits           7163     5549     -1614     
- Misses         1240     2735     +1495     
- Partials        415      538      +123     
Flag             Coverage Δ
integrationtest  62.89% <0.00%> (-0.15%) ⬇️
unittest         ?

Flags with carried forward coverage won't be shown.

Impacted Files Coverage Δ
...doop/gcsio/GoogleCloudStorageGrpcWriteChannel.java 64.46% <0.00%> (-15.09%) ⬇️
...a/com/google/cloud/hadoop/util/AccessBoundary.java 0.00% <0.00%> (-100.00%) ⬇️
...om/google/cloud/hadoop/util/IoExceptionHelper.java 0.00% <0.00%> (-100.00%) ⬇️
...le/cloud/hadoop/io/bigquery/ShardedInputSplit.java 0.00% <0.00%> (-100.00%) ⬇️
...d/hadoop/gcsio/testing/GcsItemInfoTestBuilder.java 0.00% <0.00%> (-100.00%) ⬇️
...bigquery/output/FederatedBigQueryOutputFormat.java 0.00% <0.00%> (-100.00%) ⬇️
...query/output/FederatedBigQueryOutputCommitter.java 0.00% <0.00%> (-100.00%) ⬇️
...gcsio/StorageRequestToAccessBoundaryConverter.java 0.00% <0.00%> (-98.77%) ⬇️
...d/hadoop/util/testing/MockHttpTransportHelper.java 0.00% <0.00%> (-96.67%) ⬇️
...adoop/io/bigquery/DynamicFileListRecordReader.java 0.00% <0.00%> (-93.60%) ⬇️
... and 70 more

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update d465885...71ecf70.

@@ -220,6 +220,13 @@ private WriteObjectResponse doResumableUpload() throws IOException {
    // Only request committed size for the first insert request.
    if (writeOffset > 0) {
      writeOffset = getCommittedWriteSizeWithRetries(uploadId);
      if (writeOffset % MAX_BYTES_PER_MESSAGE != 0) {
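The hunk above is truncated at the new alignment check, so the branch body is not visible here. Purely as an illustration of one way such a branch could be handled (not necessarily what this PR actually does), the resume could be aborted with an IOException when the committed offset is unusable; the class and method names below are invented for this sketch.

```java
import java.io.IOException;

// Illustrative guard for a committed offset that is not chunk-aligned.
public class AlignmentGuard {
    // Assumed chunk size: 2 MiB, per the discussion in this PR.
    static final long MAX_BYTES_PER_MESSAGE = 2L * 1024 * 1024;

    // Returns the offset unchanged when it is resumable, otherwise fails
    // loudly instead of silently uploading from the wrong position.
    static long checkResumeOffset(long writeOffset) throws IOException {
        if (writeOffset % MAX_BYTES_PER_MESSAGE != 0) {
            throw new IOException(
                "Cannot resume upload: committed offset " + writeOffset
                    + " is not a multiple of " + MAX_BYTES_PER_MESSAGE);
        }
        return writeOffset;
    }

    public static void main(String[] args) throws IOException {
        System.out.println(checkResumeOffset(2L * 1024 * 1024)); // aligned: ok
        try {
            checkResumeOffset(3_000_000L); // not aligned: rejected
        } catch (IOException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```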
Contributor:
Please add tests to cover this code path.

Could we get confirmation from the GCS team on this behavior as well?

Contributor:

Also, consider this scenario:

  • The last chunk of data was written and committed successfully, but the ack response errored (maybe a network or client-side error).

This is a valid scenario where the commitOffset isn't a multiple of the chunk size.

mayanks (Contributor, Author):

@veblush In one run of the benchmark we noticed this behavior. Towards the end of the file upload we received a transient error. On trying to resume, we fetched the committed write offset from GCS. It returned a number that was not a multiple of 2 MB (MAX_BYTES_PER_MESSAGE). Since we store uncommitted buffers in 2 MB chunks, we were not able to resume from that offset.

Can you confirm this?

mayanks (Contributor, Author):

Also, consider this scenario

In this case the complete file is written to GCS. So when we retry:

  1. We get the file length as the committed offset.
  2. We don't have a chunk beginning at that offset in our buffer, so
  3. we end up trying to read from the pipe, which returns EOF (0).
  4. This results in uploading a finalized chunk of size 0 to GCS.
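The four-step sequence above can be simulated in isolation. This is an illustrative sketch, not the connector's code: the file size is hypothetical, and where the connector's pipe reportedly surfaces EOF as 0, the standard `java.io.InputStream` contract used here reports EOF as -1, which the sketch then clamps to a 0-byte final chunk.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

// Simulation of a retry after the whole file was already committed.
public class RetryAfterFullCommit {
    // Assumed chunk size: 2 MiB, per the discussion in this PR.
    static final long CHUNK = 2L * 1024 * 1024;

    public static void main(String[] args) throws IOException {
        long fileLength = 5_000_000L;      // hypothetical upload size
        long committedOffset = fileLength; // step 1: GCS reports the full length

        // Step 2: buffered chunks only start at multiples of CHUNK, so no
        // buffered chunk begins at this offset.
        boolean haveChunkAtOffset = committedOffset % CHUNK == 0;
        System.out.println("buffered chunk at offset: " + haveChunkAtOffset);

        // Step 3: the pipe is already drained; a read returns EOF (-1 in
        // the InputStream contract).
        InputStream drainedPipe = new ByteArrayInputStream(new byte[0]);
        int read = drainedPipe.read(new byte[1024]);
        System.out.println("bytes read from pipe: " + read);

        // Step 4: EOF is treated as "no more data", producing a finalized
        // chunk of size 0.
        int finalChunkSize = Math.max(read, 0);
        System.out.println("finalized chunk size: " + finalChunkSize);
    }
}
```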
