477 in multipart upload from aws-sdk-go [JIRA: RCS-363] #1314

Open
cbuben opened this issue May 25, 2016 · 1 comment

cbuben commented May 25, 2016

Summary

Riak CS version 2.1.1.

This is a new variant of #490.

Multipart uploads to Riak CS from aws-sdk-go based clients fail with a 477 HTTP response. It appears that Riak CS cannot handle the way aws-sdk-go quotes etags in the CompleteMultipartUpload body.

#490 involves a client using aws-sdk-ruby, which uses &quot; to quote the etag values in the CompleteMultipartUpload body; this broke CompleteMultipartUpload processing.

In #490 a fix was made in 5cba5e6 to handle &quot; specifically.

This new issue involves a client using aws-sdk-go, which uses &#34; to quote the etag values and causes a 477 failure similar to the one originally seen in #490.

Example CompleteMultipartUpload POST from aws-sdk-go:

POST /some-bucket/foo?uploadId=z0D6DrTJSC2DjrF9Xe5W2w%3D%3D HTTP/1.1
Host: x.x.x.x:8080
User-Agent: aws-sdk-go/1.0.2 (go1.6; linux; amd64) S3Manager
Content-Length: 427
Authorization: AWS xxxxxxxxxxxxxxxxxxxx:xxxxxxxxxxxxxxxxxxxxxxxxxxxx
x-amz-date: Tue, 24 May 2016 19:28:25 UTC
Accept-Encoding: gzip

<CompleteMultipartUpload><Part><ETag>&#34;5f363e0e58a95f06cbe9bbc662c5dfb6&#34;</ETag><PartNumber>1</PartNumber></Part><Part><ETag>&#34;5f363e0e58a95f06cbe9bbc662c5dfb6&#34;</ETag><PartNumber>2</PartNumber></Part><Part><ETag>&#34;5f363e0e58a95f06cbe9bbc662c5dfb6&#34;</ETag><PartNumber>3</PartNumber></Part><Part><ETag>&#34;b6d81b360a5672d80c27430f39153e2c&#34;</ETag><PartNumber>4</PartNumber></Part></CompleteMultipartUpload>
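
For context on where the &#34; comes from: Go's standard encoding/xml package escapes a double quote in character data as the numeric reference &#34; rather than the named entity &quot;. The sketch below is not aws-sdk-go code; the type and function names are hypothetical, and it only assumes that the SDK's serializer ends up using the same escaping as encoding/xml.

```go
package main

import (
	"encoding/xml"
	"fmt"
)

// completedPart and completeMultipartUpload are hypothetical types for this
// sketch; they only mirror the shape of the request body shown above.
type completedPart struct {
	ETag       string `xml:"ETag"`
	PartNumber int    `xml:"PartNumber"`
}

type completeMultipartUpload struct {
	XMLName xml.Name        `xml:"CompleteMultipartUpload"`
	Parts   []completedPart `xml:"Part"`
}

// marshalComplete serializes the parts with Go's standard XML encoder.
func marshalComplete(parts []completedPart) (string, error) {
	out, err := xml.Marshal(completeMultipartUpload{Parts: parts})
	if err != nil {
		return "", err
	}
	return string(out), nil
}

func main() {
	// The etag value itself carries literal double quotes, as S3 etags do.
	body, err := marshalComplete([]completedPart{
		{ETag: `"5f363e0e58a95f06cbe9bbc662c5dfb6"`, PartNumber: 1},
	})
	if err != nil {
		fmt.Println("marshal error:", err)
		return
	}
	// The quotes come out as &#34;, not &quot;.
	fmt.Println(body)
}
```

Both &quot; and &#34; are valid XML for the same character, so a server that special-cases only one spelling will miss the other.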

Reproduction

My context for this problem: CloudFoundry BOSH director using https://github.com/pivotal-golang/s3cli to upload large files to Riak CS.

Build https://github.com/pivotal-golang/s3cli and use it to upload a file larger than 5 MB to Riak CS.

$ cat s3cli-riak 
{
  "signature_version": "2",
  "bucket_name": "some-bucket",
  "use_ssl": false,
  "host": "x.x.x.x",
  "port": 8080,
  "ssl_verify_peer": true,
  "credentials_source": "static",
  "access_key_id": "xxxxxxxx",
  "secret_access_key": "xxxxxxxx"
}

$ du -h foo
16M foo

$ s3cli -c s3cli-riak put foo foo
2016/05/25 17:03:12 performing operation put: 477InternalServerError: 477 Internal Server Error
    upload id: bG-UxormTqOM2VSFBZUESg==

Basho-JIRA changed the title from "477 in multipart upload from aws-sdk-go" to "477 in multipart upload from aws-sdk-go [JIRA: RCS-363]" on May 25, 2016

cbuben commented May 26, 2016

FWIW - I'm not even suggesting this is a proper fix (hence no PR), but the following hack does work around the problem:

diff --git a/src/riak_cs_wm_object_upload_part.erl b/src/riak_cs_wm_object_upload_part.erl
index 174733c..332c4d9 100644
--- a/src/riak_cs_wm_object_upload_part.erl
+++ b/src/riak_cs_wm_object_upload_part.erl
@@ -195,7 +195,7 @@ content_types_accepted(RD, Ctx) ->

 parse_body(Body0) ->
     try
-        Body = re:replace(Body0, "&quot;", "", [global, {return, list}]),
+        Body = re:replace(Body0, "&quot;|&#34;", "", [global, {return, list}]),
         {ok, ParsedData} = riak_cs_xml:scan(Body),
         #xmlElement{name='CompleteMultipartUpload'} = ParsedData,
         Nums = [list_to_integer(T#xmlText.value) ||

I am NO erlang/riak/riak-cs/xmerl wizard by any means, but some random observations:

The approach of preprocessing out these quotes prior to XML parsing seems really strange and brittle (this issue itself is an example of the brittleness). I'm not sure if the failure is due to 1) XML parsing not being able to deal with the quotes or 2) XML parsing is fine but the mere presence of the quotes in the etag values breaks later processing. So I'm not 100% clear on the intent of the preprocessing; that approach solves both 1 and 2, but I'm not sure which issue is causing the failure. If the issue is 1, then the question is "why isn't xmerl handling this valid XML?" If the issue is 2, postprocessing the etag values after XML processing seems apropos.
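
To illustrate the "parse first, then postprocess the etag values" idea, here is a minimal sketch in Go rather than in Riak CS's Erlang. It is not a proposed patch; the type and function names are hypothetical. It relies only on the fact that a conforming XML parser decodes &quot;, &#34;, &#x22; and a literal double quote to the same character, so stripping quotes after parsing covers every spelling at once.

```go
package main

import (
	"encoding/xml"
	"fmt"
	"strings"
)

// part and completeRequest are hypothetical types for this sketch.
type part struct {
	ETag       string `xml:"ETag"`
	PartNumber int    `xml:"PartNumber"`
}

type completeRequest struct {
	Parts []part `xml:"Part"`
}

// parseParts lets the XML parser decode any entity spelling first, then
// trims surrounding double quotes from each decoded etag value.
func parseParts(body string) ([]part, error) {
	var req completeRequest
	if err := xml.Unmarshal([]byte(body), &req); err != nil {
		return nil, err
	}
	for i := range req.Parts {
		req.Parts[i].ETag = strings.Trim(req.Parts[i].ETag, `"`)
	}
	return req.Parts, nil
}

func main() {
	// Four ways a client might spell the same quote character.
	for _, q := range []string{"&quot;", "&#34;", "&#x22;", `"`} {
		body := "<CompleteMultipartUpload><Part><ETag>" + q + "abc123" + q +
			"</ETag><PartNumber>1</PartNumber></Part></CompleteMultipartUpload>"
		parts, err := parseParts(body)
		if err != nil {
			fmt.Println("parse error:", err)
			continue
		}
		fmt.Printf("%-7s -> %s\n", q, parts[0].ETag)
	}
}
```

Every variant yields the bare etag abc123, with no per-client regular expression to maintain. Whether xmerl behaves the same way on these inputs is exactly the open question above.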

Thanks!
