Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

s3manager Uploader returns URL encoded Location for bigger files (multipart upload) #1385

Closed
mtestrot opened this issue Jul 3, 2017 · 15 comments
Labels
feature-request A feature should be added or improved.

Comments

@mtestrot
Copy link

mtestrot commented Jul 3, 2017

Please fill out the sections below to help us address your issue.

Version of AWS SDK for Go?

v1.10.6

Version of Go (go version)?

go1.8 windows/amd64

What issue did you see?

The upload function from the s3manager returns an URL encoded location (UploadOutput.location) if a bigger file (>5MB) is uploaded. For smaller files (<5MB) the location is not URL encoded which in my opinion is the right behaviour.
I guess the behaviour is different for multipart uploads and singlepart uploads.

Steps to reproduce

// filesize = 5 * 1024 * 1024 Byte
uploader = s3manager.NewUploader(sess)
output, err := uploader.Upload(&s3manager.UploadInput{
Bucket: aws.String("mybucket"),
Key: aws.String("1234"),
Body: bufio.NewReader(file),
})
fmt.Println(output.Location) // should NOT be URL encoded

@xibz xibz added the feature-request A feature should be added or improved. label Jul 3, 2017
@xibz
Copy link
Contributor

xibz commented Jul 3, 2017

Hello @mtestrot, thank you for reaching out to us. We are aware of this issue and currently have it in our backlog, #645. We are more than willing to take PRs too! I will go ahead and bring this up in our next team meeting. For now, I'll mark this as a feature request.

@mtestrot
Copy link
Author

mtestrot commented Jul 4, 2017

Ok. Thanks for your quick reply. I didn't find the existing issue, so sorry for posting it again.

@xibz
Copy link
Contributor

xibz commented Jul 5, 2017

@mtestrot - What is your use case for the Location field? Are you doing anything with that field?

@mtestrot
Copy link
Author

mtestrot commented Jul 6, 2017

I store and pass around the location as a reference to retrieve the file later on. So everything the using application has to know is the reference in order to Http GET the object. For the using application It's unnecessary to know the construction rule for the url and it's unnecessary to know that the object is stored in S3.

@xibz
Copy link
Contributor

xibz commented Jul 6, 2017

@mtestrot - Other than the inconsistency of the escaping, does the URL work?

@mtestrot
Copy link
Author

mtestrot commented Jul 7, 2017

Yes. The URL works as soon as we do a string replace before passing it around:
reference = strings.Replace(output.Location,"%2F","/",-1)

Of course there will be a Problem if the URL contains a part which has to be URL encoded. For example if a path-segment from the S3 Object Key is build from a URL. But this is currently no use case for us. So the workaround works.

@xibz
Copy link
Contributor

xibz commented Jul 7, 2017

@mtestrot - Thank you for the information. I will go ahead and try to reproduce this on my end as well. Can you provide a code snippet trying to reuse the Location value? In particular, the section that makes the request and the section that replaces the %2F.

@mtestrot
Copy link
Author

Hi,

I gathered some code for an isolated test case:

package awsissue

import (
	"os"
	"time"
	"fmt"
	"testing"
	"io/ioutil"
	"log"
	"math/rand"
	"github.com/aws/aws-sdk-go/aws"
	"bufio"
	"github.com/aws/aws-sdk-go/service/s3/s3manager"
	"github.com/aws/aws-sdk-go/aws/session"
	"github.com/aws/aws-sdk-go/aws/signer/v4"
	"net/http"
)

var (
	sess     = session.Must(session.NewSession())
	uploader = s3manager.NewUploader(sess)
)

func NewTempFile(size int, parentDir string, blockSize int) string {
	tmpfile, err := ioutil.TempFile(parentDir, "test")
	if err != nil {
		log.Fatal(err)
	}

	b := make([]byte, blockSize)
	for i := 0; i < size; i++ {
		for x := 0; x < blockSize; x++ {
			b[x] = byte(rand.Int())
		}

		if _, err := tmpfile.Write(b); err != nil {
			log.Fatal(err)
		}
	}
	if err := tmpfile.Close(); err != nil {
		log.Fatal(err)
	}

	return tmpfile.Name()
}

func store(filepath string) (string, error) {
	file, err := os.Open(filepath)
	defer file.Close()
	if err != nil {
		return "", err
	}

	output, err := uploader.Upload(&s3manager.UploadInput{
		Bucket: aws.String("bucket"),
		Key:    aws.String(fmt.Sprintf("a/b/%d", time.Now().UnixNano())),
		Body:   bufio.NewReader(file),
	})

	if err != nil {
		return "", err
	}

	reference := output.Location
	// Uncomment the following line to fix the issue
	//reference = strings.Replace(reference, "%2F", "/", -1) // Workaround for https://github.com/aws/aws-sdk-go/issues/1385
	return reference, nil
}

func TestLocation(t *testing.T) {
	// at least 5MB to reproduce the issue
	filepath := NewTempFile(5, "", 1024*1024)
	defer os.Remove(filepath)

	location, err := store(filepath)
	if err != nil {
		t.Fatal(err)
	}

	req, err := http.NewRequest("GET", location, nil)
	if err != nil {
		t.Fatal(err)
	}
	signer := v4.NewSigner(sess.Config.Credentials)
	_, err = signer.Sign(req, nil, "s3", *sess.Config.Region, time.Now())
	if err != nil {
		t.Fatal(err)
	}
	client := &http.Client{}
	resp, err := client.Do(req)
	if err != nil {
		t.Fatal(err)
	}
	if resp.StatusCode != 200 {
		t.Fatalf("Status was %d", resp.StatusCode)
	}
}

@xibz
Copy link
Contributor

xibz commented Jul 17, 2017

@mtestrot thank you for the example. I'll go ahead and discuss with the other SDKs to have some consistency on dealing with this issue.

@xibz
Copy link
Contributor

xibz commented Jul 20, 2017

@mtestrot - Taking a look at the code that you provided, can you instead just call get object instead of trying to build the request yourself? Why is it that you are trying to do this? What's the specific use case?

@xibz xibz added the response-requested Waiting on additional info and feedback. Will move to "closing-soon" in 7 days. label Jul 20, 2017
@mtestrot
Copy link
Author

mtestrot commented Aug 8, 2017

Hi,
sorry for the late response. I have been out of office.

The reason I'm building the request by myself is the reduced dependency for the GET part of the TestLocation function. In production the program which stores the S3 object is different from the program which retrieves/gets the object. So the GET part - apart from the signer - has no dependency on the sdk.

@jasdel jasdel removed the response-requested Waiting on additional info and feedback. Will move to "closing-soon" in 7 days. label Nov 2, 2017
@philip-firstorder
Copy link

I also got this problem, see here:

Original key I use on the uploader is this:
5bacdea9f893e03936044e3a/2018-09-27 - archive.zip

On s3 it gets saved to this:
5bacdea9f893e03936044e3a/2018-09-27+-+archive.zip

Smaller files return this (spaces are encoded):
5bacdea9f893e03936044e3a/2018-09-27%20-%20archive.zip

Large files return this (slash is encoded):
5bacdea9f893e03936044e3a%2F2018-09-27+-+archive.zip

@ktravelet
Copy link

Just bumped into this as well. We are saving backups to S3. I'm not sure what the threshold is but very small backups (< 1MB) don't encode while our medium (~14 MB) and larger (>100MB) size backups encode.

@jasdel
Copy link
Contributor

jasdel commented Feb 15, 2019

The S3 Upload manager will automatically switch to multipart mode when the content to upload is larger than Uploader.PartSize which defaults to 5 MB.

jasdel added a commit to jasdel/aws-sdk-go that referenced this issue Feb 15, 2019
Updates the Location returned value of S3 Upload's Multipart
UploadOutput type to be consistent with single part upload URL. This
update also brings the multipart upload Location inline with the S3
object URLs created by the SDK

Fix aws#1385
@jasdel
Copy link
Contributor

jasdel commented Feb 15, 2019

I created #2453 PR to address this issue of inconsistent URL locations being returned. The URL value returned is updated to use the same pattern as the Single part upload URL.

jasdel added a commit that referenced this issue Feb 18, 2019
Updates the Location returned value of S3 Upload's Multipart
UploadOutput type to be consistent with single part upload URL. This
update also brings the multipart upload Location inline with the S3
object URLs created by the SDK

Fix #1385
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
feature-request A feature should be added or improved.
Projects
None yet
Development

No branches or pull requests

5 participants