
S3 sync does not allow specification of file metadata headers e.g content-encoding #319

Closed
ratpik opened this issue Sep 5, 2013 · 20 comments

Comments

@ratpik

ratpik commented Sep 5, 2013

The `aws s3 sync` command should provide an option to specify headers for the upload request. Right now there doesn't seem to be any way to upload gzipped content and have it carry the appropriate metadata, e.g.:

--add-header='Content-Encoding: gzip'

@onyxfish

onyxfish commented Sep 5, 2013

+1, this makes the s3 tool unusable if you're using CloudFront and need to specify cache headers.

@NV

NV commented Sep 6, 2013

What if I want to sync both gzipped and uncompressed files?

@ratpik
Author

ratpik commented Sep 7, 2013

The two should be synced with separate commands: one with the compression headers and one without.
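A sketch of the two-command approach, assuming the sync filters `--exclude`/`--include` and the header options discussed later in this thread, and that the gzipped assets can be identified by a suffix (`.gz` and the bucket/paths here are purely illustrative):

```shell
# Pass 1: upload only the gzipped assets, tagging them with the header.
aws s3 sync ./build s3://my-bucket \
    --exclude "*" --include "*.gz" \
    --content-encoding gzip

# Pass 2: upload everything else without the header.
aws s3 sync ./build s3://my-bucket \
    --exclude "*.gz"
```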

@lectroidmarc

+1 here too. Uploading gzip files and being able to set "Content-Encoding" and "Cache-Control" is important to us.

@garnaat
Contributor

garnaat commented Sep 19, 2013

We have added a number of new options to the s3 commands such as --content-disposition, --content-encoding, --content-language, --cache-control. Please check out the interactive help page for details.
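For example, a sketch combining several of the new options (bucket, paths, and values are placeholders):

```shell
aws s3 sync ./site s3://my-bucket \
    --content-encoding gzip \
    --cache-control "max-age=3600" \
    --content-disposition inline \
    --content-language en
```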

@garnaat garnaat closed this as completed Sep 19, 2013
@robeson

robeson commented Oct 1, 2013

--content-disposition is listed above but is not in the merged pull request:
#352

And, from what I can tell, it doesn't seem to be working. I can perform copies, but ContentDisposition isn't set in the metadata. I'm doing the following:
aws s3 cp s3://bucket1/path/object s3://bucket1/path/object --content-disposition "attachment"

Am I missing something or is that functionality missing?

@garnaat
Contributor

garnaat commented Oct 1, 2013

The comment for #352 is incorrect. There is a --content-disposition option and it seems to be working correctly for me. How are you determining that it is not set in the metadata? Try doing this:

aws s3api head-object --bucket bucket1 --key path/object1

and see if the content disposition is returned for the object.
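To check just that one field, the CLI's `--query` option (a JMESPath expression) can narrow the output; this is a sketch with the same placeholder bucket and key:

```shell
aws s3api head-object --bucket bucket1 --key path/object1 \
    --query ContentDisposition
```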

@robeson

robeson commented Oct 1, 2013

Thanks for your reply. Yes, that's how I'm checking and it's not there. I get this:

{
    "LastModified": "Tue, 01 Oct 2013 21:03:11 GMT", 
    "AcceptRanges": "bytes", 
    "ETag": "\"...\"", 
    "ContentType": "application/octet-stream", 
    "ContentLength": "15141142"
}

And I'm copying from one bucket to another, as follows:

aws s3 cp s3://bucket1/path/object s3://bucket2/path/object --content-disposition "attachment"

@garnaat
Contributor

garnaat commented Oct 1, 2013

Okay, thanks for the additional info. I'll try to reproduce the problem locally and update here with my results.

@garnaat
Contributor

garnaat commented Oct 1, 2013

I see what's happening.

If you do a cp or mv from a local file to S3, it is doing a PUT operation, basically creating a new object in S3. When creating a new object, you can specify a variety of metadata to be associated with that data. The content-disposition is one example and it seems to be working fine in this context.

When you do a cp or mv from S3 to S3, it is doing a COPY operation. This copies an existing object in S3 to another object in S3. When performing this COPY operation, you can use the x-amz-metadata-directive header to tell S3 whether it should copy the metadata from the original object or replace the metadata with new values provided in the operation.

We are not currently providing a way to set the value of the x-amz-metadata-directive header in the s3 command, so it always uses the default value, which is COPY. So your new object in S3 has exactly the same metadata as the original object, and there is no way to override that.

We should create a separate issue to track this.
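Until then, one workaround is the lower-level `s3api copy-object` command, which does expose the directive; `REPLACE` tells S3 to use the metadata supplied with the copy rather than the source object's (bucket and key names below are placeholders):

```shell
aws s3api copy-object \
    --copy-source bucket1/path/object \
    --bucket bucket2 \
    --key path/object \
    --metadata-directive REPLACE \
    --content-disposition "attachment"
```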

@marianobntz

How about the Vary: Accept-Encoding header? It would be nice to be able to set that too.

@revolunet

Where is the "interactive help page", please?
Can someone confirm that we can set Cache-Control headers using sync? (I need to set expiration headers.)

@revolunet

OK, this works with sync:

aws s3 sync --acl public-read --cache-control "max-age=3600" --expires 2100-01-01T00:00:00Z /path/to/images s3://bucket/images

makmanalp added a commit to cid-harvard/atlas-subnational-api that referenced this issue Jan 21, 2016
because otherwise it keeps the metadata of the old files
aws/aws-cli#319 (comment) COL-819
@makmanalp

Note for posterity: if you have issues with metadata headers not updating on sync, see also #1145.

@bruno-rossi-movile

How can I download a file that is stored with Content-Encoding = gzip?

cat test.json

��zl��i�4fv�����s�|6��C>����+�ݺ>�EDh�0���0��s�mU��R��]�B66Ļ�)�T���}�>@
is impossible to read because the file is gzip-compressed.
I need to sync from a bucket to local, but all files in the bucket are gzipped. How can I download them in a readable form?
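`aws s3 cp` downloads the stored bytes unchanged, so a gzip-encoded object has to be decompressed locally after the sync. A minimal sketch (the file name and JSON content are made up for illustration):

```shell
# In practice this file would come from `aws s3 cp s3://my-bucket/test.json .`;
# here we fabricate an equivalent gzipped file locally.
printf '{"ok": true}' | gzip > test.json

# gunzip expects a .gz suffix, so rename, then decompress in place.
mv test.json test.json.gz
gunzip test.json.gz    # leaves a readable test.json
cat test.json          # prints {"ok": true}
```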

@AlexeyPanda

Hi, how can I view an object's metadata with the AWS CLI? I get output like this, with my keys under "Metadata":

{
    "AcceptRanges": "bytes",
    "ContentType": "text/plain",
    "LastModified": "Tue, 15 Mar 2016 12:38:36 GMT",
    "ContentLength": 230139,
    "ETag": "\"3afd38518d72b0b83fa7102b37cc3c79\"",
    "Metadata": {
        "1": "1"
    }
}

@ptsteadman

ptsteadman commented Jul 25, 2016

I have the same problem as Bruno: aws s3 cp s3://<bucket>/<file> --endpoint 'http://s3.amazonaws.com' . results in a gzipped file. Unzipping the file confirms that it is not corrupt. I tried adding --content-encoding 'gzip' but it did not help.

@monty241

It is great that content-encoding can be set, but transparent on-the-fly zip/unzip would be even better. Now you always have to pre-process the files, whereas in many scenarios it could be done during the sync (and, with twice the threads, maybe even as fast).

@yvele

yvele commented Mar 30, 2017

That's why I created https://github.com/yvele/poosh which allows a metadata configuration file based on glob patterns:

{
  plugins : ["s3"],
  baseDir : "./deploy",
  remote  : "s3-us-west-2.amazonaws.com/my-bucket",

  each: [{
    headers   : { "cache-control": { cacheable: "public" } }
  }, {
    match     : "**/*.{html,css,js}",
    gzip      : true,
    headers   : { "cache-control": { maxAge: "48 hours" } }
  }, {
    match     : "**/*.{jpg,png,gif,ico}",
    gzip      : false,
    headers   : { "cache-control": { maxAge: "1 year" } }
  }, {
    match     : "**/*.html",
    priority  : -1
  }]
}

I wish AWS would add more control over headers when using aws s3 sync.

@akotranza

A Whole Foods acquisition later, this is still an unmitigated disaster. When using the CLI to copy or sync from an S3 source to an S3 destination, it should copy the metadata with no extra parameters or caveats about multipart uploads. How is this too much to ask?
