
Use the raw stream to prevent decoding the response #1435

Merged
dstufft merged 1 commit into pypa:1.5.X from use-raw-stream on Jan 7, 2014

Conversation

dstufft
Member

@dstufft dstufft commented Jan 7, 2014

No description provided.

dstufft added a commit that referenced this pull request Jan 7, 2014
Use the raw stream to prevent decoding the response
@dstufft dstufft merged commit 835e6d7 into pypa:1.5.X Jan 7, 2014
@dstufft dstufft deleted the use-raw-stream branch January 7, 2014 18:29
@tomprince

This seems to cause the exact opposite error on other servers.

@dstufft
Member Author

dstufft commented Feb 6, 2014

Can you give an example?


@tomprince

pip install -U --force-reinstall --no-index --find-links http://data.hybridcluster.net/pip-1435 ujson==1.33

@tomprince

Exception:
Traceback (most recent call last):
  File "/home/tomprince/dev/b00ef9c48c7082c6/lib/python2.7/site-packages/pip/basecommand.py", line 122, in main
    status = self.run(options, args)
  File "/home/tomprince/dev/b00ef9c48c7082c6/lib/python2.7/site-packages/pip/commands/install.py", line 274, in run
    requirement_set.prepare_files(finder, force_root_egg_info=self.bundle, bundle=self.bundle)
  File "/home/tomprince/dev/b00ef9c48c7082c6/lib/python2.7/site-packages/pip/req.py", line 1173, in prepare_files
    self.unpack_url(url, location, self.is_download)
  File "/home/tomprince/dev/b00ef9c48c7082c6/lib/python2.7/site-packages/pip/req.py", line 1320, in unpack_url
    retval = unpack_http_url(link, location, self.download_cache, self.download_dir, self.session)
  File "/home/tomprince/dev/b00ef9c48c7082c6/lib/python2.7/site-packages/pip/download.py", line 587, in unpack_http_url
    unpack_file(temp_location, location, content_type, link)
  File "/home/tomprince/dev/b00ef9c48c7082c6/lib/python2.7/site-packages/pip/util.py", line 621, in unpack_file
    unzip_file(filename, location, flatten=not filename.endswith(('.pybundle', '.whl')))
  File "/home/tomprince/dev/b00ef9c48c7082c6/lib/python2.7/site-packages/pip/util.py", line 491, in unzip_file
    zip = zipfile.ZipFile(zipfp)
  File "/usr/lib/python2.7/zipfile.py", line 766, in __init__
    self._RealGetContents()
  File "/usr/lib/python2.7/zipfile.py", line 807, in _RealGetContents
    raise BadZipfile, "File is not a zip file"
BadZipfile: File is not a zip file

Also:

$ pip install -d . --no-index --find-links http://data.hybridcluster.net/pip-1435 ujson==1.33
$ file ujson-1.33.zip
ujson-1.33.zip: gzip compressed data, from Unix
$ zcat ujson-1.33.zip | file -
/dev/stdin: Zip archive data, at least v2.0 to extract
$ curl http://data.hybridcluster.net/pip-1435/ujson-1.33.zip | file -
/dev/stdin: Zip archive data, at least v2.0 to extract

@dstufft
Member Author

dstufft commented Feb 13, 2014

So, the reason this happens is that your server is gzip-encoding a zip file, and we no longer decode that by default. We stopped decoding because some servers, when you ask them for a .tar.gz file, serve an application/tar file with Content-Encoding: gzip. This probably worked previously because, prior to this change, we didn't send the header that accepts gzip encoding.

Strictly speaking, we should be decoding Content-Encoding, I suppose. We'll have to figure out how to solve this better.
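
The failure described above can be reproduced without a server. The following is a minimal sketch, assuming the server wraps the zip in a gzip Content-Encoding and the client saves the raw stream without decoding it. The archive contents and variable names are made up for illustration and are not taken from pip.

```python
import gzip
import io
import zipfile

# Build a small zip archive in memory (a stand-in for ujson-1.33.zip).
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr("hello.txt", "hello " * 200)
zip_bytes = buf.getvalue()

# What travels on the wire when the server applies Content-Encoding: gzip.
wire_bytes = gzip.compress(zip_bytes)

# Saving the raw stream keeps the gzip wrapper, so the file on disk
# starts with the gzip magic number and is not recognised as a zip.
raw_is_zip = zipfile.is_zipfile(io.BytesIO(wire_bytes))

# Decoding the Content-Encoding first recovers the original archive.
decoded = gzip.decompress(wire_bytes)
decoded_is_zip = zipfile.is_zipfile(io.BytesIO(decoded))

print("raw stream is a zip:", raw_is_zip)
print("decoded body is a zip:", decoded_is_zip)
```

This mirrors the `file` output in the report: the downloaded file is gzip data, and the zip only appears after one layer of decompression.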

@tomprince

I also put a setuptools tar.gz there, and it gets double compressed and so pip chokes on it as well.

@dstufft
Member Author

dstufft commented Feb 13, 2014

Probably the right answer is to revert this, and make pip able to tell if we have a plain tarfile or a tar.gz even though the filename will be .tar.gz. Historically we've just used the filename for this.
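
One way to tell a plain tar from a tar.gz by content rather than by filename is sketched below. It is only an illustration of the idea, not what pip does: `looks_gzipped` and `open_tar_by_content` are hypothetical helper names, and the sketch relies on the standard library's `tarfile.open` with mode `"r:*"`, which detects compression on its own.

```python
import gzip
import io
import tarfile

GZIP_MAGIC = b"\x1f\x8b"


def looks_gzipped(data: bytes) -> bool:
    """Return True if the bytes start with the gzip magic number."""
    return data[:2] == GZIP_MAGIC


def open_tar_by_content(data: bytes) -> tarfile.TarFile:
    """Open a tar archive and let tarfile detect the compression.

    Mode "r:*" probes for a compressed or plain tar stream, so the
    ".tar.gz" in the filename is never consulted.
    """
    return tarfile.open(fileobj=io.BytesIO(data), mode="r:*")


# Build a plain tar in memory, then a gzipped copy of it.
plain = io.BytesIO()
with tarfile.open(fileobj=plain, mode="w") as tf:
    payload = b"print('hi')\n"
    info = tarfile.TarInfo(name="pkg/setup.py")
    info.size = len(payload)
    tf.addfile(info, io.BytesIO(payload))
plain_tar = plain.getvalue()
gzipped_tar = gzip.compress(plain_tar)

# Both variants open the same way, whatever the file happens to be called.
for blob in (plain_tar, gzipped_tar):
    with open_tar_by_content(blob) as tf:
        print(looks_gzipped(blob), tf.getnames())
```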

@dstufft
Member Author

dstufft commented Feb 13, 2014

Grr, that's not completely right either, because it was breaking checksums too. Probably we're going to have to pick which one to support.
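
A plausible reading of the checksum problem, sketched under that assumption: if the published hash covers the .tar.gz bytes, and the HTTP layer transparently gunzips the body, the hash is computed over different bytes and cannot match. The data below is invented, and SHA-256 is used only as an example digest.

```python
import gzip
import hashlib

# Bytes of the file as published (a stand-in for a real .tar.gz sdist).
published = gzip.compress(b"pretend this is a tar archive\n" * 50)
expected = hashlib.sha256(published).hexdigest()

# If the body is decoded in transit, the client ends up hashing the inner bytes.
decoded_body = gzip.decompress(published)

raw_matches = hashlib.sha256(published).hexdigest() == expected
decoded_matches = hashlib.sha256(decoded_body).hexdigest() == expected

print("raw bytes match the published hash:", raw_matches)
print("decoded bytes match the published hash:", decoded_matches)
```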

@tomprince

This is (I think) a fairly vanilla Apache server.

@tomprince

I would err on the side of not supporting silly servers written by people who don't understand what HTTP headers are supposed to mean.

@dstufft
Member Author

dstufft commented Feb 13, 2014

No, your server is doing it correctly. Well, gzipping an already compressed file is pointless, but it's not wrong. We switched to doing it incorrectly to fix things for one group, and that broke things for another group.

Although I may have a different/better answer. If we can get requests to stop sending the Accept-Encoding header for gzip compression, then I think the .tar.gz case this was originally trying to fix will still work, and your server should work fine because it won't be compressing on the fly. I need to see whether we can do that with requests and, if so, whether servers actually honor it.
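
The idea of not advertising gzip support can be sketched as follows. This uses the standard library's `urllib.request` rather than requests so that the example stays self-contained. The function name `build_identity_request` and the URL are hypothetical, and whether a given server honors the header is an assumption that would need checking.

```python
import urllib.request


def build_identity_request(url: str) -> urllib.request.Request:
    """Build a request asking the server not to apply a content coding.

    "identity" in Accept-Encoding means "send the bytes as they are".
    """
    return urllib.request.Request(url, headers={"Accept-Encoding": "identity"})


# Placeholder URL; nothing is sent over the network here.
req = build_identity_request("http://example.invalid/ujson-1.33.zip")

# urllib stores header names in capitalized form, hence "Accept-encoding".
print(req.get_header("Accept-encoding"))
```

With a server that respects the header, the zip would arrive unwrapped, and a .tar.gz would arrive as the same bytes that were published.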

@tomprince

Yeah. I just turned gzip compression off in the directory we are actually using, for .zip files.

@SmileyChris

Yeah, I'm also getting a double-compressed tar.gz with --find-links.

@lock lock bot added the auto-locked Outdated issues that have been locked by automation label Jun 5, 2019
@lock lock bot locked as resolved and limited conversation to collaborators Jun 5, 2019