Distutils produces metadata in unknown encoding #197

Closed
ghost opened this issue May 3, 2014 · 3 comments

ghost commented May 3, 2014

Originally reported by: jaraco (Bitbucket: jaraco, GitHub: jaraco)


In Pull Request 45, Jurko observed that Python 3.1 writes the metadata files in an encoding that depends on the build user's environment. By contrast, Python 2.6 and 2.7, as well as Python 3.2 and later, encode the content as UTF-8.

pkg_resources currently assumes the metadata is UTF-8, so if non-ASCII characters are present and egg_info is run on Python 3.1 or earlier, the resulting metadata will fail to load on Python 3.2+.
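
To make the failure mode concrete, here is a minimal sketch of what happens when a metadata line is written in a locale encoding and later read back as UTF-8. The `load_metadata` helper and the choice of `cp1250` as the build user's encoding are assumptions for illustration only; this is not the pkg_resources code.

```python
# A metadata line containing a non-ASCII character.
author = "Jurko Gospodneti\u0107"
line = "Author: " + author + "\n"

# Assumption: the build machine's locale encoding happens to be cp1250.
legacy_bytes = line.encode("cp1250")
# What Python 2.6/2.7 and 3.2+ are reported to write.
utf8_bytes = line.encode("utf-8")


def load_metadata(raw):
    """Hypothetical reader that, like the consumer described above, assumes UTF-8."""
    return raw.decode("utf-8")


# UTF-8 metadata round-trips cleanly.
assert load_metadata(utf8_bytes) == line

# Metadata written in the locale encoding cannot be decoded as UTF-8.
try:
    load_metadata(legacy_bytes)
except UnicodeDecodeError as exc:
    print("cannot load metadata:", exc.reason)
```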



ghost commented May 3, 2014

Original comment by jaraco (Bitbucket: jaraco, GitHub: jaraco):


Monkey-patch the write_pkg_info method on Python 3.1 DistributionMetadata. Fixes #197
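
For readers unfamiliar with the approach, the following is a rough sketch of such a monkey-patch, not the actual setuptools commit. `DemoMetadata` is a hypothetical stand-in for distutils' `DistributionMetadata` (distutils is not available on recent Python versions), and its original `write_pkg_info` simulates the environment-dependent encoding with a hard-coded `cp1250`.

```python
import os
import tempfile


class DemoMetadata:
    """Hypothetical stand-in for distutils' DistributionMetadata."""

    def __init__(self, author):
        self.author = author

    def write_pkg_file(self, stream):
        # Write a couple of metadata fields to an already opened text stream.
        stream.write("Metadata-Version: 1.0\n")
        stream.write("Author: %s\n" % self.author)

    def write_pkg_info(self, base_dir):
        # Simulates the problematic behaviour: the file encoding comes from
        # the environment (assumed here to be cp1250) rather than UTF-8.
        path = os.path.join(base_dir, "PKG-INFO")
        with open(path, "w", encoding="cp1250") as pkg_info:
            self.write_pkg_file(pkg_info)


def write_pkg_info_utf8(self, base_dir):
    """Replacement that always writes PKG-INFO as UTF-8."""
    path = os.path.join(base_dir, "PKG-INFO")
    with open(path, "w", encoding="utf-8") as pkg_info:
        self.write_pkg_file(pkg_info)


# The monkey-patch itself: swap the method on the class.
DemoMetadata.write_pkg_info = write_pkg_info_utf8


def read_back(author):
    """Write PKG-INFO to a temporary directory and decode it as UTF-8."""
    with tempfile.TemporaryDirectory() as base_dir:
        DemoMetadata(author).write_pkg_info(base_dir)
        with open(os.path.join(base_dir, "PKG-INFO"), "rb") as raw:
            return raw.read().decode("utf-8")
```

With the patch applied, `read_back("Jurko Gospodneti\u0107")` decodes without error, whereas the unpatched stand-in would produce bytes that are not valid UTF-8.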


ghost commented May 4, 2014

Original comment by jurko (Bitbucket: jurko, GitHub: jurko):


Yup, just checked 2.6.6, 2.7.6 & 3.4.4 and they all do correct UTF-8 encoding (although the Python 2 versions require that the data be given as unicode and not str in order for the encoding to be applied).


ghost commented May 12, 2014

Original comment by jurko (Bitbucket: jurko, GitHub: jurko):


I added pull request #52, generalizing the solution for this issue to more Python versions.

Some more background information on this issue:

  • Python 2.x supports writing package metadata given as UTF-8 encoded byte strings, and since Python 2.6 it also supports writing package metadata given as a unicode string (CPython commit 4c683ec4415b3c4bfbc7fe7a836b949cb7beea03)
  • Python 3.x only supports writing package metadata given as a unicode string
  • Python [3.0 - 3.2.2> (3.0 inclusive, 3.2.2 exclusive) does not support writing package metadata containing non-ASCII characters due to a distutils bug
  • Python 3.2.2 fixes the distutils bug (CPython commit fb4d2e6d393e96baac13c4efc216e361bf12c293)

setuptools commit 1cd816bb7c933eecd9d8464e054b21c7d5daf2df works around the non-ASCII character issue for Python version 3.1.

Pull request #52 applies the same workaround for Python version range [3.0 - 3.2.2>.
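
As an illustration of the half-open version range [3.0 - 3.2.2>, a check along these lines could decide whether the workaround applies. This is only a sketch; `needs_pkg_info_workaround` is a hypothetical name and the actual condition used in pull request #52 may differ.

```python
import sys


def needs_pkg_info_workaround(version=None):
    """Return True for versions in [3.0, 3.2.2): 3.0 included, 3.2.2 excluded.

    Hypothetical helper; `version` is a tuple such as (3, 1, 5) and
    defaults to the running interpreter's version.
    """
    if version is None:
        version = sys.version_info[:3]
    # Tuple comparison is lexicographic, which matches version ordering.
    return (3, 0) <= tuple(version) < (3, 2, 2)


# A few versions mentioned in this thread.
for candidate in [(2, 7, 6), (3, 0, 0), (3, 1, 5), (3, 2, 1), (3, 2, 2)]:
    print(candidate, needs_pkg_info_workaround(candidate))
```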

Hope this helps.

Best regards,
Jurko Gospodnetić

ghost added the trivial and bug labels on Mar 29, 2016
ghost closed this as completed on Mar 29, 2016