Build failure when pre-processor generates non-ASCII characters #95

Closed
meklu opened this issue Oct 29, 2019 · 2 comments

@meklu

meklu commented Oct 29, 2019

With recent versions of GCC, the C pre-processor may emit linemarkers containing non-ASCII characters when a non-English locale is active, because pseudo-filenames such as "<built-in>" get translated. This can result in output like the following when trying to build the project:

meklu@holmes pts/9 » ~/src/python-zstandard (r:0)
% python2 setup.py build
Traceback (most recent call last):
  File "setup.py", line 70, in <module>
    import make_cffi
  File "/home/meklu/src/python-zstandard/make_cffi.py", line 204, in <module>
    ffi.cdef(b'\n'.join(cdeflines).decode('latin1'))
  File "/usr/lib64/python2.7/site-packages/cffi/api.py", line 112, in cdef
    self._cdef(csource, override=override, packed=packed, pack=pack)
  File "/usr/lib64/python2.7/site-packages/cffi/api.py", line 123, in _cdef
    csource = csource.encode('ascii')
UnicodeEncodeError: 'ascii' codec can't encode characters in position 32-33: ordinal not in range(128)

For some reason this problem doesn't happen with Python 3 on the same machine. My suspicion is that something in the Python 3 setup sets a different locale.
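
The failing step in the traceback can be reproduced in isolation. The following is a minimal Python 3 sketch, not code from make_cffi.py or cffi: it takes a sample linemarker like the ones GCC emits under this locale, decodes it as latin1 the way the traceback shows, and then checks whether the result can be encoded as ASCII. The helper name `encodes_as_ascii` is made up for illustration.

```python
# A linemarker as GCC emits it under a Finnish UTF-8 locale ("\u00e4" is "ä").
raw = '# 1 "<sis\u00e4inen>"'.encode("utf-8")

# The traceback shows the preprocessed bytes being decoded as latin1, so the
# two UTF-8 bytes of "ä" turn into two separate non-ASCII characters.
text = raw.decode("latin1")


def encodes_as_ascii(s):
    """Return True if the string survives s.encode('ascii'), else False."""
    try:
        s.encode("ascii")
        return True
    except UnicodeEncodeError:
        return False
```

With the untranslated marker `# 1 "<built-in>"` the same check passes, which is consistent with the build working under an English or C locale.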

To find out what the heck was going on, I ran the following:

meklu@holmes pts/9 » ~/src/python-zstandard (r:0)
% find zstd/ -name '*\.c' -print0 | xargs -0r -I'{}' -n1 -- sh -c 'gcc -Izstd/common/ -Izstd/compress/ -Izstd/decompress/ -Izstd/dictBuilder/ -Izstd/ -E "$1" | grep -P "[\\x80-\\xFF]"' -- '{}'
# 1 "<sisäinen>"
# 1 "<sisäinen>"
/* ~38 lines total of this message saying "<built-in>" */
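
The same check can be done without grep. This is a rough Python 3 sketch, assuming the preprocessor output has already been captured as bytes; the function name `non_ascii_lines` is hypothetical and not part of the project.

```python
def non_ascii_lines(data):
    """Return the lines of `data` (bytes) that contain any byte >= 0x80."""
    # Iterating over a bytes object in Python 3 yields integers.
    return [line for line in data.splitlines()
            if any(byte >= 0x80 for byte in line)]


# Sample input: one translated linemarker followed by an ordinary C line.
sample = b'# 1 "<sis\xc3\xa4inen>"\nint x;\n'
offending = non_ascii_lines(sample)
```

Here `offending` holds only the linemarker line, since `int x;` is pure ASCII.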

Some possible workarounds are as follows:

  • Set LC_ALL=C at build time
  • Add -P to the compiler options in <projroot>/make_cffi.py when using GCC
  • Edit /usr/lib64/python2.7/site-packages/cffi/api.py:123 to read .encode('ascii', 'ignore')
  • Use anything other than GCC

I opted for the second one myself.
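
The first two workarounds could look roughly like this when the pre-processor is invoked from Python. This is only a sketch under assumptions: the function names `c_locale_env`, `build_cpp_args` and `preprocess` are hypothetical, and this is not the code in make_cffi.py. `-E` and `-P` are standard GCC options; `-P` suppresses linemarkers in the output.

```python
import os
import subprocess


def c_locale_env(base=None):
    """Copy an environment mapping and force LC_ALL=C (workaround 1)."""
    env = dict(os.environ if base is None else base)
    env["LC_ALL"] = "C"
    return env


def build_cpp_args(path, include_dirs=(), cc="gcc"):
    """Build a pre-processor command line that omits linemarkers (workaround 2)."""
    args = [cc, "-E", "-P"]
    args += ["-I" + d for d in include_dirs]
    args.append(path)
    return args


def preprocess(path, include_dirs=(), cc="gcc"):
    """Run the pre-processor with both workarounds applied; returns bytes."""
    return subprocess.check_output(build_cpp_args(path, include_dirs, cc),
                                   env=c_locale_env())
```

Either workaround alone should be enough for this particular failure; combining them just guards against other localized text showing up in the output.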

For reference, here's my locale:

meklu@holmes pts/9 » ~/src/python-zstandard (r:0)
% locale
LANG=fi_FI.UTF-8
LC_CTYPE="fi_FI.UTF-8"
LC_NUMERIC="fi_FI.UTF-8"
LC_TIME="fi_FI.UTF-8"
LC_COLLATE="fi_FI.UTF-8"
LC_MONETARY="fi_FI.UTF-8"
LC_MESSAGES="fi_FI.UTF-8"
LC_PAPER="fi_FI.UTF-8"
LC_NAME="fi_FI.UTF-8"
LC_ADDRESS="fi_FI.UTF-8"
LC_TELEPHONE="fi_FI.UTF-8"
LC_MEASUREMENT="fi_FI.UTF-8"
LC_IDENTIFICATION="fi_FI.UTF-8"
LC_ALL=fi_FI.UTF-8

The problem manifests itself with at least zstandard-0.12.0 and current HEAD along with cffi-1.12.3.

@indygreg
Owner

I pushed a change to set LC_ALL=C. I'm unsure if this fixes things, however. Could you please test?

@meklu
Author

meklu commented Jun 15, 2020

That commit fixes the issue on my end. I tried building 228cf82 (0.14.0), fbd77d2, fde84bf (the commit immediately before your fix) and e02ddf9 (0.13.0). Of those, only the first two (the ones which include the fix) were buildable without additional workarounds. Good work!

I do wonder a little why the Python 3 versions don't suffer from the same issue. I'd guess they might clear the environment wholly or partially when spawning processes.
