Configuring on Alpine Linux wants to build bitarray from source #1262

Closed
sschuberth opened this issue Nov 6, 2018 · 14 comments

@sschuberth
Collaborator

I'm trying to create a minimal Docker image based on Alpine Linux to run ScanCode. In particular, I do not want to build any native packages required by ScanCode from source (as the Docker base image does not contain gcc, and I do not want to use Docker multi-stage builds). Thus, I'm not installing ScanCode via pip, but from the release archive which comes with pre-built binaries for several platforms.

For some reason, running ./configure etc/conf when creating the image does not seem to pick up the bitarray binaries but wants to build them from source:

Installing collected packages: bitarray, intbitset, boolean.py, license-expression, pyahocorasick, Beautifulsoup, Beautifulsoup4, webencodings, html5lib, pycryptodome, pdfminer.six, binaryornot, pygments, pefile, lxml, idna, urllib3, requests, pymaven-patch, schematics-patched, packageurl-python, click, colorama, pluggy, attrs, typing, MarkupSafe, jinja2, simplejson, ply, SPARQLWrapper, pyparsing, isodate, rdflib, spdx-tools, unicodecsv, psutil, zc.lockfile, contextlib2, pytz, tempora, jaraco.timing, yg.lockfile, scancode-toolkit
  Running setup.py install for bitarray: started
    Running setup.py install for bitarray: finished with status 'error'
    Complete output from command /usr/local/scancode-toolkit-2.9.2/bin/python2.7 -u -c "import setuptools, tokenize;__file__='/tmp/pip-build-9By3EB/bitarray/setup.py';f=getattr(tokenize, 'open', open)(__file__);code=f.read().replace('\r\n', '\n');f.close();exec(compile(code, __file__, 'exec'))" install --record /tmp/pip-qQss3M-record/install-record.txt --single-version-externally-managed --compile --install-headers /usr/local/scancode-toolkit-2.9.2/include/site/python2.7/bitarray:
    running install
    running build
    running build_py
    creating build
    creating build/lib.linux-x86_64-2.7
    creating build/lib.linux-x86_64-2.7/bitarray
    copying bitarray/test_bitarray.py -> build/lib.linux-x86_64-2.7/bitarray
    copying bitarray/__init__.py -> build/lib.linux-x86_64-2.7/bitarray
    running build_ext
    building 'bitarray._bitarray' extension
    creating build/temp.linux-x86_64-2.7
    creating build/temp.linux-x86_64-2.7/bitarray
    gcc -fno-strict-aliasing -Os -fomit-frame-pointer -g -DNDEBUG -Os -fomit-frame-pointer -g -DTHREAD_STACK_SIZE=0x100000 -fPIC -I/usr/include/python2.7 -c bitarray/_bitarray.c -o build/temp.linux-x86_64-2.7/bitarray/_bitarray.o
    unable to execute 'gcc': No such file or directory
    error: command 'gcc' failed with exit status 1

Any idea why that is?

PS: I've been reading through several related issues like #636 but these do not seem to provide a final solution.

@sschuberth
Collaborator Author

Inside the Docker container uname -a returns

Linux 38fcbfe18b5f 4.15.0-38-generic #41-Ubuntu SMP Wed Oct 10 10:59:38 UTC 2018 x86_64 Linux

So I'm wondering why ScanCode doesn't use thirdparty/bitarray-0.8.1-cp27-cp27m-manylinux1_x86_64.whl.
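
For context, a wheel's filename encodes the interpreter, ABI and platform it was built for (PEP 427), and pip only considers a wheel whose tags match the running environment. A minimal sketch of reading those tags from the filename above; `parse_wheel_filename` and `WheelTags` are hypothetical names used for illustration, not part of pip or ScanCode:

```python
from collections import namedtuple

# Hypothetical container for the PEP 427 filename components.
WheelTags = namedtuple("WheelTags", "name version python abi platform")


def parse_wheel_filename(filename):
    """Split a wheel filename into its PEP 427 components.

    Illustration only: an optional build tag is skipped, not returned.
    """
    if not filename.endswith(".whl"):
        raise ValueError("not a wheel: %r" % (filename,))
    parts = filename[:-len(".whl")].split("-")
    if len(parts) not in (5, 6):
        raise ValueError("unexpected wheel filename: %r" % (filename,))
    name, version = parts[0], parts[1]
    # The last three components are always python tag, ABI tag, platform tag.
    python, abi, platform = parts[-3:]
    return WheelTags(name, version, python, abi, platform)


tags = parse_wheel_filename("bitarray-0.8.1-cp27-cp27m-manylinux1_x86_64.whl")
print(tags.platform)  # manylinux1_x86_64
```

The `manylinux1` platform tag is defined in terms of glibc (PEP 513), so the kernel reported by `uname -a` alone does not decide whether this wheel is usable.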

@sschuberth
Collaborator Author

sschuberth commented Nov 6, 2018

This is partly addressed by #1263. With that merged, I get

* Configuring ScanCode for first use...
  Could not find a version that satisfies the requirement extractcode-libarchive (from scancode-toolkit==2.9.8->-r /usr/local/scancode-toolkit-sys-platform-startswith/etc/conf/base.txt (line 10)) (from versions: )

No matching distribution found for extractcode-libarchive (from scancode-toolkit==2.9.8->-r /usr/local/scancode-toolkit-sys-platform-startswith/etc/conf/base.txt (line 10))

Failed to execute command:
pip install --upgrade --no-index --no-cache-dir --quiet --find-links="/usr/local/scancode-toolkit-sys-platform-startswith/thirdparty" -r "/usr/local/scancode-toolkit-sys-platform-startswith/etc/conf/base.txt".     Aborting...

@pombredanne
Contributor

@sschuberth @znerd started to work on an Alpine Dockerfile in #636 FWIW

There is no documentation yet on how to port to a new Linux distro or OS but at a high level here is what is needed:

  1. Ensure that all wheels that contain native code are rebuilt from sources. This may not be needed if the Linux distro is supported by the "manylinux" builds... but as far as I know Alpine is built around musl libc rather than glibc, which the "manylinux" wheels are built against. Therefore rebuilding would be needed.

> So I'm wondering why ScanCode doesn't use thirdparty/bitarray-0.8.1-cp27-cp27m-manylinux1_x86_64.whl.

That's what is likely the case here with Alpine. See pypa/packaging-problems#69 and pypa/manylinux#37

  2. Much in the same way, you need other native libraries for typecode and extractcode. These are now made available as plugins that either provide the library or provide a path to a system library. This is tracked in "Distro packages" #487 and "De-vendor prebuilt binaries to ease packaging for Linux distros" #469, and you can also see what is done on the FreeBSD side in "Problems building on FreeBSD" #1147, as there is active work on a port to FreeBSD. You can see the current shape of each "bundled" plugin in /plugins: https://github.com/nexB/scancode-toolkit/tree/develop/plugins

  3. There may be a need to relax and update some of the checks done in etc/configure.py and to ensure that the OS is properly supported in commoncode/system.py

  4. At that stage everything should minimally run, and the next step would be to run the test suite and either fix tests, add or update expectations, or fix code bugs.
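
The "manylinux" point in the first item can be checked from inside the container. The sketch below follows the approach described in PEP 513 (asking the running process for `gnu_get_libc_version`); the function names are my own, and it only covers the glibc part of the manylinux1 rules, not the architecture check:

```python
import ctypes
import re


def glibc_version():
    """Return the glibc version string, or None when not running on glibc."""
    try:
        # CDLL(None) gives access to symbols already loaded in this process.
        process = ctypes.CDLL(None)
        getter = process.gnu_get_libc_version
    except (OSError, AttributeError, TypeError):
        # Symbol missing (e.g. musl libc) or no dlopen-style loader available.
        return None
    getter.restype = ctypes.c_char_p
    version = getter()
    if not isinstance(version, str):
        version = version.decode("ascii")
    return version


def has_manylinux1_glibc():
    """True when glibc is present and at least 2.5, as PEP 513 requires."""
    version = glibc_version()
    if version is None:
        return False
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        return False
    major, minor = int(match.group(1)), int(match.group(2))
    return major == 2 and minor >= 5


print("glibc version:", glibc_version())
print("manylinux1-compatible glibc:", has_manylinux1_glibc())
```

On a musl-based system I would expect `glibc_version()` to return `None`, which is consistent with pip skipping the manylinux1 wheels there.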

@pombredanne
Contributor

pombredanne commented Nov 6, 2018

Now this means that in all cases you are likely to need to compile things from sources to get a container working, and that means having them either pre-built outside of the Dockerfile OR building in the Dockerfile.

In the latter case, since you likely want to remove all the build tools, compiler, etc., the way would likely be to have a single Dockerfile instruction that installs the build environment, installs scancode and removes the build env files, all at once. This is a tad ugly but I see this commonly done in Dockerfiles.
Or you could have many Dockerfile lines and later squash your image layers to get rid of the build/compiler tools installation/removal layers.

@pombredanne
Contributor

@sschuberth for a start, what do you get in Alpine if you were to run these?
(this is what is checked by configure.py and commoncode.system)

python -c "import sys;print('sys.maxsize:', sys.maxsize)"
python -c "import sys;print('sys.platform:', sys.platform)"
python -c "import sys;print('sys.version_info:', sys.version_info)"
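
The same three values can also be collected in one go. This is only a convenience sketch: the function name `interpreter_summary` is made up here, and the `is_64bit` and `libc` entries are additions of mine that the checks in configure.py and commoncode.system may or may not look at:

```python
import platform
import sys


def interpreter_summary():
    """Collect the interpreter facts asked for above, plus libc details."""
    return {
        "sys.maxsize": sys.maxsize,
        "sys.platform": sys.platform,
        "sys.version_info": tuple(sys.version_info),
        # Common idiom: a 64-bit build has a maxsize above 2**32.
        "is_64bit": sys.maxsize > 2 ** 32,
        # (library name, version) as detected by the standard library.
        "libc": platform.libc_ver(),
    }


summary = interpreter_summary()
for key in sorted(summary):
    print("%s: %r" % (key, summary[key]))
```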

@pombredanne
Contributor

pombredanne commented Nov 6, 2018

@sschuberth btw, why Alpine vs Debian or else? is this really smaller and faster?

@pombredanne
Contributor

@sschuberth also what is the $SCANCODE_VERSION you are using?

@sschuberth sschuberth changed the title from "Configuring on Alpine Linux wants to build bitarray form source" to "Configuring on Alpine Linux wants to build bitarray from source" Nov 6, 2018
@sschuberth
Collaborator Author

> update some of the checks done in etc/configure.py

Indeed. Please see #1263.

> why Alpine vs Debian or else? is this really smaller and faster?

Smaller, yes.

> also what is the $SCANCODE_VERSION you are using?

Usually 2.9.2, but right now I'm using my custom branch based on develop.

@sschuberth
Collaborator Author

> either pre-built outside of the Dockerfile OR building in the Dockerfile

I believe that's exactly what multi-stage builds are for. I'll probably give them a look anyway.

@pombredanne
Contributor

@sschuberth thx... I was not aware of multi-stage builds and this seems an ok way to deal with the issue (vs. the older ways). Here I do not see how to work around building from sources on Alpine

@pombredanne
Contributor

@sschuberth some pointers to build the non-wheel binaries from sources:

... should be the scripts to build these.
Or you may be able to use a standard package from Alpine too? or a packaging script from them...
I am here to help you as this is new and uncharted territory.

pombredanne added a commit to aboutcode-org/scancode-thirdparty-src that referenced this issue Nov 7, 2018
Improve build scripts to work without bash
This is to support building an Alpine-based Docker image
See aboutcode-org/scancode-toolkit#1262

Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com>
pombredanne added a commit that referenced this issue Nov 8, 2018
For now these contain placeholder empty .so/.exe

Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com>
pombredanne added a commit that referenced this issue Nov 15, 2018
For now these contain placeholder empty .so/.exe

Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com>
pombredanne added a commit that referenced this issue Nov 15, 2018
Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com>
@sschuberth
Collaborator Author

sschuberth commented Dec 14, 2018

> some pointers to build the non-wheel binaries from sources

Thanks. So even if that were done, am I correct to assume that we'd still need to find a way to build the binaries for the wheels that contain native code? If so, how would that need to be done? Is there a straightforward way to mostly automate that?

Edit: Looking back at my own original post it looks like pip will automatically fall back to building the native wheel from source if manylinux does not work for the current Linux distribution, like Alpine. Is that correct?
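
To make the question concrete, here is how I understand the selection, as a heavily simplified sketch: a wheel is only a candidate when its platform tag is supported, otherwise a source archive is used if one is available, and if neither exists nothing matches. This is not pip's actual code. `choose_distribution`, the `bitarray-0.8.1.tar.gz` filename and the two sets of supported tags are assumptions for illustration, and the Python and ABI tags are ignored:

```python
def choose_distribution(candidates, supported_platforms):
    """Pick a compatible wheel if possible, else a source archive, else None.

    Simplified illustration: only the platform tag of a wheel is checked.
    """
    sdist = None
    for filename in candidates:
        if filename.endswith(".whl"):
            # The platform tag is the last dash-separated component.
            platform_tag = filename[:-len(".whl")].split("-")[-1]
            if platform_tag == "any" or platform_tag in supported_platforms:
                return filename, "wheel"
        elif filename.endswith((".tar.gz", ".zip")):
            sdist = filename
    if sdist is not None:
        # Building from an sdist with native code requires a compiler.
        return sdist, "sdist"
    return None, None


candidates = [
    "bitarray-0.8.1-cp27-cp27m-manylinux1_x86_64.whl",
    "bitarray-0.8.1.tar.gz",  # assumed source archive name
]

# Assumed tag sets for a glibc-based and a musl-based x86_64 system.
glibc_tags = {"manylinux1_x86_64", "linux_x86_64"}
musl_tags = {"linux_x86_64"}

print(choose_distribution(candidates, glibc_tags))
print(choose_distribution(candidates, musl_tags))
print(choose_distribution(candidates[:1], musl_tags))
```

Under these assumptions the first call picks the wheel, the second falls back to the source archive (hence the gcc invocation), and the third finds nothing, much like the "No matching distribution found" message earlier in this thread.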

@sschuberth
Collaborator Author

By now I believe the better approach is to simply install glibc in Alpine rather than compiling native packages for musl-libc, so I'll close this to consolidate the discussion at #636.
