Vendored library cannot be loaded #29

Closed
stefanv opened this issue Apr 14, 2016 · 13 comments · Fixed by #173

@stefanv

stefanv commented Apr 14, 2016

I used the manylinux docker infrastructure to build:

http://travis-wheels.scikit-image.org/netCDF4-1.2.3.1-cp34-cp34m-manylinux1_x86_64.whl

Upon import, I see:

Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/home/stefan/envs/py3/lib/python3.4/site-packages/netCDF4/__init__.py", line 3, in <module>
    from ._netCDF4 import *
ImportError: libhdf5_hl-9c7ba457.so.10.0.2: cannot open shared object file: No such file or directory

Looking at the vendored libraries, the RPATH is invalid for libnetcdf (not for libhdf5)--possibly has something to do with original libnetcdf library already having an RPATH entry.

Build recipe can be found here:

https://github.com/stefanv/manylinux-builds/blob/build_netcdf/build_netcdfs.sh

It was run on yesterday's version of the manylinux docker image.

/cc @njsmith @matthew-brett

@njsmith
Member

njsmith commented Apr 14, 2016

Another weird thing is that the _netCDF4 extension module has RUNPATH set to point to $ORIGIN/.libs, which is good, but then the vendored libraries like .libs/libnetcdf4-... have RPATH set instead (not RUNPATH).

(The difference between RUNPATH and RPATH: RPATH overrides LD_LIBRARY_PATH, while LD_LIBRARY_PATH overrides RUNPATH. If both are set, RPATH is ignored. RPATH is theoretically deprecated, if we care. The difference doesn't matter much to us, but I guess we should use RUNPATH consistently: it reduces the chance of weird RPATH/RUNPATH interaction issues, it accords with the ld.so maintainers' request to prefer RUNPATH, and I guess letting people use LD_LIBRARY_PATH to monkeypatch our search path is okay in extremis. It shouldn't cause any problems in ordinary usage due to our use of unique mangled sonames.)
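
The precedence described above can be sketched as a small Python model. This is only a simplified illustration of the lookup order for one object's direct dependencies, not the dynamic linker's actual code; the function name `search_order` and the example directories are hypothetical.

```python
def search_order(rpath, runpath, ld_library_path):
    """Return (source, directory) pairs in the order they would be tried.

    Simplified model of the precedence described in the comment above.
    It ignores ld.so.cache details, setuid handling, and RPATH entries
    inherited from whatever loaded this object.
    """
    order = []
    # RPATH is only honoured when the object has no RUNPATH at all.
    if not runpath:
        order += [("RPATH", d) for d in rpath]
    # LD_LIBRARY_PATH sits between RPATH and RUNPATH.
    order += [("LD_LIBRARY_PATH", d) for d in ld_library_path]
    order += [("RUNPATH", d) for d in runpath]
    order.append(("default", "system directories"))
    return order


if __name__ == "__main__":
    # Hypothetical values: an RPATH-only library and a user override directory.
    for source, directory in search_order(["$ORIGIN/."], [], ["/opt/override"]):
        print(source, directory)
```

With an RPATH-only object the override directory is consulted second; with a RUNPATH it would be consulted first, which is the "monkeypatch in extremis" case mentioned above.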

@rmcgibbo
Member

I will take a look at this.

@stefanv
Author

stefanv commented Apr 18, 2016

@rmcgibbo Did you manage to figure out what's happening here?

@rmcgibbo
Member

I didn't get a chance to look at it yet, no.

@matthew-brett
Contributor

I think I've run into the same problem - please correct me if I'm wrong. For h5py wheels I am getting:

Successfully installed h5py-2.6.0 six-1.10.0
Traceback (most recent call last):
  File "../run_tests.py", line 3, in <module>
    import h5py
  File "/venv/lib/python3.4/site-packages/h5py/__init__.py", line 24, in <module>
    from . import _errors
ImportError: libz-a147dcb0.so.1.2.3: cannot open shared object file: No such file or directory

https://s3.amazonaws.com/archive.travis-ci.org/jobs/139365850/log.txt

This only happens on Python 3 (I'm testing 3.4 and 3.5). Earlier in the build log (same link as above), I see:

Grafting: /lib64/libz.so.1.2.3 -> h5py/.libs/libz-a147dcb0.so.1.2.3

This is with the current manylinux1 docker images.

@matthew-brett
Contributor

$ readelf --dynamic _errors.cpython-35m-x86_64-linux-gnu.so | grep -i path
 0x000000000000001d (RUNPATH)            Library runpath: [$ORIGIN/.libs]
$ readelf --dynamic _errors.cpython-35m-x86_64-linux-gnu.so | grep NEEDED
 0x0000000000000001 (NEEDED)             Shared library: [libhdf5-62da5c04.so.10.2.0]
 0x0000000000000001 (NEEDED)             Shared library: [libhdf5_hl-1a1d7c4c.so.10.1.0]
 0x0000000000000001 (NEEDED)             Shared library: [libpthread.so.0]
 0x0000000000000001 (NEEDED)             Shared library: [libc.so.6]
$ ls .libs/
libhdf5-62da5c04.so.10.2.0  libhdf5_hl-1a1d7c4c.so.10.1.0  libz-a147dcb0.so.1.2.3
$ readelf --dynamic .libs/libhdf5-62da5c04.so.10.2.0 | grep NEEDED
 0x0000000000000001 (NEEDED)             Shared library: [librt.so.1]
 0x0000000000000001 (NEEDED)             Shared library: [libz-a147dcb0.so.1.2.3]
 0x0000000000000001 (NEEDED)             Shared library: [libdl.so.2]
 0x0000000000000001 (NEEDED)             Shared library: [libm.so.6]
 0x0000000000000001 (NEEDED)             Shared library: [libc.so.6]
$ readelf --dynamic .libs/libhdf5-62da5c04.so.10.2.0 | grep -i path
$ readelf --dynamic .libs/libhdf5_hl-1a1d7c4c.so.10.1.0| grep NEEDED
 0x0000000000000001 (NEEDED)             Shared library: [libhdf5-62da5c04.so.10.2.0]
 0x0000000000000001 (NEEDED)             Shared library: [librt.so.1]
 0x0000000000000001 (NEEDED)             Shared library: [libz-a147dcb0.so.1.2.3]
 0x0000000000000001 (NEEDED)             Shared library: [libdl.so.2]
 0x0000000000000001 (NEEDED)             Shared library: [libm.so.6]
 0x0000000000000001 (NEEDED)             Shared library: [libc.so.6]
$ readelf --dynamic .libs/libhdf5_hl-1a1d7c4c.so.10.1.0 | grep -i path
 0x000000000000000f (RPATH)              Library rpath: [$ORIGIN/.]

Unfixed at http://nipy.bic.berkeley.edu/scipy_installers/tmp/unfixed/h5py-2.6.0-cp35-cp35m-linux_x86_64.whl

Fixed at http://nipy.bic.berkeley.edu/scipy_installers/tmp/fixed/h5py-2.6.0-cp35-cp35m-manylinux1_x86_64.whl
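
For checking several libraries at once, the `readelf --dynamic` output shown above can be parsed with a few lines of Python. This is a minimal sketch; the helper names `parse_dynamic` and `read_dynamic` are hypothetical, and it assumes the output format matches the lines pasted in this comment.

```python
import re
import subprocess

# Matches lines such as:
#  0x000000000000000f (RPATH)              Library rpath: [$ORIGIN/.]
_ENTRY = re.compile(r"\((NEEDED|RPATH|RUNPATH)\)\s+[^\[]*\[([^\]]*)\]")


def parse_dynamic(text):
    """Extract NEEDED, RPATH and RUNPATH entries from readelf --dynamic output."""
    info = {"needed": [], "rpath": [], "runpath": []}
    for tag, value in _ENTRY.findall(text):
        if tag == "NEEDED":
            info["needed"].append(value)
        else:
            # RPATH/RUNPATH values are colon-separated directory lists.
            info[tag.lower()].extend(p for p in value.split(":") if p)
    return info


def read_dynamic(path):
    """Run readelf on a file and parse the result (requires binutils)."""
    result = subprocess.run(
        ["readelf", "--dynamic", path],
        check=True, capture_output=True, text=True,
    )
    return parse_dynamic(result.stdout)


# Lines copied from the libhdf5_hl output above.
SAMPLE = """\
 0x0000000000000001 (NEEDED)             Shared library: [libhdf5-62da5c04.so.10.2.0]
 0x0000000000000001 (NEEDED)             Shared library: [libz-a147dcb0.so.1.2.3]
 0x000000000000000f (RPATH)              Library rpath: [$ORIGIN/.]
"""

if __name__ == "__main__":
    print(parse_dynamic(SAMPLE))
```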

@matthew-brett
Contributor

I think the problem is the lack of (RPATH) Library rpath: [$ORIGIN/.] in the libhdf5 library.

In a working wheel (at http://nipy.bic.berkeley.edu/manylinux/h5py-2.6.0-cp35-cp35m-manylinux1_x86_64.whl):

$ readelf --dynamic .libs/libhdf5-63ad7a9d.so.10.1.0 | grep -i path
 0x000000000000000f (RPATH)              Library rpath: [$ORIGIN/.]

@matthew-brett
Contributor

It depends on the version of the hdf5 library. If I unpack hdf5 version 1.8.16 and build the wheel, then auditwheel works correctly:

http://nipy.bic.berkeley.edu/manylinux/hdf5-1.8.16-x86_64.tgz

$ auditwheel repair h5py-2.6.0-cp35-cp35m-linux_x86_64.whl -w tmp
Repairing h5py-2.6.0-cp35-cp35m-linux_x86_64.whl
Grafting: /usr/local/lib/libhdf5_hl.so.10.0.2 -> h5py/.libs/libhdf5_hl-9c7ba457.so.10.0.2
Setting RPATH: h5py/.libs/libhdf5_hl-9c7ba457.so.10.0.2 to "$ORIGIN/."
Grafting: /usr/local/lib/libsz.so.2.0.0 -> h5py/.libs/libsz-d415f9c7.so.2.0.0
Grafting: /usr/local/lib/libhdf5.so.10.1.0 -> h5py/.libs/libhdf5-63ad7a9d.so.10.1.0
Setting RPATH: h5py/.libs/libhdf5-63ad7a9d.so.10.1.0 to "$ORIGIN/."
Grafting: /lib64/libz.so.1.2.3 -> h5py/.libs/libz-a147dcb0.so.1.2.3

Note RPATH line for libhdf5 library.

readelf --dynamic .libs/libhdf5-63ad7a9d.so.10.1.0 
...
 0x000000000000000f (RPATH)              Library rpath: [$ORIGIN/.]
...

If I unpack hdf5 1.8.17, it does not work correctly:

http://nipy.bic.berkeley.edu/manylinux/hdf5-1.8.17-x86_64.tgz

$ auditwheel repair h5*whl
Repairing h5py-2.6.0-cp35-cp35m-linux_x86_64.whl
Grafting: /usr/local/lib/libhdf5_hl.so.10.1.0 -> h5py/.libs/libhdf5_hl-1a1d7c4c.so.10.1.0
Setting RPATH: h5py/.libs/libhdf5_hl-1a1d7c4c.so.10.1.0 to "$ORIGIN/."
Grafting: /usr/local/lib/libhdf5.so.10.2.0 -> h5py/.libs/libhdf5-62da5c04.so.10.2.0
Grafting: /lib64/libz.so.1.2.3 -> h5py/.libs/libz-a147dcb0.so.1.2.3

Note lack of RPATH line for libhdf5 library.

$ readelf --dynamic .libs/libhdf5-62da5c04.so.10.2.0 | grep -i path
[no output]

For the 1.8.16 version of hdf5, for which delocation does work, libhdf5 has rpath set:

$ readelf --dynamic libhdf5.so.10.1.0 | grep -i path
 0x000000000000000f (RPATH)              Library rpath: [/usr/local/lib]

For 1.8.17, libhdf5 does not have rpath set:

$ readelf --dynamic /usr/local/lib/libhdf5.so.10.2.0 | grep -i path

Setting an RPATH on the built libhdf5 library makes auditwheel work correctly:

$ patchelf --set-rpath /usr/local/lib usr/local/lib/libhdf5.so.10.2.0
$ auditwheel repair h5*whl
Repairing h5py-2.6.0-cp35-cp35m-linux_x86_64.whl
Grafting: /lib64/libz.so.1.2.3 -> h5py/.libs/libz-a147dcb0.so.1.2.3
Grafting: /usr/local/lib/libhdf5.so.10.2.0 -> h5py/.libs/libhdf5-673c3b5a.so.10.2.0
Setting RPATH: h5py/.libs/libhdf5-673c3b5a.so.10.2.0 to "$ORIGIN/."
Grafting: /usr/local/lib/libhdf5_hl.so.10.1.0 -> h5py/.libs/libhdf5_hl-1a1d7c4c.so.10.1.0
Setting RPATH: h5py/.libs/libhdf5_hl-1a1d7c4c.so.10.1.0 to "$ORIGIN/."
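
The symptom above (a grafted library that depends on another grafted library but received no RPATH) can be detected mechanically. The following is a rough sketch, assuming the dynamic-section data has already been collected into plain dictionaries; the function name and data layout are hypothetical, and the NEEDED list for libz is an assumption because it is not shown in this thread.

```python
def libs_without_search_path(vendored):
    """Return vendored libraries that need a sibling but carry no RPATH/RUNPATH.

    `vendored` maps a file name in .libs/ to a dict with the keys
    'needed', 'rpath' and 'runpath' (each a list of strings).
    A flagged library is only suspicious: whether loading really fails
    also depends on the search paths of whatever loaded it.
    """
    flagged = []
    for name, info in sorted(vendored.items()):
        siblings = [dep for dep in info["needed"] if dep in vendored]
        if siblings and not (info["rpath"] or info["runpath"]):
            flagged.append((name, siblings))
    return flagged


# Data for the hdf5 1.8.17 wheel, taken from the readelf output in this thread.
WHEEL_1_8_17 = {
    "libhdf5-62da5c04.so.10.2.0": {
        "needed": ["librt.so.1", "libz-a147dcb0.so.1.2.3", "libdl.so.2",
                   "libm.so.6", "libc.so.6"],
        "rpath": [], "runpath": [],
    },
    "libhdf5_hl-1a1d7c4c.so.10.1.0": {
        "needed": ["libhdf5-62da5c04.so.10.2.0", "librt.so.1",
                   "libz-a147dcb0.so.1.2.3", "libdl.so.2", "libm.so.6",
                   "libc.so.6"],
        "rpath": ["$ORIGIN/."], "runpath": [],
    },
    # Assumption: libz only depends on system libraries.
    "libz-a147dcb0.so.1.2.3": {"needed": ["libc.so.6"], "rpath": [], "runpath": []},
}

if __name__ == "__main__":
    for name, siblings in libs_without_search_path(WHEEL_1_8_17):
        print(name, "needs", siblings, "but has no RPATH/RUNPATH")
```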

@matthew-brett
Contributor

I think it's reasonable for the libhdf5.so library not to have its own directory on the RPATH, because it loads no libraries from its own path, so although this is a change in libhdf5, I don't think it's a bug in libhdf5.

@ionelmc

ionelmc commented Nov 2, 2017

Is there any workaround for this problem?

@daa
Contributor

daa commented Dec 12, 2018

This may be related: https://bugs.launchpad.net/ubuntu/+source/eglibc/+bug/1253638 (dynamic linker does not use DT_RUNPATH for transitive dependencies). An excerpt from that discussion:

It looks like it's the expected behaviour after all. From "Shared Object Dependencies" section in [1]:
"""
The set of directories specified by a given DT_RUNPATH entry is used to find only the immediate dependencies of the executable or shared object containing the DT_RUNPATH entry. That is, it is used only for those dependencies contained in the DT_NEEDED entries of the dynamic structure containing the DT_RUNPATH entry, itself. One object's DT_RUNPATH entry does not affect the search for any other object's dependencies.
"""
[1] https://refspecs.linuxfoundation.org/elf/gabi4+/ch5.dynamic.html#shobj_dependencies

So it seems that auditwheel should set RUNPATH for grafted libraries.
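
To illustrate why this matters for the wheels in this issue, here is a toy Python model of the lookup. It is a deliberately simplified sketch, not the real loader: it assumes an object's RUNPATH applies only to its own direct dependencies, while RPATH entries of the objects in the loading chain are also consulted as long as the requesting object has no RUNPATH. All names (`Lib`, `search_dirs`, `first_missing`, `build`) and the shortened library names are hypothetical.

```python
import posixpath
from dataclasses import dataclass, field


@dataclass
class Lib:
    name: str
    directory: str
    needed: list = field(default_factory=list)
    rpath: list = field(default_factory=list)
    runpath: list = field(default_factory=list)


def expand(entries, origin):
    """Resolve $ORIGIN relative to the directory holding the object."""
    return [posixpath.normpath(e.replace("$ORIGIN", origin)) for e in entries]


def search_dirs(chain):
    """Directories searched for the direct dependencies of chain[0].

    chain lists the requesting object first, followed by its loaders.
    """
    requester = chain[0]
    dirs = []
    if not requester.runpath:
        # RPATH is inherited along the loading chain.
        for obj in chain:
            if not obj.runpath:
                dirs += expand(obj.rpath, obj.directory)
    # RUNPATH covers only the requester's own NEEDED entries.
    dirs += expand(requester.runpath, requester.directory)
    return dirs


def first_missing(root, vendored):
    """Return the first vendored library that cannot be located, or None."""
    queue = [[root]]
    seen = {root.name}
    while queue:
        chain = queue.pop(0)
        for name in chain[0].needed:
            if name not in vendored or name in seen:
                continue  # system library, or already loaded
            lib = vendored[name]
            if lib.directory not in search_dirs(chain):
                return name
            seen.add(name)
            queue.append([lib] + chain)
    return None


def build(ext_rpath=(), ext_runpath=(), hdf5_rpath=()):
    """Tiny stand-in for the h5py wheel layout: _errors -> libhdf5 -> libz."""
    libz = Lib("libz", "h5py/.libs")
    hdf5 = Lib("libhdf5", "h5py/.libs", ["libz"], rpath=list(hdf5_rpath))
    ext = Lib("_errors", "h5py", ["libhdf5"],
              rpath=list(ext_rpath), runpath=list(ext_runpath))
    return ext, {"libz": libz, "libhdf5": hdf5}


if __name__ == "__main__":
    ext, vendored = build(ext_runpath=["$ORIGIN/.libs"])
    print("RUNPATH on extension only:", first_missing(ext, vendored))
    ext, vendored = build(ext_rpath=["$ORIGIN/.libs"])
    print("RPATH on extension only:", first_missing(ext, vendored))
```

In this model, a RUNPATH on the extension finds libhdf5 but not libz, whereas an RPATH on the extension, or an RPATH on the grafted libhdf5 itself, lets the whole chain resolve.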

@daa
Contributor

daa commented Dec 12, 2018

On the other hand, auditwheel calls patchelf with the --force-rpath option, which should make patchelf write an RPATH entry. However, when RUNPATH is already set on the shared object, patchelf modifies that entry instead and does not add an RPATH entry. This is exactly what happens in the attached h5py wheels:

# download h5py wheel from this issue, unpack
$ unzip h5py-2.6.0-cp35-cp35m-linux_x86_64.whl -d 4
# see RUNPATH is set
$ readelf -a 4/h5py/_errors.cpython-35m-x86_64-linux-gnu.so | grep PATH
 0x000000000000001d (RUNPATH)            Library runpath: [/opt/local/lib:/usr/local/lib]
# change, note used --force-rpath
$ patchelf --set-rpath '$ORIGIN/.libs' --force-rpath 4/h5py/_errors.cpython-35m-x86_64-linux-gnu.so 
# see RUNPATH changed, not RPATH
$ readelf -a 4/h5py/_errors.cpython-35m-x86_64-linux-gnu.so | grep PATH
 0x000000000000001d (RUNPATH)            Library runpath: [$ORIGIN/.libs]
# now clear
$ patchelf --remove-rpath 4/h5py/_errors.cpython-35m-x86_64-linux-gnu.so 
# neither RUNPATH nor RPATH is set
$ readelf -a 4/h5py/_errors.cpython-35m-x86_64-linux-gnu.so | grep PATH
# change rpath
$ patchelf --set-rpath '$ORIGIN/.libs' --force-rpath 4/h5py/_errors.cpython-35m-x86_64-linux-gnu.so 
# see RPATH is set, not RUNPATH
$ readelf -a 4/h5py/_errors.cpython-35m-x86_64-linux-gnu.so | grep PATH
 0x000000000000000f (RPATH)              Library rpath: [$ORIGIN/.libs]

Combining my two messages into a short conclusion: when a RUNPATH entry is already set on the Python extension, auditwheel (via patchelf) rewrites that RUNPATH rather than adding an RPATH, and because RUNPATH is not transitive, the vendored library cannot be loaded. To solve this, auditwheel must first clear the existing entry (patchelf --remove-rpath) before setting the new one.

Related patchelf issue: NixOS/patchelf#94
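
As a possible manual workaround until this is handled in auditwheel, the two-step sequence from the transcript above can be scripted. This is a minimal sketch using only the patchelf flags shown in this thread; the function names are hypothetical and it assumes patchelf is on PATH.

```python
import subprocess


def rpath_commands(path, rpath):
    """Build the patchelf invocations: clear first, then force an RPATH."""
    return [
        # Drop any existing RPATH/RUNPATH so patchelf does not reuse RUNPATH.
        ["patchelf", "--remove-rpath", path],
        # Now --force-rpath results in an RPATH entry, as shown above.
        ["patchelf", "--force-rpath", "--set-rpath", rpath, path],
    ]


def set_rpath(path, rpath):
    """Run the commands; raises CalledProcessError if patchelf fails."""
    for argv in rpath_commands(path, rpath):
        subprocess.run(argv, check=True)


if __name__ == "__main__":
    # Print the commands for the extension module used in the transcript.
    target = "4/h5py/_errors.cpython-35m-x86_64-linux-gnu.so"
    for argv in rpath_commands(target, "$ORIGIN/.libs"):
        print(" ".join(argv))
```

The wheel would need to be unpacked before patching and repacked afterwards; that step is not covered here.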

daa added a commit to daa/auditwheel that referenced this issue Dec 12, 2018
daa added a commit to daa/auditwheel that referenced this issue Jun 26, 2019
daa added a commit to daa/auditwheel that referenced this issue Jun 27, 2019