This repository has been archived by the owner on Nov 17, 2023. It is now read-only.

OSError: libopenblas.so.0: cannot open shared object file: No such file or directory in mxnet 1.8.0 #20068

Closed
harupy opened this issue Mar 22, 2021 · 10 comments

harupy commented Mar 22, 2021

Description

mxnet 1.8.0 emits the following error when running import mxnet:

OSError: libopenblas.so.0: cannot open shared object file: No such file or directory

Error Message

(generated from the Dockerfile attached in the To Reproduce section)

+ Step 1/3 : FROM python:3.7
 ---> 7fefbebd95b5
+ Step 2/3 : RUN pip install mxnet
 ---> Running in fc634966f9aa
Collecting mxnet
  Downloading mxnet-1.8.0-py2.py3-none-manylinux2014_x86_64.whl (38.7 MB)
Collecting graphviz<0.9.0,>=0.8.1
  Downloading graphviz-0.8.4-py2.py3-none-any.whl (16 kB)
Collecting numpy<2.0.0,>1.16.0
  Downloading numpy-1.20.1-cp37-cp37m-manylinux2010_x86_64.whl (15.3 MB)
Collecting requests<3,>=2.20.0
  Downloading requests-2.25.1-py2.py3-none-any.whl (61 kB)
Collecting certifi>=2017.4.17
  Downloading certifi-2020.12.5-py2.py3-none-any.whl (147 kB)
Collecting chardet<5,>=3.0.2
  Downloading chardet-4.0.0-py2.py3-none-any.whl (178 kB)
Collecting idna<3,>=2.5
  Downloading idna-2.10-py2.py3-none-any.whl (58 kB)
Collecting urllib3<1.27,>=1.21.1
  Downloading urllib3-1.26.4-py2.py3-none-any.whl (153 kB)
Installing collected packages: urllib3, idna, chardet, certifi, requests, numpy, graphviz, mxnet
+ Successfully installed certifi-2020.12.5 chardet-4.0.0 graphviz-0.8.4 idna-2.10 mxnet-1.8.0 numpy-1.20.1 requests-2.25.1 urllib3-1.26.4
WARNING: You are using pip version 20.3.1; however, version 21.0.1 is available.
You should consider upgrading via the '/usr/local/bin/python -m pip install --upgrade pip' command.
Removing intermediate container fc634966f9aa
 ---> b1c12d9f4376
+ Step 3/3 : RUN python -c "import mxnet"
 ---> Running in 8362fc58b280
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/usr/local/lib/python3.7/site-packages/mxnet/__init__.py", line 23, in <module>
    from .context import Context, current_context, cpu, gpu, cpu_pinned
  File "/usr/local/lib/python3.7/site-packages/mxnet/context.py", line 23, in <module>
    from .base import classproperty, with_metaclass, _MXClassPropertyMetaClass
  File "/usr/local/lib/python3.7/site-packages/mxnet/base.py", line 351, in <module>
    _LIB = _load_lib()
  File "/usr/local/lib/python3.7/site-packages/mxnet/base.py", line 342, in _load_lib
    lib = ctypes.CDLL(lib_path[0], ctypes.RTLD_LOCAL)
  File "/usr/local/lib/python3.7/ctypes/__init__.py", line 364, in __init__
    self._handle = _dlopen(self._name, mode)
+ OSError: libopenblas.so.0: cannot open shared object file: No such file or directory
The command '/bin/sh -c python -c "import mxnet"' returned a non-zero code: 1

(key lines are the ones prefixed with +, which render green in the original diff view)

To Reproduce

Steps to reproduce

  1. Prepare the following Dockerfile:
FROM python:3.7

RUN pip install mxnet
RUN python -c "import mxnet"
  2. Then, run docker build .

What have you tried to solve it?

Environment

We recommend using our script for collecting the diagnostic information with the following command
curl --retry 10 -s https://raw.githubusercontent.com/apache/incubator-mxnet/master/tools/diagnose.py | python3

Environment Information
Architecture:        x86_64
CPU op-mode(s):      32-bit, 64-bit
Byte Order:          Little Endian
Address sizes:       39 bits physical, 48 bits virtual
CPU(s):              8
On-line CPU(s) list: 0-7
Thread(s) per core:  1
Core(s) per socket:  1
Socket(s):           8
Vendor ID:           GenuineIntel
CPU family:          6
Model:               158
Model name:          Intel(R) Core(TM) i9-9980HK CPU @ 2.40GHz
Stepping:            13
CPU MHz:             2400.000
BogoMIPS:            4800.00
L1d cache:           32K
L1i cache:           32K
L2 cache:            256K
L3 cache:            16384K
Flags:               fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ss ht pbe syscall nx pdpe1gb lm constant_tsc rep_good nopl xtopology nonstop_tsc cpuid tsc_known_freq pni pclmulqdq dtes64 ds_cpl ssse3 sdbg fma cx16 xtpr pcid sse4_1 sse4_2 movbe popcnt aes xsave avx f16c rdrand hypervisor lahf_lm abm 3dnowprefetch pti fsgsbase bmi1 avx2 bmi2 erms xsaveopt arat
----------Python Info----------
Version      : 3.7.9
Compiler     : GCC 8.3.0
Build        : ('default', 'Nov 18 2020 14:10:47')
Arch         : ('64bit', 'ELF')
------------Pip Info-----------
Version      : 20.3.1
Directory    : /usr/local/lib/python3.7/site-packages/pip
----------MXNet Info-----------
No MXNet installed.
----------System Info----------
Platform     : Linux-4.19.76-linuxkit-x86_64-with-debian-10.6
system       : Linux
node         : 9bcb86c4d6cf
release      : 4.19.76-linuxkit
version      : #1 SMP Tue May 26 11:42:35 UTC 2020
----------Hardware Info----------
machine      : x86_64
processor    : 
----------Network Test----------
Setting timeout: 10
Timing for MXNet: https://github.com/apache/incubator-mxnet, DNS: 0.0155 sec, LOAD: 0.9034 sec.
Timing for Gluon Tutorial(en): http://gluon.mxnet.io, DNS: 0.1894 sec, LOAD: 0.2668 sec.
Error open Gluon Tutorial(cn): https://zh.gluon.ai, <urlopen error [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: certificate has expired (_ssl.c:1091)>, DNS finished in 0.3720698356628418 sec.
Timing for FashionMNIST: https://apache-mxnet.s3-accelerate.dualstack.amazonaws.com/gluon/dataset/fashion-mnist/train-labels-idx1-ubyte.gz, DNS: 0.0193 sec, LOAD: 0.5068 sec.
Timing for PYPI: https://pypi.python.org/pypi/pip, DNS: 0.0474 sec, LOAD: 0.5751 sec.
Error open Conda: https://repo.continuum.io/pkgs/free/, HTTP Error 403: Forbidden, DNS finished in 0.01907634735107422 sec.
----------Environment----------
Removing intermediate container 9bcb86c4d6cf
@github-actions

Welcome to Apache MXNet (incubating)! We are on a mission to democratize AI, and we are glad that you are contributing to it by opening this issue.
Please make sure to include all the relevant context, and one of the @apache/mxnet-committers will be here shortly.
If you are interested in contributing to our project, let us know! Also, be sure to check out our guide on contributing to MXNet and our development guides wiki.

harupy changed the title from "OSError: libopenblas.so.0: cannot open shared object file: No such file or directory" to "OSError: libopenblas.so.0: cannot open shared object file: No such file or directory in mxnet 1.8.0" on Mar 22, 2021
@szha
Member

szha commented Mar 22, 2021

I think this is due to a change in CD: openblas is no longer statically linked into libmxnet. For now, you can install openblas separately.
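
One way to check this diagnosis without importing mxnet is to ask the dynamic loader for the library directly. The following is a minimal sketch, assuming a Linux system; the helper name can_load is hypothetical, and it relies on the same ctypes.CDLL call that appears in the traceback above. It is only an approximation of what happens when libmxnet.so is loaded, since that library may carry its own search paths.

```python
import ctypes


def can_load(soname: str) -> bool:
    """Return True if the dynamic loader can open the shared library `soname`."""
    try:
        ctypes.CDLL(soname)
    except OSError:
        # Same failure mode as in the traceback: the loader cannot open the file.
        return False
    return True


if __name__ == "__main__":
    # libopenblas.so.0 is the soname named in the error message.
    name = "libopenblas.so.0"
    print(name, "loadable:", can_load(name))
```

If this reports that the library is not loadable, installing the distribution's OpenBLAS package is the workaround suggested above; the exact package name depends on the distribution and is not given in this thread.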

cc @mseth10 @leezu

@leezu
Contributor

leezu commented Mar 22, 2021

@harupy it's unclear what you are doing. You need to provide more details.

@fhieber
Contributor

fhieber commented Mar 22, 2021

We observe the same issue when trying to use pip-installed MXNet 1.8 (ubuntu, CPU): https://github.com/awslabs/sockeye/runs/2161521442?check_suite_focus=true

@leezu
Contributor

leezu commented Mar 22, 2021

@mseth10 @access2rohit please take a look at why the CD didn't package libopenblas.so

tools/pip/setup.py includes instructions for copying libopenblas.so in v1.8.x, v1.x, and master. But apparently that didn't work in v1.8.x.

v1.8.x https://github.com/apache/incubator-mxnet/blob/a0535ddfb0246f53f7b851baf861fc06d3ff48c3/tools/pip/setup.py#L170-L172
v1.x https://github.com/apache/incubator-mxnet/blob/cfa1c890a7ecb8b5e29ff4e90d6784141f09c4cd/tools/pip/setup.py#L164-L166
master https://github.com/apache/incubator-mxnet/blob/4d706e8c19b3354878eda9467b149c0ce1fd6d47/tools/pip/setup.py#L165-L167

I suspect the issue is that libquadmath is mentioned in v1.8.x, which is incorrect.
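
Whether a given wheel actually bundles libopenblas.so can be checked by listing its contents, since a wheel is a zip archive. The sketch below is illustrative only; the function name bundled_libs and its needle parameter are hypothetical and are not part of MXNet or its packaging scripts.

```python
import zipfile
from typing import List


def bundled_libs(wheel_path: str, needle: str = "libopenblas") -> List[str]:
    """List archive members of a wheel whose file name contains `needle`."""
    with zipfile.ZipFile(wheel_path) as wheel:  # a wheel is a plain zip archive
        return [
            name
            for name in wheel.namelist()
            # Match on the file name only, not on directory components.
            if needle in name.rsplit("/", 1)[-1]
        ]
```

Calling bundled_libs on a downloaded copy of mxnet-1.8.0-py2.py3-none-manylinux2014_x86_64.whl (the file name shown in the pip log above) and getting an empty list would be consistent with the library not having been packaged.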

@leezu
Contributor

leezu commented Mar 22, 2021

The problem is that #19514 is missing on v1.8.x

@praneethkv

My container suddenly started failing to build, and on investigation this was the error. I switched back to the previous version, 1.7.0.post2, which works perfectly.

@szha
Member

szha commented Mar 26, 2021

@praneethkv we are working on patching the wheels to fix the problem

@access2rohit
Contributor

I think the issue is fixed now. @praneethkv, can you try again?

@astonzhang
Member

For anyone who encounters the same error after installing mxnet-cu101==1.8.0 or 1.8.0post0 and running import mxnet:

  1. Starting from version 1.8.0, CUDNN and NCCL should be installed as well. https://mxnet.apache.org/versions/master/get_started?platform=linux&language=python&processor=gpu&environ=pip&

  2. So you may install CUDNN and NCCL (NCCL 2):

conda install -c conda-forge cudnn
conda install -c conda-forge nccl   # It gives you nccl 2, nccl 1 won't work
