Compilation crashes with undefined symbol: _ZSt28__throw_bad_array_new_lengthv (Fedora 34, gcc 11) #294

Closed
solbes opened this issue May 15, 2021 · 21 comments
@solbes

solbes commented May 15, 2021

After installing pystan and running the 8schools example, compilation crashes with the error message

ImportError: /home/solant/.cache/httpstan/4.4.2/models/sk7xw5y6/stan_services_model_sk7xw5y6.cpython-37m-x86_64-linux-gnu.so: undefined symbol: _ZSt28__throw_bad_array_new_lengthv

I'm running Fedora 34 and tried both the most recent Python version and 3.7.10. Any idea what's wrong? I couldn't find anything by googling the above error.

@solbes solbes added the bug label May 15, 2021
@ahartikainen
Contributor

Hi, what compiler (version) do you have?

@solbes
Author

solbes commented May 15, 2021

gcc -v says I have gcc version 11.1.1 20210428 (Red Hat 11.1.1-1) (GCC)

@solbes
Author

solbes commented May 15, 2021

The same happens with gcc version 11.0.1 ... I was also able to reproduce it on another machine.

@riddell-stan
Contributor

You certainly should be able to get things working if you compile httpstan yourself (rather than using the PyPI wheel).

It's really unfortunate things do not work cleanly on the most recent version of Fedora. Makes me worried that gcc 11 might have introduced some problems that will eventually affect Debian-based linuxes.

@riddell-stan riddell-stan changed the title Compilation crashes with undefined symbol: _ZSt28__throw_bad_array_new_lengthv Compilation crashes with undefined symbol: _ZSt28__throw_bad_array_new_lengthv (Fedora 34, gcc 11) May 15, 2021
@solbes
Author

solbes commented May 15, 2021

> You certainly should be able to get things working if you compile httpstan yourself (rather than using the PyPI wheel).
>
> It's really unfortunate things do not work cleanly on the most recent version of Fedora. Makes me worried that gcc 11 might have introduced some problems that will eventually affect Debian-based linuxes.

Thanks for a quick reply! I'll try compiling myself next...

@riddell-stan
Contributor

@solbes Looks like pystan works when run on the fedora 34 docker image (using GitHub Actions' runner). Have you installed a new gcc or are you using something other than x86-64?

@solbes
Author

solbes commented May 17, 2021

> @solbes Looks like pystan works when run on the fedora 34 docker image (using GitHub Actions' runner). Have you installed a new gcc or are you using something other than x86-64?

Interesting... I haven't done any separate gcc installations, I'm using version 11.1.1. I have two machines, one AMD and one Intel, both have the same issue. I switched to CmdStanPy for now and got things running.

@riddell-stan
Contributor

Thanks for the update. I'll spin up a fedora 34 VM and see if I can find out what is wrong.

@riddell-stan
Contributor

riddell-stan commented May 17, 2021

I'm unable to reproduce this.

  • Eight schools works on Fedora 34, x86_64 architecture. I used the distribution's Python (3.9.4) in a venv.

I also re-ran things after doing a yum upgrade (which got me to Python 3.9.5). Eight schools works.

The error you're getting looks like a libstdc++ problem. I did a bit of searching and found this, JuliaLang/julia#40703 . Perhaps there's some connection?
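
One way to see why the error points at libstdc++ is to demangle the missing symbol. The sketch below assumes `c++filt` (part of GNU binutils) is installed; it only decodes the name and does not inspect any library.

```shell
# Demangle the symbol from the ImportError to see which C++ function is missing.
symbol='_ZSt28__throw_bad_array_new_lengthv'
if command -v c++filt >/dev/null 2>&1; then
    demangled=$(c++filt "$symbol")
else
    demangled='(c++filt not installed)'
fi
echo "$demangled"
# With c++filt available this prints: std::__throw_bad_array_new_length()
```

The `std::` prefix indicates a function that lives in the C++ standard library itself, which is consistent with a libstdc++ version mismatch rather than a problem in the generated model code.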

@solbes
Author

solbes commented May 18, 2021

Thanks for the effort! Really appreciated. I’ll try to dig deeper at some point...

@QuLogic

QuLogic commented May 18, 2021

I don't use this, so this is a drive-by comment while I'm looking for the problem.

I see you used Python 3.7, which is not the default on Fedora 34, but your version of gcc is from Fedora. Are you using a conda environment? The libstdc++.so in conda is older than the one in Fedora, as far as I could find.
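
A quick way to answer that question is to look at the interpreter's prefix. This is a rough sketch: it relies on conda environments containing a `conda-meta/` directory, and the helper names `is_conda_prefix` and `bundled_libstdcxx` as well as the use of `python3` are assumptions made for illustration.

```shell
# Return success if the given Python prefix looks like a conda environment.
# conda keeps its package records in a conda-meta/ directory inside each environment.
is_conda_prefix() {
    [ -d "$1/conda-meta" ]
}

# List any libstdc++ copies shipped inside a prefix; these can take
# precedence over the system library when Python loads extension modules.
bundled_libstdcxx() {
    ls "$1"/lib/libstdc++.so* 2>/dev/null
}

# Usage: inspect the interpreter that is first on PATH (python3 is an assumption).
if command -v python3 >/dev/null 2>&1; then
    prefix=$(python3 -c 'import sys; print(sys.prefix)')
    if is_conda_prefix "$prefix"; then
        echo "conda environment: $prefix"
        bundled_libstdcxx "$prefix" || true
    else
        echo "not a conda environment: $prefix"
    fi
fi
```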

@solbes
Author

solbes commented May 20, 2021

> I don't use this, so this is a drive-by comment while I'm looking for the problem.
>
> I see you used Python 3.7, which is not the default on Fedora 34, but your version of gcc is from Fedora. Are you using a conda environment? The libstdc++.so in conda is older than the one in Fedora, as far as I could find.

I tried different Python versions; 3.7 was just the last one I tested. I am using conda, so that's a good guess!

@solbes
Author

solbes commented May 20, 2021

@QuLogic indeed outside conda things work! Thanks a lot. FYI @riddell-stan

@joshua-cogliati-inl

joshua-cogliati-inl commented Jun 29, 2021

I think this is the GCC commit that is causing this problem: gcc-mirror/gcc@f92a504

And basically, conda has libstdc++.so.6.0.28 but fedora 34 has libstdc++.so.6.0.29:

objdump -T /usr/lib64/libstdc++.so.6.0.29  | grep throw_bad_array
00000000000a423f g    DF .text  0000000000000035  GLIBCXX_3.4.29 _ZSt28__throw_bad_array_new_lengthv
00000000000a14ee g    DF .text  0000000000000035  CXXABI_1.3.8 __cxa_throw_bad_array_new_length
00000000000a13d3 g    DF .text  0000000000000033  CXXABI_1.3.8 __cxa_throw_bad_array_length

 objdump -T libstdc++.so.6.0.28 | grep throw_bad_array
00000000000a9e5d g    DF .text  000000000000002f  CXXABI_1.3.8 __cxa_throw_bad_array_new_length
00000000000a9dd0 g    DF .text  000000000000002f  CXXABI_1.3.8 __cxa_throw_bad_array_length
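
In the first listing the missing symbol is tagged `GLIBCXX_3.4.29`, so comparing the highest `GLIBCXX_*` version tag in each library gives a quick read on the mismatch. The helper below (`max_glibcxx` is a made-up name) is only a heuristic: it scans the raw bytes of the file for version strings and assumes GNU `grep` and `sort -V`. `objdump -T` remains the authoritative check for a specific symbol.

```shell
# Print the highest GLIBCXX_x.y[.z] version tag embedded in a shared library.
# -a treats the binary as text, -o prints only the matching strings,
# and sort -V orders them as version numbers (3.4.9 < 3.4.10).
max_glibcxx() {
    grep -aoE 'GLIBCXX_[0-9]+(\.[0-9]+)+' "$1" | sort -uV | tail -n 1
}

# Usage (paths as in the listings above; adjust to your system):
# max_glibcxx /usr/lib64/libstdc++.so.6.0.29
# max_glibcxx ~/miniconda3/lib/libstdc++.so.6.0.28
```

If the library inside the conda environment reports a lower tag than the system one, code built with the system compiler may reference symbols the environment's library does not provide.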

A temporary workaround until conda gets an updated libstdc++ is to go into your miniconda directory and force it to use the system libstdc++:

cd  ~/miniconda3/lib/
rm libstdc++.so libstdc++.so.6
ln -s /usr/lib64/libstdc++.so.6.0.29 libstdc++.so
ln -s /usr/lib64/libstdc++.so.6.0.29 libstdc++.so.6

@jjerphan

jjerphan commented Jul 12, 2021

@joshua-cogliati-inl's workaround works.

It's probably better to make the change inside the conda environment you are actually using, and to back up the existing shared object first so the change can be reverted:

cd $CONDA_PREFIX/envs/your_env/lib

# Verify that `_ZSt28__throw_bad_array_new_lengthv` is missing in the target shared object
objdump -T libstdc++.so.6.0.28 | grep throw_bad_array

# Back-up the target shared object
mv libstdc++.so.6.0.28 libstdc++.so.6.0.28.old

# Point the old file name at the system's libstdc++
ln -s /usr/lib64/libstdc++.so.6.0.29 libstdc++.so.6.0.28
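
Because the original library is kept as `libstdc++.so.6.0.28.old`, the change can be undone later, for example once conda ships a newer libstdc++. The following is a minimal sketch of that reverse step; the function name `restore_libstdcxx` is hypothetical and the file names simply follow the commands above.

```shell
# Undo the workaround: remove the symlink and move the backed-up library back.
# Run from the environment's lib directory.
restore_libstdcxx() {
    lib=${1:-libstdc++.so.6.0.28}
    # Only proceed if the current file is a symlink and a backup exists.
    if [ -L "$lib" ] && [ -e "$lib.old" ]; then
        rm "$lib" && mv "$lib.old" "$lib"
    else
        echo "nothing to restore for $lib" >&2
        return 1
    fi
}

# Usage:
# cd $CONDA_PREFIX/envs/your_env/lib && restore_libstdcxx
```

The guard refuses to touch the file unless it is a symlink with a backup next to it, so running the function twice does not delete the real library.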

jjerphan added a commit to jjerphan/scikit-learn that referenced this issue Jul 12, 2021
StdVectorSentinel makes a proper life-cycle
management for std::vectors' buffers possible.

Duplication seems needed as fused types
can't be used as attributes.

It's possible to obtain a missing symbol
(`_ZSt28__throw_bad_array_new_lengthv`) at
runtime.

This is unrelated to the implementation here,
and there are issues reporting the problem, e.g.:
cython/cython#4218.

A temporary workaround:
stan-dev/pystan#294 (comment)
@xhochy

xhochy commented Dec 8, 2021

It seems people still arrive at this issue from Google with conda-related compiler problems, so here is a brief explanation of how to fix them.

  • conda environments are fully self-contained, i.e. they are meant to work independently of your operating system.
  • conda ships a different libstdc++ from your system. The version depends on which source you choose for your packages. There are two main distributions that use the conda package manager: Anaconda and conda-forge. They use a similar setup but differ in the versions of the packages they provide.
  • Nowadays (unlike in the issue here), the libstdc++ in conda should be newer than the one on your OS, so you would see a linker error at compile time rather than the ImportError seen here.
  • For this reason the conda distributions come with their own compiler toolchain. You can install it using conda install c-compiler cxx-compiler (or mamba install c-compiler cxx-compiler). This sets the environment variables CC and CXX to the provided compilers and also sets CFLAGS, CXXFLAGS and LDFLAGS to build and link against the libraries in the environment (including its libstdc++).
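
To check whether a build would actually pick up the environment's toolchain, one can test where `CXX` resolves. This is a sketch under the assumption that the conda environment is activated (so `CONDA_PREFIX` is set); the function name `cxx_from_conda_env` is made up for illustration.

```shell
# Return success if the compiler named by $CXX resolves to a path inside $CONDA_PREFIX.
cxx_from_conda_env() {
    [ -n "${CONDA_PREFIX:-}" ] && [ -n "${CXX:-}" ] || return 1
    # command -v resolves a bare command name through PATH, or echoes an absolute path.
    cxx_path=$(command -v "$CXX") || return 1
    case "$cxx_path" in
        "$CONDA_PREFIX"/*) return 0 ;;
        *) return 1 ;;
    esac
}

if cxx_from_conda_env; then
    echo "CXX comes from the active conda environment: $CXX"
else
    echo "CXX is unset or points outside the conda environment"
fi
```

If the second message appears inside an activated environment, the build is likely to use the system compiler, which is the mixed setup discussed in this thread.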

@QuLogic

QuLogic commented Dec 8, 2021

> Nowadays (different to the issue here), the libstdc++ should be newer than the one on your OS. Thus you would see a linker error when compiling compared to the ImportError here.

That is backwards. The standard (C/C++) libraries have strong backwards compatibility. If you have a new one shadowing the system one, things will work with an old compiler from outside the environment. The problem arises when you have an old one in the environment shadowing the new one that the compiler outside the environment is expecting.

@xhochy

xhochy commented Dec 9, 2021

> > Nowadays (different to the issue here), the libstdc++ should be newer than the one on your OS. Thus you would see a linker error when compiling compared to the ImportError here.
>
> That is backwards. The standard (C/C++) libraries have strong backwards compatibility. If you have a new one shadowing the system one, things will work with an old compiler from outside the environment. The problem arises when you have an old one in the environment shadowing the new one that the compiler outside the environment is expecting.

Yes, that is why you will see a linker error if you compile with your system compiler and link against libraries coming from conda(-forge). These will use newer symbols but the system compiler tries to link with the system libstdc++ which doesn't have the symbols.

@Floating-nuaa

Thanks!! I had the same error, though in my case I use Cython to wrap a C++ vector. Dropping the Anaconda environment and using a different Python resolved my problem.

@dom96

dom96 commented Sep 24, 2022

For others running into this that are not using conda: I resolved this by upgrading from Ubuntu 20.04 to Ubuntu 22.04.

@ricklentz

I had the issue after upgrading from Ubuntu 20.04 to Ubuntu 22.04. I solved this by removing torch_scatter, updating conda (conda update --all) so the c-compiler and cxx-compiler versions of anaconda and system matched, and then installing torch_scatter.
