Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implicit instantiation of TypeID is incompatible with building with -BSymbolic #50762

Open
berolinux opened this issue Aug 9, 2021 · 23 comments
Assignees
Labels
bugzilla Issues migrated from bugzilla mlir:core MLIR Core Infrastructure

Comments

@berolinux
Copy link

Bugzilla Link 51420
Version unspecified
OS Linux
Blocks #51489
CC @gysit,@MaskRay,@joker-eph,@jpienaar,@MaheshRavishankar,@River707,@tstellar
Fixed by commit(s) c3aed0d d5159b9

Extended Description

Building LLVM 13.0-rc1 with MLIR enabled fails:

cd /home/bero/abf/llvm/BUILD/llvm-project-release-13.x/build/tools/mlir/include/mlir/Dialect/Linalg/IR && /home/bero/abf/llvm/BUILD/llvm-project-release-13.x/build/bin/mlir-linalg-ods-yaml-gen /home/bero/abf/llvm/BUILD/llvm-project-release-13.x/mlir/include/mlir/Dialect/Linalg/IR/LinalgNamedStructuredOps.yaml -o-ods-decl=/home/bero/abf/llvm/BUILD/llvm-project-release-13.x/build/tools/mlir/include/mlir/Dialect/Linalg/IR/LinalgNamedStructuredOps.yamlgen.td -o-impl=/home/bero/abf/llvm/BUILD/llvm-project-release-13.x/build/tools/mlir/include/mlir/Dialect/Linalg/IR/LinalgNamedStructuredOps.yamlgen.cpp.inc
YAML:19:16: error: could not parse as an affine map attribute
shape_map: affine_map<()[s0, s1, s2] -> (s0, s1)>
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

@berolinux
Copy link
Author

assigned to @joker-eph

@berolinux
Copy link
Author

still happens in 13.0-rc3

@joker-eph
Copy link
Collaborator

Nicolas: can you look into this?

@gysit
Copy link
Contributor

gysit commented Sep 17, 2021

Hi, I tried to reproduce the problem using the following commands:

git clone https://github.com/llvm/llvm-project.git
cd llvm-project
git checkout tags/llvmorg-13.0.0-rc3 -b llvmorg-13.0.0-rc3
mkdir build
cmake -GNinja ../llvm
-DCMAKE_C_COMPILER=/usr/bin/clang
-DCMAKE_CXX_COMPILER=/usr/bin/clang++
-DLLVM_ENABLE_PROJECTS=mlir
-DLLVM_TARGETS_TO_BUILD=X86
-DMLIR_INCLUDE_INTEGRATION_TESTS=ON
-DLLVM_ENABLE_ASSERTIONS=ON
-DBUILD_SHARED_LIBS=ON
-DLLVM_INCLUDE_UTILS=ON
-DLLVM_BUILD_EXAMPLES=ON
-DLLVM_ENABLE_LLD=ON
-DLLVM_CCACHE_BUILD=ON
-DLLVM_OPTIMIZED_TABLEGEN=ON
-DCMAKE_BUILD_TYPE=Release
ninja all
ninja check-mlir

Building and running the mlir tests works fine for me. I hope I did catch the right commit? Also can you share your cmake configuration in case this is an artifact of a particular setup?

Best
Tobias

@berolinux
Copy link
Author

My failing build is part of a much larger build (all components, all targets). Will try to reproduce with fewer stuff and isolate what's triggering the breakage.

@berolinux
Copy link
Author

Confirmed that it builds for me in the minimal config as well

@joker-eph
Copy link
Collaborator

There is something fishy somewhere, the tool build/bin/mlir-linalg-ods-yaml-gen fails to parse a YAML file that is committed in the repo.
This tool does not have any dependency on other projects (I expect it to only depend on LLVM libSupport actually).
Do you have a cmake invocation that reliably fail for you Bernhard?
If you can also provide more info about the environment (OS, arch).
Thanks!

@berolinux
Copy link
Author

I've narrowed it down -

@berolinux
Copy link
Author

it's not a serious problem after all, it happens if the tools are built with "-Wl,-Bsymbolic".

Simply optimizing a little less and not making the assumption that this is safe fixes the problem.

Still might be worth documenting somewhere. I don't see an obvious reason why -Wl,-Bsymbolic would break this.

@gysit
Copy link
Contributor

gysit commented Sep 20, 2021

Thanks for analyzing.

I can confirm that adding -Wl,-Bsymbolic seems problematic (to CMAKE_MODULE_LINKER_FLAGS and CMAKE_SHARED_LINKER_FLAGS). It still builds for me but almost all mlir tests fail.

I lack the linker / cmake expertise to understand the root cause. Looking at the error messages it could be that type uniquing does not work anymore.

A solution could be documentation or even better adding a check in the cmake file to emit a warning if the flag is set. Let me know if you know a reliable way of doing that?

Tobias

@joker-eph
Copy link
Collaborator

Ah right, -BSymbolic will likely break our TypeID mechanism I believe: we still rely in a few places on template instantiated function having the same address for the entire program.

@MaskRay
Copy link
Member

MaskRay commented Sep 20, 2021

cmake -GNinja -Hllvm -Bout/release -DCMAKE_C_COMPILER=/Stable/bin/clang -DCMAKE_CXX_COMPILER=/Stable/bin/clang++ -DLLVM_ENABLE_PROJECTS=mlir -DLLVM_TARGETS_TO_BUILD=X86 -DLLVM_ENABLE_ASSERTIONS=ON -DBUILD_SHARED_LIBS=ON -DLLVM_ENABLE_LLD=ON -DCMAKE_BUILD_TYPE=Release -DCMAKE_SHARED_LINKER_FLAGS=-Wl,-Bsymbolic # check-mlir-unit fails
With -Wl,-Bsymbolic-functions check-mlir-unit passes.

This is a pointer equality for variables problem, likely related to TypeID
See https://maskray.me/blog/2021-05-16-elf-interposition-and-bsymbolic for details.

(I ruled out the copy relocations because with
-DCMAKE_CXX_FLAGS=-fno-direct-access-external-data -DCMAKE_C_FLAGS=-fno-direct-access-external-data -DCMAKE_SHARED_LINKER_FLAGS=-Wl,-Bsymbolic
there is copy relocation (readelf -r bin/mlir-translate | grep COPY) but check-mlir-unit still fails.
)

Just stick with -Wl,-Bsymbolic-functions.

@MaskRay
Copy link
Member

MaskRay commented Sep 20, 2021

there is copy relocation

typo. there is no copy relocation

@joker-eph
Copy link
Collaborator

This is a pointer equality for variables problem

Right: the templated function returns the address of a local static variable.
The address of the function itself isn't important, only the one it returns (so the static variable)

@joker-eph
Copy link
Collaborator

We can't fix this in time for LLVM 13 release. It is also not a regression from LLVM 12 or 11 I believe.

@joker-eph
Copy link
Collaborator

However we could error at the CMake level if Bsymbolic is set.

@joker-eph
Copy link
Collaborator

Sent https://reviews.llvm.org/D110483 to do just that

@tstellar
Copy link
Collaborator

Sent https://reviews.llvm.org/D110483 to do just that

Should we backport this to the release/13.x branch?

@joker-eph
Copy link
Collaborator

It would be "nice to have", feel free to cherry-pick this!

@tstellar
Copy link
Collaborator

tstellar commented Nov 8, 2021

Merged: d5159b9

@joker-eph
Copy link
Collaborator

The bug isn't fixed in the sense that we don't support it. The commit back ported is just error detection at CMake time.

@tstellar
Copy link
Collaborator

mentioned in issue #51489

@llvmbot llvmbot transferred this issue from llvm/llvm-bugzilla-archive Dec 11, 2021
@asl asl added this to the LLVM 13.0.1 release milestone Dec 12, 2021
@tstellar
Copy link
Collaborator

I'm going to remove this from the Release 13.0.1 Milestone since the release branch portion of this issue has been done.

@tstellar tstellar removed this from the LLVM 13.0.1 release milestone Dec 16, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bugzilla Issues migrated from bugzilla mlir:core MLIR Core Infrastructure
Projects
None yet
Development

No branches or pull requests

6 participants