intel-graphics-compiler is not being fetched from the cache and the compilation fails #169729

Closed
leiserfg opened this issue Apr 22, 2022 · 19 comments · Fixed by #171656

Comments

@leiserfg
Contributor

Describe the bug

I use intel-compute-runtime to provide OpenCL to darktable, and as of today (2022-04-22) it is no longer being fetched from the cache.
Also, compiling intel-graphics-compiler (which is a dependency) fails, so I had to remove it altogether.

Steps To Reproduce

Steps to reproduce the behavior:
nix-env -i intel-compute-runtime

Expected behavior

It gets installed from the cache.

Notify maintainers

@gloaming

Metadata

Please run nix-shell -p nix-info --run "nix-info -m" and paste the result.

nix-shell -p nix-info --run "nix-info -m"
 - system: `"x86_64-linux"`
 - host os: `Linux 5.15.32-1-MANJARO, Manjaro Linux, noversion, rolling`
 - multi-user?: `yes`
 - sandbox: `yes`
 - version: `nix-env (Nix) 2.7.0`
 - channels(leiserfg): `"home-manager, nixpkgs"`
 - nixpkgs: `/home/leiserfg/.nix-defexpr/channels/nixpkgs`
@leiserfg
Contributor Author

When running the command above, I see:

....
building '/nix/store/1m1mpsxqq3ld8hwa79nbp0178whzv1vd-igc-cclang-prebuilds.drv'...
building '/nix/store/bnyf5mklkz2gx7dhlhfb15rfpsk272nw-intel-graphics-compiler-1.0.8744.drv'...
.....
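
The "building ..." lines above mean Nix is compiling locally instead of substituting from the cache. A minimal sketch for spotting this before a long build starts, assuming the dry-run summary contains the phrase "will be built" when something is missing from the cache (the exact wording varies between Nix versions); `classify_dry_run` is a hypothetical helper name, not part of Nix:

```shell
# Hypothetical helper: reads dry-run output on stdin and reports whether
# anything would be built locally. Assumes the phrase "will be built"
# appears whenever a derivation cannot be substituted from the cache.
classify_dry_run() {
  if grep -q 'will be built'; then
    echo "local-build"
  else
    echo "cache-only"
  fi
}

# Example with a captured sample instead of a live Nix call.
sample='these derivations will be built:
  /nix/store/bnyf5mklkz2gx7dhlhfb15rfpsk272nw-intel-graphics-compiler-1.0.8744.drv'
printf '%s\n' "$sample" | classify_dry_run
```

With a real installation, one could pipe `nix-env -i intel-compute-runtime --dry-run 2>&1` into the helper; this assumes the summary is written to stderr, which may not hold for every Nix version.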

@wentasah
Contributor

The problem is in this:

/build/source/IGC/Compiler/Optimizer/OpenCLPasses/JointMatrixFuncsResolutionPass.cpp: In function 'llvm::Type* getIntegerEquivalent(llvm::Type*)':
/build/source/IGC/Compiler/Optimizer/OpenCLPasses/JointMatrixFuncsResolutionPass.cpp:213:51: error: 'this' pointer is null [-Werror=nonnull]
  213 |         unsigned size = matTy->getScalarSizeInBits();
      |                         ~~~~~~~~~~~~~~~~~~~~~~~~~~^~
In file included from /nix/store/ida6b9zgxsw5xa7z8s1y0n42z60lzw2q-llvm-11.1.0-dev/include/llvm/IR/DerivedTypes.h:23,
                 from /nix/store/ida6b9zgxsw5xa7z8s1y0n42z60lzw2q-llvm-11.1.0-dev/include/llvm/IR/DataLayout.h:26,
                 from /nix/store/ida6b9zgxsw5xa7z8s1y0n42z60lzw2q-llvm-11.1.0-dev/include/llvm/IR/Module.h:25,
                 from /build/source/IGC/common/igc_resourceDimTypes.h:12,
                 from /build/source/IGC/Compiler/CodeGenPublic.h:18,
                 from /build/source/IGC/Compiler/Optimizer/OpenCLPasses/JointMatrixFuncsResolutionPass.h:11,
                 from /build/source/IGC/Compiler/Optimizer/OpenCLPasses/JointMatrixFuncsResolutionPass.cpp:9:

It seems that upstream fixed it long ago: intel/intel-graphics-compiler#209

Upgrading at least to 1.0.9289 could fix the problem.

@joshrule

I get the same error:

$ nix-info -m
- system: `"x86_64-linux"`
- host os: `Linux 5.16.12, NixOS, 22.05 (Quokka), 22.05pre358986.062a0c5437b`
- multi-user?: `yes`
- sandbox: `yes`
- version: `nix-env (Nix) 2.6.1`
- channels(rule): `"home-manager, nixos"`
- channels(root): `"nixos, nixos-hardware"`
- nixpkgs: `/nix/var/nix/profiles/per-user/root/channels/nixos`

Any workarounds?

@leiserfg
Contributor Author

Seems like @gloaming is inactive.

@calbrecht
Member

Upgrading intel-graphics-compiler and therefore intel-compute-runtime, spirv-llvm-translator, spirv-tools, spirv-headers, opencl-clang and glslang seems to do the trick for me: https://github.com/calbrecht/f4s-fixups/blob/main/flake.nix#L22-L179

BUT, i don't really have a clue what i'm doing...

@leiserfg
Contributor Author

leiserfg commented May 1, 2022

@calbrecht so probably we just need someone to update the version here.

@calbrecht
Member

@leiserfg that is indeed inevitable, i guess. I had to do some trickery regarding the compilation of intel-graphics-compiler and opencl-clang, else it would segfault when compiling intel-compute-runtime with something like

CommandLine Error: Option 'spirv-text' registered more than once!
LLVM ERROR: inconsistency in registered CommandLine options

which seems to happen when two different clang compilations are present in the build environment. (guess this is the case when building opencl-clang with buildWithPatches)

intel/compute-runtime#519, llvm ldd related on stackoverflow, llvm clang related on stackoverflow, and similar in #97401

Now that i have the new versions running, i tried to somehow confirm a working solution and ran clinfo like in nix run nixpkgs#clinfo and it coredumped with that same message as well, so i'm uncertain if this is caused by my changes or if it failed before that anyway.

Is clinfo coredumping for you as well right now, even without the updated versions?

@calbrecht
Member

Hmm, just tried to run clinfo in my previous NixOS generation. It gave some output and did not segfault.
So my current approach is compiling but seems to not work correctly.

@calbrecht
Member

calbrecht commented May 2, 2022

Woah, intel-graphics-compiler had

  includedir=${prefix}/@CMAKE_INSTALL_INCLUDEDIR@/igc
  libdir=${exec_prefix}/@CMAKE_INSTALL_LIBDIR@

in their IGC/AdaptorOCL/igc-opencl.pc.in.

This produces a pkgconfig of e.g.

  includedir=${prefix}//nix/store/xvanybv9lfxc24hmqgbp43g52j5j6q68-intel-graphics-compiler-1.0.11061/include/igc
  libdir=${exec_prefix}//nix/store/xvanybv9lfxc24hmqgbp43g52j5j6q68-intel-graphics-compiler-1.0.11061/lib

So we need to patch with something like

substituteInPlace ./IGC/AdaptorOCL/igc-opencl.pc.in \
  --replace '/@CMAKE_INSTALL_INCLUDEDIR@' "/include" \
  --replace '/@CMAKE_INSTALL_LIBDIR@' "/lib"

Another case for #144170
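
To make the doubled path easier to see, here is a small sketch that mimics the template expansion with `sed`. It assumes, as the output above suggests, that the install dirs are absolute paths at configure time. The store path and the helper names `expand` and `patch_template` are made up for illustration; `sed` merely stands in for CMake's `@VAR@` substitution and for `substituteInPlace`.

```shell
# Hypothetical store path, used only for illustration.
store=/nix/store/example-intel-graphics-compiler

# Template lines as quoted from igc-opencl.pc.in.
template='includedir=${prefix}/@CMAKE_INSTALL_INCLUDEDIR@/igc
libdir=${exec_prefix}/@CMAKE_INSTALL_LIBDIR@'

# Stand-in for CMake's @VAR@ expansion, assuming absolute install dirs.
expand() {
  sed -e "s|@CMAKE_INSTALL_INCLUDEDIR@|$store/include|" \
      -e "s|@CMAKE_INSTALL_LIBDIR@|$store/lib|"
}

# Stand-in for the substituteInPlace patch: hard-code relative dirs.
patch_template() {
  sed -e 's|/@CMAKE_INSTALL_INCLUDEDIR@|/include|' \
      -e 's|/@CMAKE_INSTALL_LIBDIR@|/lib|'
}

echo "unpatched:"
printf '%s\n' "$template" | expand
echo "patched:"
printf '%s\n' "$template" | patch_template | expand
```

The unpatched variant yields `${prefix}//nix/store/...`, while the patched one yields `${prefix}/include/igc` and `${exec_prefix}/lib`, which pkg-config can resolve relative to the prefix.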

@calbrecht
Member

So far intel-graphics-compiler is building, but its dependent intel-compute-runtime fails to build its test suite, probably because spirv-tools is linked statically into spirv-llvm-translator, a dependency of intel-graphics-compiler.

Which probably leads to libLLVM being statically present and hence causing this segfault?

See also
KhronosGroup/SPIRV-Tools#3626
KhronosGroup/SPIRV-Tools#3909

@calbrecht
Member

Now this https://github.com/calbrecht/f4s-fixups/blob/main/flake.nix#L22-L225 builds and clinfo does not segfault any longer.

I guess the main trick was to add -DBUILD_SHARED_LIBS=YES to spirv-llvm-translator's cmakeFlags, so that it builds a shared libLLVMSPIRVLib.so instead of the static libLLVMSPIRVLib.a.

It seems like the static libLLVMSPIRVLib.a led to the segfault by including libLLVM statically into the build and when calling functions of libLLVM it got confused about its split personality? i don't know.
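
A quick way to confirm which variant a build actually produced is to look into its lib directory. The following is only a sketch: `spirv_lib_kind` is a hypothetical helper, and it assumes the library files are named libLLVMSPIRVLib.so (possibly with a version suffix) or libLLVMSPIRVLib.a, as mentioned above.

```shell
# Hypothetical helper: reports which variant of libLLVMSPIRVLib a lib
# directory contains. Assumes the file names used in this thread.
spirv_lib_kind() {
  if ls "$1"/libLLVMSPIRVLib.so* >/dev/null 2>&1; then
    echo shared
  elif [ -e "$1/libLLVMSPIRVLib.a" ]; then
    echo static
  else
    echo missing
  fi
}

# Demonstration on a throwaway directory rather than a real build output.
demo=$(mktemp -d)
touch "$demo/libLLVMSPIRVLib.a"
spirv_lib_kind "$demo"   # only the static archive exists at this point
touch "$demo/libLLVMSPIRVLib.so"
spirv_lib_kind "$demo"   # a shared object is now present as well
rm -rf "$demo"
```

For a real check, the argument would be the lib directory of the spirv-llvm-translator build output instead of the temporary directory.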

@rowanG077
Member

So I was already working on a PR for this and only just now found this thread. The approach I took is similar to @calbrecht's, but I didn't update spirv-tools, spirv-headers and glslang, since the versions in nixpkgs are already newer. Once I have verified that it works for me, I can send in a PR. I was also the one who did the last update.

@calbrecht
Member

@rowanG077, beautiful :)

Did you also find the switch buildWithPatches in opencl-clang and spirv-llvm-translator being unnecessary?

Also i noticed some of the existing cmakeFlags do not apply any longer and need to be cleaned up.

The lit llvm testing tool needs some more work though, got it to work for the static libLLVMSPIRVLib.a version of spirv-llvm-translator but not for the dynamic one.

@calbrecht
Member

I'm curious whether this is going to work correctly with newer versions of spirv-tools and spirv-headers; I found those being restricted by the release description of intel-graphics-compiler at https://github.com/intel/intel-graphics-compiler/releases/tag/igc-1.0.11061

@rowanG077
Member

rowanG077 commented May 4, 2022

I didn't even touch lit. Intel tools always pin a very specific version. Honestly it's quite bad practice from Intel but unless it's really necessary I don't want to downgrade them. They were updated by someone for a reason. If they won't work we would need to duplicate them which is also not so nice. For this same reason I don't want to touch the patches. It might be that IGC doesn't need them. But they were included in the project for a reason.

In fact if you look at the current master branch there are no patches anymore. It's always a balance since you never know what will depend on these packages in the future. I'd rather keep the changes to a minimum.

@calbrecht
Member

Ahh, thanks for the clarification, now i see why i couldn't make sense of the patching mechanism. :)

@calbrecht
Member

@rowanG077 it's a bit offtopic, but i'm curious to know.
Do you understand and care to explain to me some details of why this segfault happened when the libLLVMSPIRVLib.a was present and linked/loaded into the build of intel-compute-runtime within the test run or at runtime through clinfo?

@rowanG077
Member

@calbrecht Honestly, I have no idea. I know there were some idiosyncrasies with spirv-tools and linking; see KhronosGroup/SPIRV-Tools#3214. But since I didn't run into this issue myself, I don't really know why it happened. I would have to investigate.

@rowanG077
Member

PR is out, @calbrecht: #171656. My system runs fine with the additions, and clinfo works.
