opencv: misc CUDA-related updates and fixes; add enableLto #218044
Closed. ConnorBaker wants to merge 185 commits into NixOS:master from ConnorBaker:feat/opencv-use-cudaPackages
This commit updates the `buildFlags`, which is a single string with one of four possibilities:
- ""
- "profiled"
- "bootstrap"
- "profiledbootstrap"

Previously only the last two were possible. Since 2ea3482 all four are possible.
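The four values listed above can be read as the concatenation of two optional parts. A minimal sketch of that reading follows; the helper `gccBuildTarget` and its `profiled`/`bootstrap` parameters are hypothetical names chosen for illustration, not the names used in the gcc expression.

```nix
let
  # Hypothetical helper: concatenates the optional "profiled" prefix
  # with the optional "bootstrap" suffix to form the build target string.
  gccBuildTarget = { profiled ? false, bootstrap ? false }:
    (if profiled then "profiled" else "")
    + (if bootstrap then "bootstrap" else "");
in
{
  plain             = gccBuildTarget { };
  profiled          = gccBuildTarget { profiled = true; };
  bootstrap         = gccBuildTarget { bootstrap = true; };
  profiledbootstrap = gccBuildTarget { profiled = true; bootstrap = true; };
}
```

Evaluating this attribute set yields exactly the four strings from the commit message, one per combination of the two switches.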
ConnorBaker force-pushed the feat/opencv-use-cudaPackages branch 2 times, most recently from e3b62ad to c70cc93 (February 25, 2023 04:10)
The primary motivating example is openssl:

Before the change, a full package build took 1m54s.
After the change, a full package build takes 59s.

That is about a 2x speedup. The difference is visible because openssl builds hundreds of manpages, spawning a perl process per manual in the `install` phase. Such a workload is very easy to parallelize.

Another example is an `autotools`+`libtool` based build system where the install step requires relinking. The more binaries there are to relink, the greater the gain from doing it in parallel.

The change enables parallel installs by default only for builds that already have parallel builds enabled. There is a high chance those build systems already handle parallelism well, but some packages will fail.

Consistently propagated the enableParallelBuilding to:
- cmake (enabled by default, similar to builds)
- ninja (set parallelism explicitly, don't rely on default)
- bmake (enable when requested)
- scons (enable when requested)
- meson (set parallelism explicitly, don't rely on default)
- waf (set parallelism explicitly, don't rely on default)
- qmake-4/5/6 (enable by default, similar to builds)
- xorg (always enable, similar to builds)
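The rule described above (parallel installs only where parallel builds are already enabled) can be modelled as a small pure function. This is a sketch under stated assumptions, not the stdenv implementation: `installFlagsFor` and `cores` are hypothetical, and `enableParallelInstalling` is assumed to be the name of the per-package opt-out attribute.

```nix
let
  # Hypothetical model of the default: pass -jN to the install phase
  # only when parallel building is on and installing was not opted out.
  installFlagsFor =
    { enableParallelBuilding ? false
    , enableParallelInstalling ? true   # assumed attribute name
    , cores ? 16                        # stands in for the build core count
    }:
    if enableParallelBuilding && enableParallelInstalling
    then [ "-j${toString cores}" ]
    else [ ];
in
{
  # No parallel build requested, so the install stays serial.
  serial = installFlagsFor { };
  # Parallel build enabled, so the install inherits parallelism.
  parallel = installFlagsFor { enableParallelBuilding = true; };
  # Parallel build enabled, but the package opted out of parallel installs.
  optedOut = installFlagsFor {
    enableParallelBuilding = true;
    enableParallelInstalling = false;
  };
}
```

Only the `parallel` case produces a `-j` flag; the other two evaluate to an empty list.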
Without the change install phase fails as:

    installing
    install flags: -j16 ...
    ...
    ./.libs/libnetsnmpagent.so: file not recognized: file format not recognized
    collect2: error: ld returned 1 exit status
    make[1]: *** [Makefile:1012: libnetsnmpmibs.la] Error 1
    make[1]: *** Waiting for unfinished jobs....

Without the change install phase fails as:

    Installing libxfs-install
    ../../install-sh -o nixbld -g nixbld -m 644 ioctl_xfs_ag_geometry.2 /nix/store/chymzkiiv6c2rgl2gqrn4bqv5azhx9vf-xfsprogs-6.1.1-bin/share/man/man2/ioctl_xfs_ag_geometry.2
    make[1]: *** No rule to make target '\', needed by 'kmem.lo'. Stop.
    make[1]: *** Waiting for unfinished jobs....
    make: *** [Makefile:148: libxfs-install] Error 2
    make: *** Waiting for unfinished jobs....

Without the change parallel install fails as:

    install flags: -j16 ...
    libtool: error: error: relink '_py3sss.la' with the above command before installing it
    libtool: warning: '/build/source/libsss_cert.la' has not been installed in '/nix/store/apyk9a6q7bc7d1fnn81vqrwil4waw9cd-sssd-2.8.2/lib/sssd'
    make[3]: *** [Makefile:13362: install-py3execLTLIBRARIES] Error 1
now gcc isn't built
Without the change parallel install fails as:

    $
    install flags: -j16 ...
    ...
    collect2: error: ld returned 1 exit status
    libtool: error: error: relink 'libsvn_ra_serf-1.la' with the above command before installing it
    make: *** [build-outputs.mk:1316: install-serf-lib] Error 1
    make: *** Waiting for unfinished jobs....
    /nix/store/1qasgqvab0xh2jcy00x9b1zh39dw7m8f-bin

Without the change parallel install fails as:

    $
    install flags: -j16 ...
    ...
    install: target '...-ocaml-4.14.0/lib/ocaml/threads': No such file or directory
    make[1]: *** [Makefile:140: installopt] Error 1

Without the change parallel installs fail as:

    install flags: -j2 ...
    ln: failed to create symbolic link '...-eresi-0.83-a3-phoenix//bin/elfsh': No such file or directory
    make: *** [Makefile:108: install64] Error 1
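The failures above all come from install targets that race under `-jN`. A minimal sketch of how a package might keep parallel builds while installing serially is shown below; the package itself is hypothetical, and `enableParallelInstalling` is assumed to be the opt-out attribute that the individual fixes set.

```nix
{ stdenv }:

stdenv.mkDerivation {
  # Hypothetical package used only for illustration.
  pname = "example";
  version = "0.1";
  src = ./.;

  # Compile in parallel as before.
  enableParallelBuilding = true;

  # The install targets of this (hypothetical) build system are not
  # safe to run concurrently, so install serially.
  enableParallelInstalling = false;
}
```

Opting out per package keeps the faster default for well-behaved build systems while avoiding relink and missing-directory races like the ones in the logs.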
w3m: 0.5.3+git20220429 -> 0.5.3+git20230121
libpcap: 1.10.1 -> 1.10.3
- use cudaPackages instead of cudatoolkit (reduces download/closure size)
- set C/C++ compiler when building with CUDA to ensure NVCC has an appropriate backing compiler
- add flag to build with CUDNN (disabled by default due to increase in closure size)
- add flag to build with LTO (enabled by default)
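As a rough illustration of how the flags in this list might be used, the sketch below overrides the package from a plain expression. Only `enableLto` is named in the PR title; `enableCuda` and `enableCudnn` are assumed argument names, and `pkgs.opencv` is assumed to be the attribute this PR touches.

```nix
let
  # CUDA packages are unfree, so they have to be allowed explicitly.
  pkgs = import <nixpkgs> { config.allowUnfree = true; };
in
pkgs.opencv.override {
  enableCuda = true;    # assumed name of the CUDA switch
  enableCudnn = true;   # assumed name; described as off by default
  enableLto = false;    # named in the PR title; described as on by default
}
```

Leaving `enableCudnn` at its default avoids the larger closure that the description mentions.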
ConnorBaker force-pushed the feat/opencv-use-cudaPackages branch from 1bd932a to 13d80db (March 15, 2023 19:40)
ConnorBaker requested review from roberth, ttuegel, matthewbauer, Mic92, zowoq, winterqt, figsoda, marsam, a team and Ericson2314 as code owners (March 15, 2023 19:40)
github-actions bot added the 6.topic: nixos, 6.topic: ocaml, 6.topic: python, 6.topic: qt/kde, 6.topic: ruby, 6.topic: rust, 6.topic: stdenv (Standard environment), 6.topic: systemd, 8.has: changelog, 8.has: documentation, and 8.has: module (update) labels (Mar 15, 2023)
Please reopen, we can't unping people.
Labels
6.topic: closure-size
6.topic: cuda
6.topic: nixos
6.topic: ocaml
6.topic: python
6.topic: qt/kde
6.topic: ruby
6.topic: rust
6.topic: stdenv
Standard environment
6.topic: systemd
8.has: changelog
8.has: documentation
8.has: module (update)
10.rebuild-darwin: 101-500
10.rebuild-linux: 501-1000
10.rebuild-linux: 501+
Description of changes
Info on closure sizes: these are the result of before (compiling from master) and after (my PR, with and without CUDNN support).
Before:
Full closure:
After (without CUDNN, which is the default):
Full closure:
After (with CUDNN):
Full closure:
Things done
- Is `sandbox = true` set in `nix.conf`? (See Nix manual)
- `nix-shell -p nixpkgs-review --run "nixpkgs-review rev HEAD"`. Note: all changes have to be committed, also see nixpkgs-review usage
- `./result/bin/`