{lib}[foss/2021b] TensorFlow v2.8.4 w/ CUDA-11.4.1 and fix patches + extensions in easyconfig for TensorFlow 2.8.4 w/ foss/2021b #17058
Test report by @Flamefire
Test report by @jfgrimm
@jfgrimm The CUDA build requires the new easyblock from easybuilders/easybuild-easyblocks#2854. I made that clearer in the description, but your test confirms that indeed (only) the CUDA build requires the update (and that the update doesn't hurt the CPU variant).
yeah I read that, then forgot to add the |
Test report by @jfgrimm
Test report by @akesandgren
Test report by @branfosj
After #17892 I'm not sure that replacing the fix-cuda-build patch with the upstream patch is a good idea, as it can cause regressions when using rpath wrappers or ccache; see #17892 (comment) for how the rpath issue can be fixed from easybuild-framework (though not the ccache use case). Given the good state of this PR (the only failing report is due to lack of space), I'd suggest merging this anyway and then doing a bulk update of the TensorFlow ECs, (re-)adding the fix-cuda-build patch (potentially in addition to the upstream patch) and doing a single test report with
Going in, thanks @Flamefire!
Test report by @branfosj
This adds the new TF 2.8.4 EC with CUDA support and some changes for the non-CUDA version:
- Add missing astor package (SYSTEM_LIB, i.e. it will be downloaded when missing, which we want to avoid)
- Add the profile plugin for tensorboard as we did in previous ECs
- Reorder sources, patches, checksums so they are next to each other for easier manual verification
- Replace TensorFlow-2.1.0_fix-cuda-build.patch by the upstream patch from "resolve gcc_host_compiler_path in a symlink directory" (tensorflow/tensorflow#56360) so we can drop it in the future
- Fix build on PPC (that was some major work)

The CUDA build requires the updated easyblock from "fix TensorFlow easyblock for new versions of Bazel & TensorFlow" (easybuild-easyblocks#2854) to work, due to changes in TensorFlow and Bazel.
Once this is merged we can finish TF 2.9 in #16620 and #16008 as they suffer from the same issues as the ones fixed here.
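
For illustration only, here is a rough sketch of what keeping sources, patches and checksums next to each other can look like for a single extension entry. The extension name, file names and checksum strings are hypothetical placeholders, not values from this easyconfig, and the small helper is not part of EasyBuild; it only shows why the adjacent layout makes manual verification easier.

```python
# Hypothetical extension entry in easyconfig (Python) syntax.
# All names and checksum values below are placeholders, not taken from the real EC.
exts_list = [
    ('example-ext', '1.0.0', {
        'sources': ['example-ext-1.0.0.tar.gz'],
        'patches': ['example-ext-1.0.0_fix-build.patch'],
        # Checksums keyed by file name, listed right after the files they cover
        'checksums': [
            {'example-ext-1.0.0.tar.gz': '<sha256 of the source tarball>'},
            {'example-ext-1.0.0_fix-build.patch': '<sha256 of the patch>'},
        ],
    }),
]


def files_without_checksum(exts):
    """Return (extension, filename) pairs that have no checksum entry.

    Illustrative helper only; assumes every checksum is a one-item dict
    mapping a file name to its checksum, as in the entry above.
    """
    missing = []
    for name, _version, opts in exts:
        listed = set()
        for entry in opts.get('checksums', []):
            listed.update(entry)  # adds the dict keys, i.e. the file names
        for filename in opts.get('sources', []) + opts.get('patches', []):
            if filename not in listed:
                missing.append((name, filename))
    return missing
```

With the files and their checksums grouped per extension like this, a reviewer (or a check such as `files_without_checksum(exts_list)`) can spot a source or patch that lacks a checksum without scrolling between separate lists.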