Build RL8+OFED image in CI #427
Might want to switch to forcing a reboot. /var/run/reboot-required isn't populated, despite a slightly newer kernel getting installed:
If I understood your point correctly, you were suggesting that we might want to reboot prior to building OFED? I don't think this is necessary, as we can compile the kernel modules for a kernel we are not running (as long as we have the kernel headers). The new kernel and modules will be loaded when we boot the new image. Or is this during the Ansible run of site.yml afterwards?
Hmm, possibly, but to me that seems too risky/confusing. Especially given we may be e.g. installing CUDA afterwards which has kernel-* and ofed dependencies. We have to reboot on selinux changes anyway, so we are expecting a reboot somewhere during the build process. We just don't want to do it twice, if possible. I think doing this is appropriate:
Actually there's another reason: bootstrap.yml runs both from site.yml and from fatimage.yml, which is good (as it reduces code duplication). So really, if we need a reboot we should do it, rather than having to determine that we're running in a build and can therefore ignore it because we will get a "reboot" when creating an instance with the image ... Make sense?
Yep, I missed that you run this against live nodes too. That makes more sense. Isn't /var/run/reboot-required a Debian/Ubuntuism? Red Hat seems to suggest using: https://access.redhat.com/solutions/27943
Looks like you're right, explains why it wasn't working then :P
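For reference, a minimal sketch of a reboot check that does not rely on /var/run/reboot-required. It assumes the check in question is `needs-restarting -r` from yum-utils/dnf-utils (the "reboot hint" mentioned later in this PR), which exits 0 when no reboot is needed and 1 when one is. The function name and the injectable `run` parameter are hypothetical and only there to make the sketch testable; this is not the code in the PR.

```python
import subprocess


def reboot_required(run=subprocess.run):
    """Return True if `needs-restarting -r` reports that a reboot is needed.

    Assumption: exit code 0 means no reboot needed, 1 means reboot needed.
    `run` is injectable (hypothetical design choice) so the logic can be
    exercised without the real command being installed.
    """
    result = run(["needs-restarting", "-r"], capture_output=True, text=True)
    if result.returncode == 0:
        return False
    if result.returncode == 1:
        return True
    # Any other exit code is treated as a failure of the check itself.
    raise RuntimeError(
        "needs-restarting exited with unexpected code %d" % result.returncode
    )
```

In a playbook the equivalent would be to register the command result and make the reboot conditional on its return code, marking the check itself as not a change.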
4.18.0-553.16.1.el8_9.x86_64
4.18.0-553.el8_9.x86_64
These would fail with the error: '<' not supported between instances of 'str' and 'int', as community.general.version_sort was trying to compare the `el8_9` of the latter with the `16` of the former. Strip the last two chunks so we just compare numbers.
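The idea in this commit message can be sketched in plain Python. This is an illustration only, not the Ansible filter expression used in the PR; `kernel_key` and `newest_kernel` are hypothetical names. It assumes every release string ends in two non-numeric chunks (such as `el8_9` and `x86_64`) and that everything before them is numeric.

```python
import re


def kernel_key(release):
    """Turn a kernel release string into a list of ints for comparison."""
    # Drop the last two dot-separated chunks, e.g. "el8_9" and "x86_64".
    numeric_part = release.rsplit(".", 2)[0]
    # Split the rest on "." and "-" so only integers remain.
    return [int(part) for part in re.split(r"[.-]", numeric_part)]


def newest_kernel(releases):
    """Return the release string with the highest numeric version."""
    return max(releases, key=kernel_key)


kernels = ["4.18.0-553.16.1.el8_9.x86_64", "4.18.0-553.el8_9.x86_64"]
print(newest_kernel(kernels))
```

Because the keys contain only integers, a shorter key that is a prefix of a longer one (here `553` versus `553.16.1`) sorts as older, and the str-versus-int comparison never arises.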
Seems to match what steve specified to me 👍
Checked at 53baab2 that a reboot didn't happen after the CI reimage, i.e. the conditional on the reboot appears to be working OK again.
Checked that a reboot did happen during the image build above.
Logic looks good to me
* Check major version for RL8 package installs
* Gather facts on ofed role
* Support kernel checks with mismatching version length

  4.18.0-553.16.1.el8_9.x86_64
  4.18.0-553.el8_9.x86_64

  These would fail with the error: '<' not supported between instances of 'str' and 'int', as community.general.version_sort was trying to compare the `el8_9` of the latter with the `16` of the former. Strip the last two chunks so we just compare numbers.
* Move to LTS version now RL9.4 is supported
* Fail when any inventory source cannot be parsed
* Always reboot after selinux and package updates
* Clear facts before OFED so install will match newest kernel
* Clear facts after reboot so OFED install will match newest kernel
* fail caas and stackhpc if any inventory can't be read
* make reboot conditional on package or SELinux changes again
* include OFED in both RL8 and RL9 builds
* always run CI tests on RL8 and RL9
* allow concurrent RL8/RL9 CI tests
* mark pending reboot check as not a change
* fix workflow matrix definitions
* bump CI images - now both OFED
* use reboot hint for checking reboot required

---------

Co-authored-by: Steve Brasier <steveb@stackhpc.com>
Uses OFED for both RockyLinux 8 and 9 image builds.
`ofed_distro_version` was e.g. 8.9 due to the base image despite the dnf update step having upgraded the distro to something later.