
Rocky 9.4 Base OS confluent/Slurm Edition for Linux #2002

Merged
1 commit merged into openhpc:3.x on Oct 17, 2024

Conversation

@tkucherera-lenovo
Contributor

This is a recipe that uses confluent for cluster provisioning.

Assumptions

  1. DNS is set up
  2. There is at least one SSH key on the SMS (the key is used for passwordless login on the nodes)

Note

  1. The makerepo.sh script did not check for Rocky Linux, so I modified it to check whether the OS is Rocky (a sketch of such a check is shown below).
  2. The OpenHPC and EPEL repository directories live in /var/lib/confluent/public on the SMS; on the compute nodes they are reachable via the web root confluent-public.
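
For illustration, the kind of OS check described in note 1 might look something like this (a hedged sketch only; the actual makerepo.sh logic may differ):

# hedged sketch of a Rocky Linux check for makerepo.sh
. /etc/os-release
if [ "${ID}" != "rocky" ]; then
    echo "This script currently expects Rocky Linux, found: ${ID}" >&2
    exit 1
fi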

@adrianreber
Member

Thanks, this is great. I will try it out on our CI systems.

@adrianreber
Member

1. DNS is set up

We have /etc/hosts. I hope that is enough.

2. There is at least one SSH key on the SMS (the key is used for passwordless login on the nodes)

That is also needed for all other recipes. So, no problem.

@adrianreber
Member

The resulting RPMs can be found in the GitHub Actions for the next 24 hours.

github-actions bot commented Aug 7, 2024

Test Results

18 files (-6)    18 suites (-6)    27s ⏱️ (-1s)
53 tests (-10)   49 ✅ (-3)   4 💤 (-7)    0 ❌ (±0)
66 runs (-20)    62 ✅ (-6)   4 💤 (-14)   0 ❌ (±0)

Results for commit 3abbea1. ± Comparison against base commit 611b01f.

This pull request removes 10 tests.
conman ‑ [ConMan] Verify conman binary available
conman ‑ [ConMan] Verify man page availability
conman ‑ [ConMan] Verify rpm version matches binary
ipmitool ‑ [OOB] ipmitool exists
ipmitool ‑ [OOB] ipmitool local bmc ping
ipmitool ‑ [OOB] ipmitool power status
ipmitool ‑ [OOB] ipmitool read CPU1 sensor data
ipmitool ‑ [OOB] ipmitool read sel log
ipmitool ‑ [OOB] istat exists
warewulf-ipmi ‑ [warewulf-ipmi] ipmitool lanplus protocol

♻️ This comment has been updated with latest results.

@tkucherera-lenovo
Contributor Author

Yes, /etc/hosts should be enough.

\input{common/install_ohpc_components_intro}

\subsection{Enable \OHPC{} repository for local use} \label{sec:enable_repo}
\input{common/enable_local_ohpc_repo_confluent}
Member

I am not aware of the history behind this line from the xCAT recipe. In all other recipes we enable the OpenHPC repository by installing the OpenHPC release RPM, which enables a dnf repository pointing at the OpenHPC repository server. Hardcoding the download of the repository tar files feels unnecessary, especially as we do not do it at all in any of our current testing. Please try to work with the online repository if that works for you.

If you need it for your testing we should put it behind some variable, so that it can be disabled.

Is this strictly necessary for you or can you work with the online repository?

Contributor Author

Sure, noted. I will look into working with the online repo.

\subsubsection{Build initial BOS image} \label{sec:assemble_bos}
The following steps illustrate the process to build a minimal, default image for use with \Confluent{}. To begin, you will
first need to have a local copy of the ISO image available for the underlying OS. In this recipe, the relevant ISO image
is \texttt{Rocky-9.4-x86\_64-dvd1.iso} (available from the Rocky
Member

The image I downloaded does not have a "1" in the file name. The filename should be a variable so that it can be easily updated.

@adrianreber
Member

The main point that is currently not clear to me is whether Confluent comes with a DHCP server. I ran the script a couple of times and the two compute nodes were always waiting for DHCP answers in the PXE boot step of the firmware.

@adrianreber
Member

@tkucherera-lenovo Should I try again? Is there now a DHCP server configured, somehow?

For the final merge you can squash the commits. For the main repository it makes no sense to keep your development history with fixups. If you want, you can do separate commits for the docs/ part and the components/ part; that would make sense to me.

Please also add a Signed-off-by to your commit messages as described in https://github.com/openhpc/ohpc/blob/3.x/CONTRIBUTING.md. git commit -s usually does that automatically.
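
(For example, as standard git usage, the sign-off can also be added to an existing commit; a sketch, not specific to this repository:)

# add a Signed-off-by trailer to the most recent commit without changing its message
git commit --amend -s --no-edit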

@tkucherera-lenovo
Contributor Author

Yes, you can try again. Confluent does have its own DHCP server, and by default it will respond to DHCP requests. If an environment has its own DHCP server, it is possible to configure Confluent not to respond to DHCP requests. In this case, though, I believe there was a bug where the setting for allowing deployment via PXE was not being applied because the required variable was missing from the input.local file; I have added a fix for that now.

Going forward I will squash all commits and also add the Signed-off-by to commits.

@adrianreber
Member

@tkucherera-lenovo Is there an easy way to reset the host machine without reinstalling? Where does confluent store its state? Is there a directory I can delete to start from scratch?

@tkucherera-lenovo
Contributor Author

The state is stored in /etc/confluent/*, so stop confluent and run rm -rf /etc/confluent/*. I would also recommend removing the OS profile under the /var/lib/confluent/public/os directory.
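
For reference, a minimal sketch of that reset sequence (the confluent service name and the profile directory below are assumptions; adjust to your setup):

systemctl stop confluent
rm -rf /etc/confluent/*
rm -rf /var/lib/confluent/public/os/rocky-9.4-x86_64-default   # assumed profile name
systemctl start confluent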

@adrianreber
Member

Now I see that the compute nodes are trying to boot:

==> audit <==
Aug 10 10:43:07 {"operation": "update", "target": "/noderange/compute/boot/nextdevice", "allowed": true}
Aug 10 10:43:12 {"operation": "update", "target": "/noderange/compute/power/state", "allowed": true}

==> events <==
Aug 10 10:46:16 {"info": "Offering PXE boot with static address 10.241.58.133 to c2"}
Aug 10 10:46:18 {"info": "Offering PXE boot with static address 10.241.58.132 to c1"}
Aug 10 10:46:25 {"info": "Offering PXE boot with static address 10.241.58.133 to c2"}
Aug 10 10:46:28 {"info": "Offering PXE boot with static address 10.241.58.132 to c1"}

==> /var/log/httpd/access_log <==
10.241.58.133 - - [10/Aug/2024:10:46:32 +0000] "GET /confluent-public/os/rocky-9.4-x86_64-default/boot.ipxe HTTP/1.1" 200 227 "-" "iPXE/1.21.1 (g988d2)"
10.241.58.133 - - [10/Aug/2024:10:46:32 +0000] "GET /confluent-public/os/rocky-9.4-x86_64-default/boot/kernel HTTP/1.1" 200 13605704 "-" "iPXE/1.21.1 (g988d2)"
10.241.58.133 - - [10/Aug/2024:10:46:32 +0000] "GET /confluent-public/os/rocky-9.4-x86_64-default/boot/initramfs/addons.cpio HTTP/1.1" 200 97792 "-" "iPXE/1.21.1 (g988d2)"
10.241.58.133 - - [10/Aug/2024:10:46:32 +0000] "GET /confluent-public/os/rocky-9.4-x86_64-default/boot/initramfs/site.cpio HTTP/1.1" 200 3072 "-" "iPXE/1.21.1 (g988d2)"
10.241.58.133 - - [10/Aug/2024:10:46:32 +0000] "GET /confluent-public/os/rocky-9.4-x86_64-default/boot/initramfs/distribution HTTP/1.1" 200 106800744 "-" "iPXE/1.21.1 (g988d2)"
10.241.58.132 - - [10/Aug/2024:10:46:34 +0000] "GET /confluent-public/os/rocky-9.4-x86_64-default/boot.ipxe HTTP/1.1" 200 227 "-" "iPXE/1.21.1 (g988d2)"
10.241.58.132 - - [10/Aug/2024:10:46:34 +0000] "GET /confluent-public/os/rocky-9.4-x86_64-default/boot/kernel HTTP/1.1" 200 13605704 "-" "iPXE/1.21.1 (g988d2)"
10.241.58.132 - - [10/Aug/2024:10:46:35 +0000] "GET /confluent-public/os/rocky-9.4-x86_64-default/boot/initramfs/addons.cpio HTTP/1.1" 200 97792 "-" "iPXE/1.21.1 (g988d2)"
10.241.58.132 - - [10/Aug/2024:10:46:35 +0000] "GET /confluent-public/os/rocky-9.4-x86_64-default/boot/initramfs/site.cpio HTTP/1.1" 200 3072 "-" "iPXE/1.21.1 (g988d2)"
10.241.58.132 - - [10/Aug/2024:10:46:35 +0000] "GET /confluent-public/os/rocky-9.4-x86_64-default/boot/initramfs/distribution HTTP/1.1" 200 106800744 "-" "iPXE/1.21.1 (g988d2)"

But after that nothing seems to happen. On the console I see:

[screenshot of the compute node console]

Any recommendations how to continue?

Also, there seems to be no point during the installation where the script waits for the compute nodes to be ready, so most commands run while the compute nodes are not available. All the customization fails with:

+ nodeshell compute echo '"10.241.58.134:/home' /home nfs nfsvers=3,nodev,nosuid 0 '0"' '>>' /etc/fstab
c1: ssh: connect to host c1 port 22: No route to host
c2: ssh: connect to host c2 port 22: No route to host

@adrianreber
Member

Now the installation is working, but it fails in the post-installation scripts. I see the following error on the server:

Aug 11 09:04:06 Traceback (most recent call last):
  File "/opt/confluent/lib/python/confluent/syncfiles.py", line 193, in sync_list_to_node
    sshutil.prep_ssh_key('/etc/confluent/ssh/automation')
  File "/opt/confluent/lib/python/confluent/sshutil.py", line 139, in prep_ssh_key
    subprocess.check_output(['ssh-add', keyname], stdin=devnull, stderr=devnull)
  File "/usr/lib64/python3.9/subprocess.py", line 424, in check_output
    return run(*popenargs, stdout=PIPE, timeout=timeout, check=True,
  File "/usr/lib64/python3.9/subprocess.py", line 528, in run
    raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['ssh-add', '/etc/confluent/ssh/automation']' returned non-zero exit status 1.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/opt/confluent/lib/python/confluent/httpapi.py", line 612, in resourcehandler
    for rsp in resourcehandler_backend(env, start_response):
  File "/opt/confluent/lib/python/confluent/httpapi.py", line 635, in resourcehandler_backend
    for res in selfservice.handle_request(env, start_response):
  File "/opt/confluent/lib/python/confluent/selfservice.py", line 520, in handle_request
    result = syncfiles.start_syncfiles(
  File "/opt/confluent/lib/python/confluent/syncfiles.py", line 321, in start_syncfiles
    syncrunners[nodename].wait()
  File "/usr/lib/python3.9/site-packages/eventlet/greenthread.py", line 181, in wait
    return self._exit_event.wait()
  File "/usr/lib/python3.9/site-packages/eventlet/event.py", line 132, in wait
    current.throw(*self._exc)
  File "/usr/lib/python3.9/site-packages/eventlet/greenthread.py", line 221, in main
    result = function(*args, **kwargs)
  File "/opt/confluent/lib/python/confluent/syncfiles.py", line 215, in sync_list_to_node
    raise Exception("Syncing failed due to unreadable files: " + ','.join(unreadablefiles))
Exception: Syncing failed due to unreadable files: /tmp/tmpf19vlfb1/etc/shadow
Aug 11 09:04:07 Traceback (most recent call last):
  File "/opt/confluent/lib/python/confluent/syncfiles.py", line 193, in sync_list_to_node
    sshutil.prep_ssh_key('/etc/confluent/ssh/automation')
  File "/opt/confluent/lib/python/confluent/sshutil.py", line 139, in prep_ssh_key
    subprocess.check_output(['ssh-add', keyname], stdin=devnull, stderr=devnull)
  File "/usr/lib64/python3.9/subprocess.py", line 424, in check_output
    return run(*popenargs, stdout=PIPE, timeout=timeout, check=True,
  File "/usr/lib64/python3.9/subprocess.py", line 528, in run
    raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['ssh-add', '/etc/confluent/ssh/automation']' returned non-zero exit status 1.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/opt/confluent/lib/python/confluent/httpapi.py", line 612, in resourcehandler
    for rsp in resourcehandler_backend(env, start_response):
  File "/opt/confluent/lib/python/confluent/httpapi.py", line 635, in resourcehandler_backend
    for res in selfservice.handle_request(env, start_response):
  File "/opt/confluent/lib/python/confluent/selfservice.py", line 520, in handle_request
    result = syncfiles.start_syncfiles(
  File "/opt/confluent/lib/python/confluent/syncfiles.py", line 321, in start_syncfiles
    syncrunners[nodename].wait()
  File "/usr/lib/python3.9/site-packages/eventlet/greenthread.py", line 181, in wait
    return self._exit_event.wait()
  File "/usr/lib/python3.9/site-packages/eventlet/event.py", line 132, in wait
    current.throw(*self._exc)
  File "/usr/lib/python3.9/site-packages/eventlet/greenthread.py", line 221, in main
    result = function(*args, **kwargs)
  File "/opt/confluent/lib/python/confluent/syncfiles.py", line 215, in sync_list_to_node
    raise Exception("Syncing failed due to unreadable files: " + ','.join(unreadablefiles))
Exception: Syncing failed due to unreadable files: /tmp/tmp2qhhryvt/etc/shadow

@tkucherera-lenovo
Contributor Author

Hi Adrian, I don't know what state the management server and cluster are in, but usually the error I am seeing happens when the automation SSH key is missing from the /etc/confluent/ssh directory. This key should have been created during the osdeploy initialize step. The input.local file should have an initialize_options variable with the value usklpta, where the a option creates the key in question.

Additionally, just to help me with debugging, if you run the command:

  1. confluent_selfcheck -n <nodename>

the output is sometimes helpful. Thanks.
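
For context, a hedged sketch of the input.local setting and the initialize step described above (exactly how the recipe invokes osdeploy is an assumption):

# in input.local; the 'a' option is the one that creates /etc/confluent/ssh/automation
initialize_options="${initialize_options:-usklpta}"
# during SMS setup, something along these lines is run:
osdeploy initialize -${initialize_options}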

[sms](*\#*) mkdir -p $epel_repo_dir_confluent
[sms](*\#*) (*\install*) dnf-plugins-core createrepo
# Download required EPEL packages
[sms](*\#*) dnf download --destdir $epel_repo_dir_confluent fping libconfuse libunwind
Member

This seems strange; why don't we just enable EPEL on the compute nodes?

@adrianreber
Member

Hi Adrian, I don't know what state the management server and cluster are in, but usually the error I am seeing happens when the automation SSH key is missing from the /etc/confluent/ssh directory. This key should have been created during the osdeploy initialize step. The input.local file should have an initialize_options variable with the value usklpta, where the a option creates the key in question.

I just copied usklpt without the a. Retrying with the additional a now.

@adrianreber
Member

Now the compute nodes are provisioned, but I cannot log in:

# confluent_selfcheck -n c1
OS Deployment: Initialized
Confluent UUID: Consistent
Web Server: Running
Web Certificate: OK
Checking web download: Failed to download /confluent-public/site/confluent_uuid
Checking web API access: Failed access, if selinux is enabled, `setsebool -P httpd_can_network_connect=1`, otherwise check web proxy configuration
TFTP Status: OK
SSH root user public key: OK
Checking SSH Certificate authority: OK
Checking confluent SSH automation key: OK
Checking for blocked insecure boot: OK
Checking IPv6 enablement: OK
Performing node checks for 'c1'
Checking node attributes in confluent...
Checking network configuration for c1
c1 appears to have network configuration suitable for IPv4 deployment via: ens2f0
No issues detected with attributes of c1
Checking name resolution: OK

With Warewulf 3 provisioning, the SSH keys from /root/.ssh automatically end up on the compute nodes and SSH works. Can Confluent also use one of those existing keys and add it to the compute nodes?

Also, the current recipe does not wait until the compute nodes are provisioned. It immediately continues, and all commands like nodeshell fail because the provisioning is not finished.

@adrianreber
Member

Ah, so the problem is that I have SSH keys in different formats and the last one in the list uses an unsupported algorithm.

In /opt/confluent/lib/python/confluent/sshutil.py all SSH keys are copied to the provisioning image, but instead of overwriting the previous key it would probably make more sense to append all keys.

Following code change seems to work for me:

--- /opt/confluent/lib/python/confluent/sshutil.py	2023-11-15 16:30:46.000000000 +0000
+++ /opt/confluent/lib/python/confluent/sshutil.py.new	2024-08-12 09:10:48.601474767 +0000
@@ -214,10 +214,14 @@
     else:
         suffix = 'rootpubkey'
     for auth in authorized:
-        shutil.copy(
-            auth,
+        local_key = open(auth, 'r')
+        dest = open(
             '/var/lib/confluent/public/site/ssh/{0}.{1}'.format(
-                    myname, suffix))
+                    myname, suffix), 'a')
+        dest.write(local_key.read())
+    if os.path.exists(
+            '/var/lib/confluent/public/site/ssh/{0}.{1}'.format(
+                myname, suffix)):
         os.chmod('/var/lib/confluent/public/site/ssh/{0}.{1}'.format(
                 myname, suffix), 0o644)
         os.chown('/var/lib/confluent/public/site/ssh/{0}.{1}'.format(

Instead of copying all the files and overwriting everything with the last file, this appends all public keys.

@adrianreber
Member

Now SSH works, but provisioning fails again:

Traceback (most recent call last):
  File "/opt/confluent/lib/python/confluent/httpapi.py", line 612, in resourcehandler
    for rsp in resourcehandler_backend(env, start_response):
  File "/opt/confluent/lib/python/confluent/httpapi.py", line 635, in resourcehandler_backend
    for res in selfservice.handle_request(env, start_response):
  File "/opt/confluent/lib/python/confluent/selfservice.py", line 526, in handle_request
    status, output = syncfiles.get_syncresult(nodename)
  File "/opt/confluent/lib/python/confluent/syncfiles.py", line 356, in get_syncresult
    result = syncrunners[nodename].wait()
  File "/usr/lib/python3.9/site-packages/eventlet/greenthread.py", line 181, in wait
    return self._exit_event.wait()
  File "/usr/lib/python3.9/site-packages/eventlet/event.py", line 132, in wait
    current.throw(*self._exc)
  File "/usr/lib/python3.9/site-packages/eventlet/greenthread.py", line 221, in main
    result = function(*args, **kwargs)
  File "/opt/confluent/lib/python/confluent/syncfiles.py", line 215, in sync_list_to_node
    raise Exception("Syncing failed due to unreadable files: " + ','.join(unreadablefiles))
Exception: Syncing failed due to unreadable files: /tmp/tmp9t4o6x20/etc/shadow
Aug 12 10:48:09 Traceback (most recent call last):
  File "/opt/confluent/lib/python/confluent/syncfiles.py", line 197, in sync_list_to_node
    output, stderr = util.run(
  File "/opt/confluent/lib/python/confluent/util.py", line 48, in run
    raise subprocess.CalledProcessError(retcode, process.args, output=stdout, stderr=stderr)
subprocess.CalledProcessError: Command '['rsync', '-rvLD', '/tmp/tmpszubn5dq.synctoc2/', 'root@[10.241.58.133]:/']' returned non-zero exit status 23.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/opt/confluent/lib/python/confluent/httpapi.py", line 612, in resourcehandler
    for rsp in resourcehandler_backend(env, start_response):
  File "/opt/confluent/lib/python/confluent/httpapi.py", line 635, in resourcehandler_backend
    for res in selfservice.handle_request(env, start_response):
  File "/opt/confluent/lib/python/confluent/selfservice.py", line 526, in handle_request
    status, output = syncfiles.get_syncresult(nodename)
  File "/opt/confluent/lib/python/confluent/syncfiles.py", line 356, in get_syncresult
    result = syncrunners[nodename].wait()
  File "/usr/lib/python3.9/site-packages/eventlet/greenthread.py", line 181, in wait
    return self._exit_event.wait()
  File "/usr/lib/python3.9/site-packages/eventlet/event.py", line 132, in wait
    current.throw(*self._exc)
  File "/usr/lib/python3.9/site-packages/eventlet/greenthread.py", line 221, in main
    result = function(*args, **kwargs)
  File "/opt/confluent/lib/python/confluent/syncfiles.py", line 215, in sync_list_to_node
    raise Exception("Syncing failed due to unreadable files: " + ','.join(unreadablefiles))
Exception: Syncing failed due to unreadable files: /tmp/tmpw077yd0x/etc/shadow

It kind of makes sense, because /tmp/tmpw077yd0x/etc/shadow is indeed mode 000, but I am not sure what is going on; running the same rsync command as root works without errors.

Currently I am again stuck in provisioning:

# nodedeploy compute
c1: pending: rocky-9.4-x86_64-default
c2: pending: rocky-9.4-x86_64-default
# confluent_selfcheck -n c1
OS Deployment: Initialized
Confluent UUID: Consistent
Web Server: Running
Web Certificate: OK
Checking web download: Failed to download /confluent-public/site/confluent_uuid
Checking web API access: Failed access, if selinux is enabled, `setsebool -P httpd_can_network_connect=1`, otherwise check web proxy configuration
TFTP Status: OK
SSH root user public key: OK
Checking SSH Certificate authority: OK
Checking confluent SSH automation key: OK
Checking for blocked insecure boot: OK
Checking IPv6 enablement: OK
Performing node checks for 'c1'
Checking node attributes in confluent...
Checking network configuration for c1
c1 appears to have network configuration suitable for IPv4 deployment via: ens2f0
No issues detected with attributes of c1
Checking name resolution: OK

@jjohnson42

Following code change seems to work for me:

Pull request is welcome for that one. It has come up but we didn't quite get around to appending keys when dealing with multiple /root/.ssh/*.pub keys. https://github.com/xcat2/confluent/pulls

@jjohnson42

On the /etc/shadow issue: this is a consequence of confluent not being allowed to run as root, so for files like /etc/shadow, if syncing them is desired, you would need a copy readable by the confluent user. As an option, we frequently support syncing /etc/passwd and 'stubbing out' shadow so that those accounts are password-disabled.

@adrianreber
Member

On the /etc/shadow issue: this is a consequence of confluent not being allowed to run as root, so for files like /etc/shadow, if syncing them is desired, you would need a copy readable by the confluent user. As an option, we frequently support syncing /etc/passwd and 'stubbing out' shadow so that those accounts are password-disabled.

How could this be best automated in a recipe like we are trying to build here? Any recommendations?

@jjohnson42

I'd probably offer some example choices:
- Use 'Merge' support of /etc/passwd, do not include shadow. This will produce 'password disabled' instances of the users from passwd, for SSH-key-based access only.
- Give confluent read access to /etc/shadow.
- Make a blessed /etc/shadow copy for confluent to distribute.
- Use a separate mechanism or invocation to push out /etc/shadow (e.g. nodersync manually run as the root user can do it).

I think we were imagining the first option, that sync targets aren't interested in the passwords.

Note that root password is a node attribute and can be set in the confluent db. The default is to disable root password unless specified. If set during deploy, it will get that root password into shadow (though before syncfiles run).
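
For illustration, the first option could look roughly like this in the profile's syncfiles list (a hedged sketch; the syncfiles location and the MERGE: section syntax should be verified against the confluent documentation):

cat >> /var/lib/confluent/public/os/rocky-9.4-x86_64-default/syncfiles <<'EOF'
/etc/hosts
MERGE:
/etc/passwd
/etc/group
EOF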

@adrianreber
Member

Following code change seems to work for me:

Pull request is welcome for that one. It has come up but we didn't quite get around to appending keys when dealing with multiple /root/.ssh/*.pub keys. https://github.com/xcat2/confluent/pulls

xcat2/confluent#159

@adrianreber
Member

I'd probably offer some example choices:

  • Use 'Merge' support of /etc/passwd, do not include shadow. This will produce 'password disabled' instances of the users from passwd, for ssh key based access only
  • Give confluent read access to /etc/shadow
  • Make a blessed /etc/shadow copy for confluent to distribute
  • Use a separate mechanism or invocation to push out /etc/shadow (e.g. nodersync manually run as the root user can do it).

I think we were imagining the first option, that sync targets aren't interested in the passwords.

Note that root password is a node attribute and can be set in the confluent db. The default is to disable root password unless specified. If set during deploy, it will get that root password into shadow (though before syncfiles run).

As this recipe is contributed by you (upstream Confluent), I would let you decide how to design and implement it, with the proper warnings in the documentation; whatever makes the most sense for you. If the recipe results in a working cluster we are happy to include it. Maybe merge support makes sense, as we hardly use passwords anyway (if at all), or the blessed copy. I would defer to you and your experience on what makes the most sense.

@adrianreber
Member

With a chmod 644 /etc/shadow I have a workaround. We should still have a proper solution in the recipe to handle /etc/shadow.

The following things need to be fixed at this point:

  • the recipe needs to wait until the compute nodes are ready
  • epel-release needs to be installed on the compute nodes
  • ohpc-release needs to be installed on the compute nodes

For warewulf we do:

export CHROOT=/opt/ohpc/admin/images/rocky9.3
wwmkchroot -v rocky-9 $CHROOT
dnf -y --installroot $CHROOT install epel-release
cp -p /etc/yum.repos.d/OpenHPC*.repo $CHROOT/etc/yum.repos.d

As confluent first does the installation and then changes the running compute node, this approach will not work.
For Rocky and AlmaLinux something like this will work:

# nodeshell compute dnf -y  install epel-release
# nodeshell compute dnf -y  install http://repos.openhpc.community/OpenHPC/3/EL_9/x86_64/ohpc-release-3-1.el9.x86_64.rpm

The following commands are unnecessary or do not work:

# nodeshell compute dnf -y  install ntp
# nodeshell compute dnf -y  install  --enablerepo=powertools lmod-ohpc #powertools does not exist, it is called crb and already enabled earlier
# nodeshell compute systemctl restart nfs
c1: Failed to restart nfs.service: Unit nfs.service not found.
c2: Failed to restart nfs.service: Unit nfs.service not found.

This is needed: nodeshell compute dnf -y install nfs-utils

The existing /etc/hosts from the SMS is not synced to the compute nodes.

Besides the items mentioned here we seem to be able to get a cluster with two compute nodes running.

The nice thing for OpenHPC is that with this recipe we would finally have a stateful provisioning recipe again.

When we used to have an xCAT stateful recipe, it was explicitly marked as stateful; I am not sure how you want to handle this. Do you want one recipe that can do either stateful or stateless? Or two recipes?

@jjohnson42

So, if I'm understanding correctly, we need to wait for nodedeploy to show:

 # nodedeploy r3u23
r3u23: completed: alma-9.4-x86_64-default
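
(For illustration, a shell loop along these lines could implement that wait in the recipe; a sketch only, assuming the nodedeploy output format shown above:)

# block until every node in the 'compute' group reports 'completed'
while nodedeploy compute | grep -qv ': completed:'; do
    sleep 10
done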

Changes to syncfiles to include:
/etc/hosts
/etc/yum.repos.d/OpenHPC*.repo

And in post.d, to install epel-release

For nfs-utils, we could add it to the pkglist, or add a 'dnf -y install nfs-utils' as a 'post.d' script.

For diskless, maybe a different recipe. It will be more 'warewulf' like, with 'imgutil build' and 'imgutil exec'. There's also been a suggestion to make the 'installimage' script work for those instead of just clones.

@adrianreber
Member

/etc/yum.repos.d/OpenHPC*.repo $CHROOT/etc/yum.repos.d

Either install the repo file, which requires also copying the keys, or install the ohpc-release RPM via dnf.

@jjohnson42

@adrianreber To go back, did you want to do a pull request for the ssh key handling change, or did you want it done on your behalf? I kind of like the idea of the pull request to keep it clear who did what, but can just work it from your comment if preferred.

@adrianreber
Member

@adrianreber To go back, did you want to do a pull request for the ssh key handling change, or did you want it done on your behalf? I kind of like the idea of the pull request to keep it clear who did what, but can just work it from your comment if preferred.

I already did at xcat2/confluent#159

@jjohnson42

Thanks, sorry for not noticing sooner. I accepted it and amended it just a tad (to empty out the file before writing, and to use 'with' to manage opening/closing the files).

@jjohnson42

@adrianreber FYI, confluent 3.11.0 has been released including your change for ssh pubkey handling.

@tkucherera-lenovo
Contributor Author

@adrianreber Since the compute nodes are provisioned without internet access, running commands like nodeshell compute dnf -y install http://repos.openhpc.community/OpenHPC/3/EL_9/x86_64/ohpc-release-3-1.el9.x86_64.rpm would fail. Do you advise that we set up a NAT gateway on the master node to give the computes access to the internet, or should we follow what the xCAT recipe was doing, which is setting up a local copy of the OpenHPC repo and then configuring a repo that the computes can reach via the web roots xCAT would have set up? See here:

# Add OpenHPC repo mirror hosted on SMS
[sms](*\#*) psh compute dnf config-manager --add-repo=http://$sms_ip/$ohpc_repo_dir/OpenHPC.local.repo
# Replace local path with SMS URL
[sms](*\#*) psh compute "perl -pi -e 's/file:\/\/\@PATH\@/http:\/\/$sms_ip\/"${ohpc_repo_dir//\//"\/"}"/s' \
        /etc/yum.repos.d/OpenHPC.local.repo"

@adrianreber
Member

@adrianreber Since the compute nodes are provisioned without internet access, running commands like nodeshell compute dnf -y install http://repos.openhpc.community/OpenHPC/3/EL_9/x86_64/ohpc-release-3-1.el9.x86_64.rpm would fail. Do you advise that we set up a NAT gateway on the master node to give the computes access to the internet, or should we follow what the xCAT recipe was doing, which is setting up a local copy of the OpenHPC repo and then configuring a repo that the computes can reach via the web roots xCAT would have set up? See here:

# Add OpenHPC repo mirror hosted on SMS
[sms](*\#*) psh compute dnf config-manager --add-repo=http://$sms_ip/$ohpc_repo_dir/OpenHPC.local.repo
# Replace local path with SMS URL
[sms](*\#*) psh compute "perl -pi -e 's/file:\/\/\@PATH\@/http:\/\/$sms_ip\/"${ohpc_repo_dir//\//"\/"}"/s' \
        /etc/yum.repos.d/OpenHPC.local.repo"

Hmm, I see. In our test setup all nodes have internet access that is why I didn't really think about it.

I would say we mention somewhere in the documentation that the nodes need internet access for all the steps, and leave it to the user to configure NAT or a proxy or whatever. That would be the easiest solution and would be acceptable for me. As we do not talk about network setup or about securing the nodes or the head node, that sounds acceptable to me.

What do you think?

For our testing we actually set up a proxy server to reduce re-downloading of RPMs, so even with internet access we already change the network setup slightly.

@tkucherera-lenovo
Contributor Author

Having the nodes set up to access the internet also works for me.

tkucherera-lenovo force-pushed the confluent_slurm branch 3 times, most recently from b48b5a8 to 10fe611 on September 18, 2024
@tkucherera-lenovo
Contributor Author

@adrianreber I have made some changes to incorporate the discussions above:

  1. adding epel-release and the OpenHPC repo to the nodes
  2. installing nfs-utils on the computes
  3. syncing /etc/hosts
  4. fixing documentation bugs

Note: the error you were getting with nfs.service not being found could be because NFS is not installed on the master node. According to section 1.2 of the OpenHPC install guide, NFS is hosted on the master node, but I do not see in the guides, Warewulf or xCAT, where it is installed. Is it assumed that it is already installed? Please advise.

@adrianreber
Member

So with the latest changes I am able to run a full test suite with no errors. I still have to make some minor changes.

The following changes are currently still necessary:

  • dns_servers needs to be set, but it does not seem to be part of docs/recipes/install/rocky9/input.local.template. Can you add it to make sure users set it?
  • dns_domain also needs to be set. There is already a variable from previous xCAT recipes called domain_name. Can this be reused?
  • The assumption seems to be that all traffic goes through the SMS: net.ipv4_gateway=${sms_ip}. Can confluent automatically pick up the default gateway of the SMS during provisioning? Or how can we set the default gateway without depending on the SMS IP? Maybe introduce a new variable?
  • Remove --enablerepo=powertools. The name has changed to crb. What I am currently doing is: sed -e "s,epel-release,epel-release; /usr/bin/crb enable,g" -i "${recipeFile}". This way, each time the epel-release package is installed, the CRB repository is enabled. Please use /usr/bin/crb enable; that seems to be the recommended way of doing it.
  • We switched from gnu13 to gnu14. Please update the recipe to install the gnu14 variant of all packages. Maybe this will be solved by rebasing your PR.
  • Not a change for this PR, but the way /etc/profile.d/confluent_env.sh extends the MANPATH always adds an additional : at the end when $MANPATH is empty. This breaks one of our tests, but it is nothing you really need to change.
  • warewulf adds lines to /etc/hosts which our testing expects. I see there is confluent2hosts, but I was not able to get it running. Basically, an entry for each compute node in the format 10.1.5.132 c1 c1.local would be nice to have. What is the right way to call confluent2hosts? This is a step that could be included in the recipe (a manual fallback is sketched after this list).
  • Is there a way to automatically have the update repository active during compute node installation? That way I could remove one additional reboot from our test scripts. Currently what happens is that the compute node is installed, and during the recipe you run dnf -y update on all nodes, but the new packages are not active. If this could be part of the installation, that would be helpful.
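
A manual fallback for those /etc/hosts entries could look like this (this is not the confluent2hosts invocation being asked about, just a hedged sketch; the net.ipv4_address attribute name and the nodeattrib output format are assumptions to verify):

for node in c1 c2; do
    ip=$(nodeattrib "$node" net.ipv4_address | awk '{print $NF}')
    echo "$ip $node $node.local" >> /etc/hosts
done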

@adrianreber
Member

Oh, and please squash your commits. For a new feature like this it would make sense to have it all in one commit without fixup commits.

@@ -203,7 +203,7 @@ \subsubsection{Add \OHPC{} components} \label{sec:add_components}
[sms](*\#*) (*\chrootinstall*) kernel

# Include modules user environment
[sms](*\#*) (*\chrootinstall*) --enablerepo=powertools lmod-ohpc
[sms](*\#*) (*\chrootinstall*) /usr/bin/crb enable
Member

Without trying it, this is now missing the installation of lmod-ohpc.

@@ -156,7 +156,7 @@ \subsection{Enable \OHPC{} repository for local use} \label{sec:enable_repo}

% begin_ohpc_run
\begin{lstlisting}[language=bash,keywords={},basicstyle=\fontencoding{T1}\fontsize{8.0}{10}\ttfamily,literate={ARCH}{\arch{}}1 {-}{-}1]
[sms](*\#*) (*\install*) epel-release
[sms](*\#*) (*\install*) epel-release
Member

The recommendation when installing epel-release is to run /usr/bin/crb enable as a second command. My recommendation would be to install epel-release on the SMS and on the compute nodes, and to run /usr/bin/crb enable on the SMS and compute nodes as well.

@adrianreber
Member

Looks like something with the LaTeX content is broken. You probably have to escape underscores in names like ipv4_address and ipv6_address (i.e. ipv4\_address in the LaTeX source).

@adrianreber
Member

Can you do another squash and avoid the merge commit? Something like:

$ git pull --rebase

and then do the squashing? I will use this for one more test run, but it should be really close to ready, and smaller fixups can also be done later.
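
(For example, one common way to do the squash; a sketch only, assuming a remote named origin that points at openhpc/ohpc and its 3.x branch:)

git fetch origin
git rebase -i origin/3.x    # mark the fixup commits as 'squash' or 'fixup'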

@tkucherera-lenovo
Contributor Author

@adrianreber do you want me to squash those commits, including the merge commit, and have just one commit?

@adrianreber
Member

Yes, just a single commit and no merge commits.

@@ -31,12 +34,25 @@ bmc_password="${bmc_password:-unknown}"
# Additional time to wait for compute nodes to provision (seconds)
provision_wait="${provision_wait:-180}"

# Local domainname for cluster (xCAT recipe only)
# DNS Local domainname for cluster (xCAT and Confluent recipe only)
dns_servers="${dns_sersers:-172.30.0.254}"
Member

Here is a typo. It says "sersers". Please fix.

@@ -21,6 +21,9 @@ sms_eth_internal="${sms_eth_internal:-eth1}"
# Subnet netmask for internal cluster network
internal_netmask="${internal_netmask:-255.255.0.0}"

# ipv4 gateway
ipv4_gateway="${ipv4_gateway:-172.16.0.2}
Member

Closing " missing.

@adrianreber
Member

Sorry for being pedantic, but could you also rework the commit message? Currently it is the result of the squash. Just make it a single-commit message. The more information the better, but not what it is now; it has multiple "Signed-off-by" lines and some fixup information.

@adrianreber
Member

So, another test shows that besides the mentioned typo, the missing closing quote, and the commit message, this is ready.

Recipe to support using Confluent as a system manager and provisioner when setting up an OpenHPC cluster.

Signed-off-by: tkucherera <tkucherera@lenovo.com>
@tkucherera-lenovo
Contributor Author

@adrianreber I made the change and added a much more descriptive commit message. Thanks.

@adrianreber
Member

Thank you so much for working with us. I will wait for CI to do a last check, but then I will merge it.

adrianreber merged commit 7db68dd into openhpc:3.x on Oct 17, 2024. 20 checks passed.