Can't use hugepage if UserNamespaceSupport is enabled #1380
Comments
would it be possible for you to grab the What kernel are you using? It needs to support idmapped mounts for tmpfs (kernel > 6.3)
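A minimal sketch of the kernel check mentioned above, assuming the release string looks like the output of `uname -r`. The helper names `kernel_version` and `kernel_at_least` are hypothetical, and treating 6.3 itself as sufficient is an assumption about how to read "kernel > 6.3"; adjust the threshold if that reading is wrong.

```python
import re


def kernel_version(release):
    """Extract (major, minor) from a `uname -r` style string."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError(f"unrecognized kernel release: {release!r}")
    return int(match.group(1)), int(match.group(2))


def kernel_at_least(release, minimum=(6, 3)):
    """Return True if the kernel release is at or above `minimum`.

    The (6, 3) default is an assumption taken from the comment above
    about idmapped mounts for tmpfs.
    """
    return kernel_version(release) >= minimum
```

On a live host the release string could come from `platform.release()`. Note that this only covers tmpfs; it says nothing about whether other file systems support idmapped mounts.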
Hi @giuseppe, sorry for the late response. The container is not running at all, so I can't get the container ID for it. The kernel I'm using is 6.5.
More info about my setup
When I launch the same pod without the hostUsers (user namespace) feature, it works and here's the info:
But even in a running pod/container, I can't see the config.json file:
I can get the config.json in the following way; let me know if that's helpful:
Thanks, that is the default config.json file generated by crun, so it doesn't help in this case. What is the underlying file system? The file system might not support idmapped mounts.
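One way to answer the file system question is to look up the mount that backs a path in `/proc/self/mountinfo`. The following is a rough sketch, not something taken from this thread: the helper name `fs_type_for` and the sample mount table are hypothetical, and it ignores the `\040` escaping that mountinfo uses for spaces in paths.

```python
def fs_type_for(path, mountinfo_text):
    """Return the file system type of the mount that contains `path`.

    `mountinfo_text` is the content of /proc/self/mountinfo. Each line has
    the mount point as its fifth field and the file system type as the
    first field after the " - " separator.
    """
    best = None
    for line in mountinfo_text.splitlines():
        left, sep, right = line.partition(" - ")
        if not sep:
            continue
        left_fields = left.split()
        right_fields = right.split()
        if len(left_fields) < 5 or not right_fields:
            continue
        mount_point = left_fields[4]
        fs_type = right_fields[0]
        prefix = mount_point.rstrip("/") + "/"
        if path == mount_point or path.startswith(prefix):
            # Prefer the longest mount point; on ties the later entry wins,
            # since it is stacked on top of the earlier one.
            if best is None or len(mount_point) >= len(best[0]):
                best = (mount_point, fs_type)
    return best[1] if best else None


# Hypothetical mount table for illustration only.
SAMPLE_MOUNTINFO = (
    "22 1 8:1 / / rw,relatime shared:1 - ext4 /dev/sda1 rw\n"
    "35 22 0:31 / /dev/hugepages rw,relatime shared:15 - hugetlbfs hugetlbfs rw,pagesize=2M\n"
)
```

On a real host, read `/proc/self/mountinfo` and pass its content together with the source path of the volume in question.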
Sure, Is there anything specific I should check?
This is probably an issue in Kubernetes, and we should not be using idmapped mounts with the hugetlb mount. @rata have you ever seen this issue before?
@giuseppe no, I haven't seen this before.

I've been playing with this locally, and it seems that huge pages use the hugetlbfs file system, which can't be idmapped, so the pod fails to start. It seems to be that simple.

IMHO the path forward might be to document this in kube (although we say the fs needs to be supported, it might not be clear that huge pages use a different fs), improve the errors in crun/runc, and add support in Linux for idmapped mounts on hugetlbfs. @giuseppe what do you think?

Here is a more detailed version of what I did to reach that conclusion. I created a repro locally, based on the pod here (with small adjustments, as the image pull secrets and the like didn't exist here). You can configure huge pages as explained here: https://kubernetes.io/docs/tasks/manage-hugepages/scheduling-hugepages/. Start the pod without userns first to check that it works as expected. With userns it fails to start with this error (this is using containerd and runc from main, but something similar should happen with crio and crun):
When checking on the host, the source is a hugetlbfs:
I've also captured the config.json, although there is nothing especially interesting in it, just the /huge bind mount whose file system is hugetlbfs:
Opened PRs to runc and crun to improve the error messages they show. I think the error is now clear enough that no more docs are needed on kube. The remaining thing would be to add idmapped mount support to hugetlbfs. What do others think?
SGTM
Thanks for the patch. I am closing the issue because there is not much more to do on the OCI runtime side. Idmapped mount support must be added to the hugetlb file system in the kernel for this to work.
I'll propose giuseppe/linux@3592ce4 upstream to add idmapped mounts support to hugetlbfs
@brauner is giuseppe/linux@3592ce4 something you could pull to your tree (assuming you are fine with it) or should I submit it to lkml + hugetlb maintainers?
On Fri, Dec 29, 2023 at 01:58:40PM -0800, Giuseppe Scrivano wrote:
@brauner is giuseppe/linux@3592ce4 something you could pull to your tree (assuming you are fine with it) or should I submit it to lkml + hugetlb maintainers?
Yeah, I can take that!
@giuseppe thank you so much for this. By the way, I tried the same with an older kernel, 5.14.0-70.30.1.el9_0.x86_64 (with the older k8s feature gate UserNamespacesStatelessPodsSupport), and hugepages seem to work fine. (The rest of the mounts, such as ConfigMaps and Secrets, are not supported yet on this version.)
@ikwork yes, older k8s releases don't require idmapped mounts, which is why it worked. Going forward we do require them, because there were some limitations and concerns from other SIGs, and we need them anyway for persistent volumes.
Thanks! I'll send you the patch by email.
Hi,
I'm using "hostUsers: false" in my pods to enable the UserNamespacesSupport feature:
https://kubernetes.io/docs/concepts/workloads/pods/user-namespaces/
All other volumeMounts are working fine except hugepages.
I'm using crun version 1.9 (also tried version 1.12) with crio 1.28.1
My pod spec:
Error from kubelet