-
Notifications
You must be signed in to change notification settings - Fork 304
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
add 'safe' slug scheme #744
Conversation
- always produces valid values - uses names, etc. as-is, when valid - uses stripped name--hash when needed
return True | ||
|
||
|
||
def is_valid_object_name(s): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This function ended up not being used because instead of multiple is_valid args, is_valid_default
is used, which validates the subset of object names and labels (same as object name with max length of 63 instead of 255)
Hi, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A few suggestions an questions.
Hi, |
@minrk I looked at this in order to help nudge it towards merge, but I see that its a very breaking PR to enable this for an existing hub as for example if self.enable_user_namespaces:
self.namespace = self._expand_user_properties(self.user_namespace_template)
self.pod_name = self._expand_user_properties(self.pod_name_template)
self.dns_name = self.dns_name_template.format(namespace=self.namespace, name=self.pod_name)
self.secret_name = self._expand_user_properties(self.secret_name_template)
self.pvc_name = self._expand_user_properties(self.pvc_name_template)
if self.working_dir:
self.working_dir = self._expand_user_properties(self.working_dir) The changes to the labels are far less breaking as it won't affect kubespawner/z2jh/binderhub/grafana-dashboards etc - and only be breaking for external software considering the labels. I'm not landing in a concrete action point here, just summarizing this realization. |
@consideRatio yeah, I think that's exactly why I didn't finish this. I think the new scheme is fine and works better, but figuring out the transition is the hard part. |
I'm going to try to pick this one up with #835 / #836 in mind as the transition path:
The hard part is still going to be transitioning things like PVC, pod name, etc. We need to make sure that once established, they don't change even if the scheme changes. I think this is true already for pod name. Persisting chosen values in Spawner state should do this for us, but will need to think it through and make sure everything is where it's supposed to be, and everything is persisted that should be. Looking at it, unfortunately pvc name is not persisted, so that's going to be hard. We may have to keep both schemes alive and check at runtime:
It may turn out that the best we can do is a breaking change with a documented manual migration step. PVCs can be renamed through the somewhat frightening process of:
or use krew rename-pvc (which does the same thing, but has the benefit of being written by not us). |
for more information, see https://pre-commit.ci
- safe scheme by default - preserve old scheme if found - persist pvc_name and attempt to handle lack of persistence from before - add handle_legacy_names and use_legacy_labels configuration for legacy labels
Thank you for picking this back up, @minrk! |
Thank you for working this @minrk!! These points may already be known and considered, but i wanted to highlight it when considering the paths onwards just in case. A external resource complexity to be aware of A key complexity for any change to username expansion is external resources, such as just creating a volume mount on the pod with subPath including username. We can't know if the external nfs volume includes the old username expansion or not, so we can't determine if its ok to switch over or not etc. Kubespawner itself only use component label i think I think the only use of labels currently by kubespawner itself is to reduce the set of resources to look at by the reflector using the component label. A performance aspect, because then the actual resource names are looked up among that set. |
Thanks for the notes!
Close. I think there's one more place: when services are enabled, labels are also used to route service traffic to the pod, which also uses the full label set (I'm not sure that's the right choice, but it should at least include user and server names). Now that this mostly works, I think there are some problems in the switch to handling escape of whole strings at a time:
That said, we cannot guarantee correct strings if we don't handle the full slug. I think it's simply not feasible to have one I don't really know how to address this. I think it only makes sense to apply the 'safe' slug scheme to kubernetes resource names and labels, and not in other contexts. It usually doesn't make sense to apply the escaping to mounts or subdirectories, but sometimes it will. The rules, however, are completely different and not really knowable. Applying the new scheme only to resource names I'm going to give this a read through, and instead of changing what
|
Thank you for diligently working through this hard problem, @minrk. Let's bump a major version whenever we release this. |
no need to make legacy component labels optional
Getting closer, but still tricky stuff to think about:
The |
- add user_server to template namespace - add `safe_` prefixed fields to template namespace using new 'safe' slug scheme - add pod_name, namespace, pvc_name to template namespace
I think I'm narrowing it down to a scheme we can safely and comprehensibly switch to. I mainly need to hash out the transition process and write out some tests and docs for how that should work. In all cases, current behavior can be preserved unchanged explicitly by replacing In general, we have:
I've also removed the config to make the labels in #835 optional, which I don't think we need as covered by @yuvipanda's comments over there. I feel like I am getting a good handle on resource names (pod_name, pvc_name), but figuring out the right thing to do for work_dir is a good deal more complex. Part of the trouble is that resource names are a safer transition because we can check if a resource at the old name exists and use it. We can also use state persistence to keep things sticky across config changes. We can't check if a filesystem path exists in time to make a choice. Filesystems also tend to have very different, much more permissive, rules than k8s object names, but they can vary quite a bit! So what should we do for work_dir?
|
If it helps we can potentially force admins to make a deliberate decision when upgrading Z2JH. If we add a new Z2JH config value to the schema and deliberately set it to an invalid value in values.yaml an admin must set the config. It's inconvenient for anyone expecting a seamless upgrade, but the schema validation should happen before anything is deployed so it shouldn't break anything. |
- upgrade from 6.x after default slug scheme change preserves pvc attachment - old name check doesn't happen if loaded state from more kubespawner 7
I wouldn't call this "done" by any means, but think it's now at least "complete" in that I have a pretty good understanding of what's changed, what works on upgrade and what requires user intervention, it's documented more thoroughly than before, and the new upgrade logic is tested and explained. So I think this is ready for review! |
Amazing effort into this @minrk!!! I'm looking to help drive review effort and one of the key things I knew I wanted to look into was how this affected providing user storage via NFS, so this comment got dedicated to that as a first step. NFS storage considerationsWhen using NFS storage, one may have a single PVC configured with a Click to see example z2jh / kubespawner configThis is z2jh Helm chart config: jupyterhub:
singleuser:
storage:
type: static
static:
pvcName: home-nfs
subPath: "{username}" Via a z2jh provided jupyterhub_config.py, this leads to KubeSpawner config: c.KubeSpawner.volumes = [
{
"name": "home",
"persistentVolumeClaim": {
"claimName": "home-nfs",
},
},
],
c.KubeSpawner.volume_mounts = [
{
"mountPath": "/home/jovyan",
"name": "home",
"subPath": "{username}", # this is whats important
},
] In such situations, A quick but essential config change before upgrading to z2jh v4 / kubespawner v7 would be to do a change like this: jupyterhub:
singleuser:
storage:
type: static
static:
pvcName: home-nfs
- "subPath": "{username}"
+ "subPath": "{escaped_username}" For new hubs though, it would make sense to do Action points
|
Migration consideration notesIn case one wants to migrate some resource, for example NFS storage folder names, one may need to identify how Expand to read moreDetecting if
|
kubespawner/spawner.py
Outdated
handle_legacy_names = Bool( | ||
True, | ||
config=True, | ||
help="""handle legacy names and labels | ||
|
||
kubespawner 7 changed the scheme for computing names and labels to be more reliably valid. | ||
In order to preserve backward compatibility, the old names must be handled in some places. | ||
|
||
You can safely disable this if no PVCs were created or running servers were started | ||
before upgrading to kubespawner 7. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The config handle_legacy_names
is only used to handle a transition to safe name expansions for PVCs, not for anything else like the leading help text indicates with handle legacy names and labels
.
I think updating this to describe its just about managing the PVC naming makes sense rather than trying to handle anything beyond the PVC naming.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I intended to have this for a catch-all config "check if the upgraded defaults might have detached something without user input," but PVCs were the only thing I found where there was something to be done. Maybe namespace should be handled as well?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
namespaces don't need to be handled because they were persisted in 6.x, so I think this is okay as-is.
I chatted with @minrk about this just now, here is some planning made explicit:
Notes fot he future:
|
since it is an attempt to 'remember' an old name from kubespawner 6 which never remembers
I determined that |
Naturally, tests have started to fail for unrelated reasons on the last commit. Looking into it. |
Fixed tests with #846 |
Thank you for amazing effort into this @minrk!! I'm on vacation for two weeks now, but hope to find time to contribute towards release of beta and testing anyhow |
closes #737
closes #498
implements scheme described here.
-2d
)name---hash
when neededKubeSpawner.slug_scheme = 'escape'
Potential issues to address: