New helm3 chart problem with nodeSelector and tolerations #1085

Closed
jkleckner opened this issue Nov 25, 2020 · 5 comments

@jkleckner
Contributor

I'm finding that nodeSelector and tolerations don't work when declared in the executor spec.

The nodeSelector works if it is put directly under the spec and the label is fully qualified with kubernetes.io/, for example kubernetes.io/hostname: "docker-desktop". Tolerations don't work no matter which paths I try.
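A minimal sketch of the placement described as working, assuming the `nodeSelector` sits directly under `spec` with the fully qualified key mentioned above. The attached manifest is not reproduced in this thread, so all surrounding fields are omitted:

```yaml
spec:
  # Spec-level selector, reported in this thread to reach the driver and executor pods.
  nodeSelector:
    kubernetes.io/hostname: "docker-desktop"
```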

Originally posted by @jkleckner in #1061 (comment)

@jkleckner
Contributor Author

I'm testing with docker-desktop with K8s Rev: v1.19.3 on my Mac.

From a clean k8s state in docker, run:
helm install spark-operator .../github.com/GoogleCloudPlatform/spark-on-k8s-operator/charts/spark-operator-chart --namespace spark-operator --set sparkJobNamespace=spark --set rbac.create=true --set enableWebhook=true --set enableMetrics=true

Then apply the attached YAML and observe that the executor and driver pods have the node selector but not the tolerations.
I tried combinations of putting kubernetes.io/, node.kubernetes.io/, or nothing in the key path.

Cc: @hagaibarel
spark-pi-spark-nodesel.txt

@hagaibarel
Contributor

Hi @jkleckner, as far as I understand, this isn't an issue with the chart itself but with how the SparkApplication is defined. From looking at the attached .txt file, I think the issue is that you're defining the tolerations and nodeSelector under .spec and not in the executor / driver sections.

Try the following:

spec:
# other sections here
  driver:
    nodeSelector:
      kubernetes.io/hostname: "docker-desktop"
    tolerations:
    - key: "node.kubernetes.io/spark-executor"
      operator: Equal
      value: "true"
      effect: NoSchedule
# more config below

Do the same for the executor.
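A sketch of the matching executor block, mirroring the driver example above. The node name and toleration key are copied from that example and are assumptions about the cluster, not confirmed values:

```yaml
spec:
  # other sections here
  executor:
    nodeSelector:
      kubernetes.io/hostname: "docker-desktop"
    tolerations:
      # Must match a taint actually present on the target node.
      - key: "node.kubernetes.io/spark-executor"
        operator: Equal
        value: "true"
        effect: NoSchedule
```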

@jkleckner
Contributor Author

That was what I had for my original spark app spec that worked with the old chart:

spec:
  # ...
  executor:
    instances: 5     # Number of pods
    # ...
    labels:
      version: 2.4.7
      spark-executor: "true"
    nodeSelector:
      spark-executor-large: "true"
    tolerations:
    - key: "spark-executor"
      operator: Equal
      value: "true"
      effect: NoSchedule

I tried many combinations and found that I could get the nodeSelector to work if it was moved up to sit directly under the spec, but I couldn't get tolerations to propagate even after qualifying them with kubernetes.io/.

I also found that installing with --set enableWebhook=true is not enough: --set webhook.enable=true is also needed to get the webhook to actually install.
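Expressed as a Helm values file, the flags used in this thread would look roughly like the sketch below. The file name `values-webhook.yaml` is hypothetical, and whether the chart reads `enableWebhook`, `webhook.enable`, or both depends on the chart version, so both are set here as described above:

```yaml
# values-webhook.yaml (hypothetical file name)
sparkJobNamespace: spark
rbac:
  create: true
enableMetrics: true
# Both webhook switches are set, per the observation above.
enableWebhook: true
webhook:
  enable: true
```

Such a file could be passed with `-f values-webhook.yaml` in place of the individual `--set` flags. The operator's user guide notes that pod settings such as tolerations require the mutating admission webhook, which would be consistent with tolerations going missing when the webhook is not installed.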

This is a regression from the previous behavior of the chart.

Anyway, I ended up reverting to the deprecated 0.8.6 chart and installing with that from a local copy.

I may try this again when there is documentation.

It might have been better to split this chart change into two steps: first "just copy the 0.8.6 chart into the repo", then a separate "upgrade this chart to make it better" with descriptions of the behavior changes.


This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.


github-actions bot commented Nov 3, 2024

This issue has been automatically closed because it has not had recent activity. Please comment "/reopen" to reopen it.

github-actions bot closed this as not planned on Nov 3, 2024.