
Chart: add POD_NAME env for leader election #2039

Merged — 1 commit merged into kubeflow:master on May 31, 2024

Conversation

@Aakcht (Contributor) commented May 31, 2024

Purpose of this PR

PR #1983 made the leader election resource lock work based on POD_NAME identity (see the changes in the main.go file). However, the POD_NAME environment variable was not being passed to the spark operator pod. This results in the error described in #1987.

Resolves #1987.

Proposed changes:
This PR adds the POD_NAME environment variable to spark operator pods when leader election is enabled.
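The change can be sketched as a downward-API env entry in the chart's Deployment template (the values key and surrounding template structure here are illustrative assumptions, not the chart's exact code):

```yaml
# Illustrative sketch: inject the pod's own name as POD_NAME via the
# Kubernetes downward API, only when leader election is enabled.
{{- if .Values.leaderElection.lockName }}
env:
  - name: POD_NAME
    valueFrom:
      fieldRef:
        fieldPath: metadata.name
{{- end }}
```

With this in place, each replica's lock identity is its unique pod name, so two replicas can contend for leadership without crashing on a missing identity.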

Change Category

Indicate the type of change by marking the applicable boxes:

  • Bugfix (non-breaking change which fixes an issue)
  • Feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that could affect existing functionality)
  • Documentation update

Rationale

Checklist

Before submitting your PR, please review the following:

  • I have conducted a self-review of my own code.
  • I have updated documentation accordingly.
  • I have added tests that prove my changes are effective or that my feature works.
  • Existing unit tests pass locally with my changes.

Additional Notes

Signed-off-by: aakcht <aakcht@gmail.com>
@vara-bonthu (Contributor)

@Aakcht Could you confirm if you have tested and deployed the Helm chart with this PR? Please provide evidence here

@Aakcht (Contributor, Author) commented May 31, 2024

Hi @vara-bonthu! Yes, I tested that with this change and replicaCount set to 2, the problem from #1987 goes away and the spark-operator deploys successfully.
Also, just in case, I tested that with replicaCount set to 1 nothing breaks (since #1987 does not affect deployments with replicaCount: 1).

Please tell me if some other evidence is needed.

@yuchaoran2011 (Contributor) left a comment

/lgtm. This is a good catch. Thanks for the fix

@yuchaoran2011 (Contributor) left a comment

/lgtm


[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: yuchaoran2011

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@google-oss-prow google-oss-prow bot merged commit 3c75376 into kubeflow:master May 31, 2024
7 checks passed
@Aakcht Aakcht deleted the add_pod_name_env branch June 3, 2024 05:53
sigmarkarl pushed a commit to spotinst/spark-on-k8s-operator that referenced this pull request Aug 7, 2024
Signed-off-by: aakcht <aakcht@gmail.com>
jbhalodia-slack pushed a commit to jbhalodia-slack/spark-operator that referenced this pull request Oct 4, 2024
Signed-off-by: aakcht <aakcht@gmail.com>
Successfully merging this pull request may close these issues.

[BUG] spark-operator:v1beta2-1.4.3-3.5.0 crashes on start
3 participants