Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix for EksCreateClusterOperator deferrable mode #36079

Conversation

syedahsn
Copy link
Contributor

@syedahsn syedahsn commented Dec 6, 2023

This PR addresses the issues that were discovered out in the EksCreateClusterOperator deferrable mode. There are several changes made.

  1. EksCreateClusterTrigger did not set a status as failed, but the operator expected a failed status if the cluster creation did not go as expected. I've added a new run method that yields a failed status on failure.
  2. In deferrable_create_cluster_next, there was a bug where we were assuming the compute type could only be fargate or nodegroup. This is not true; the compute type can be None as well. This is fixed by adding an explicit check.
  3. Fixed a typo in the operator, and raised an exception on cluster creation failure
  4. In the EksDeleteClusterTrigger, there was a bug where the cluster would only get deleted if force_delete_compute was True.
    closes: Required to fix error handling in EksCreateClusterOperator in deferrable mode #34844
    related: Fix error handling in EksCreateClusterOperator in deferrable mode #34966

^ Add meaningful description above
Read the Pull Request Guidelines for more information.
In case of fundamental code changes, an Airflow Improvement Proposal (AIP) is needed.
In case of a new dependency, check compliance with the ASF 3rd Party License Policy.
In case of backwards incompatible changes please leave a note in a newsfragment file, named {pr_number}.significant.rst or {issue_number}.significant.rst, in newsfragments.

@boring-cyborg boring-cyborg bot added area:providers provider:amazon-aws AWS/Amazon - related issues labels Dec 6, 2023
Copy link
Contributor

@Taragolis Taragolis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me, just small NIT/suggestion

@@ -253,6 +253,7 @@ def __init__(
self.fargate_selectors = fargate_selectors or [{"namespace": DEFAULT_NAMESPACE_NAME}]
self.fargate_profile_name = fargate_profile_name
self.deferrable = deferrable
self.eks_hook = EksHook(aws_conn_id=self.aws_conn_id, region_name=self.region)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Small suggestion, maybe we could use hook property as we use in other boto3 related hooks, so it become more consistent to other hooks?

E.g.

@cached_property
def hook(self) -> EksHook:
    return EksHook(aws_conn_id=self.aws_conn_id, region_name=self.region)

@property
def eks_hook(self):
    warning.warn(
         "`eks_hook` property is deprecated and will be removed in the future. "
         "Please use `hook` property instead.",
        AirflowProviderDeprecationWarning
        stacklevel=2,
    )
    return self.eks_hook

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1

self.log.info("Cluster deleted")
raise event["exception"]
raise AirflowException("Error creating cluster")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
raise AirflowException("Error creating cluster")
raise AirflowException("Error creating cluster")

Small suggestion to move raise an error out of the condition block, I guess execute_failed method always expected to raise an error, so just in case raise an error even if status not None or not "deleted", which I guess not possible right now

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a good point. It shouldn't ever be the case that we don't get a deleted status, but there's no reason not to move the exception out of the block

…ow waiting for cluster to be deleted if the delete has already been initiated.
@vincbeck
Copy link
Contributor

vincbeck commented Dec 7, 2023

Static check failure

@syedahsn
Copy link
Contributor Author

syedahsn commented Dec 7, 2023

Sorry missed that one. Everything is passing now

@vincbeck vincbeck merged commit 7329e9e into apache:main Dec 7, 2023
50 checks passed
@vincbeck vincbeck deleted the syedahsn/ekscreateclusteroperator-deferrable-fix branch December 7, 2023 20:47
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area:providers provider:amazon-aws AWS/Amazon - related issues
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Required to fix error handling in EksCreateClusterOperator in deferrable mode
3 participants