Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RedshiftSQLHook does not work with iam=True #35805

Closed
1 of 2 tasks
scnerd opened this issue Nov 22, 2023 · 3 comments · Fixed by #35897
Closed
1 of 2 tasks

RedshiftSQLHook does not work with iam=True #35805

scnerd opened this issue Nov 22, 2023 · 3 comments · Fixed by #35897
Assignees
Labels
area:providers kind:feature Feature Requests provider:amazon-aws AWS/Amazon - related issues

Comments

@scnerd
Copy link

scnerd commented Nov 22, 2023

Apache Airflow version

2.7.3

What happened

When RedshiftSQLHook attempts to auto-fetch credentials when iam=True, it uses a cluster-specific approach to obtaining credentials, which fails for Redshift Serverless.

Traceback (most recent call last):
  File "/usr/local/lib/python3.11/site-packages/airflow/providers/common/sql/operators/sql.py", line 280, in execute
    output = hook.run(
             ^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/airflow/providers/common/sql/hooks/sql.py", line 385, in run
    with closing(self.get_conn()) as conn:
                 ^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/airflow/providers/amazon/aws/hooks/redshift_sql.py", line 173, in get_conn
    conn_params = self._get_conn_params()
                  ^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/airflow/providers/amazon/aws/hooks/redshift_sql.py", line 84, in _get_conn_params
    conn.login, conn.password, conn.port = self.get_iam_token(conn)
                                           ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/airflow/providers/amazon/aws/hooks/redshift_sql.py", line 115, in get_iam_token
    cluster_creds = redshift_client.get_cluster_credentials(
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/botocore/client.py", line 535, in _api_call
    return self._make_api_call(operation_name, kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/botocore/client.py", line 980, in _make_api_call
    raise error_class(parsed_response, operation_name)
botocore.errorfactory.ClusterNotFoundFault: An error occurred (ClusterNotFound) when calling the GetClusterCredentials operation: Cluster *** not found.

What you think should happen instead

The operator should establish a connection to the serverless workgroup using IAM-obtained credentials using redshift_connector.

How to reproduce

Create a direct SQL connection to Redshift using IAM authentication, something like:

{"conn_type":"redshift","extra":"{\"db_user\":\"USER\",\"iam\":true,\"user\":\"USER\"}","host":"WORKGROUP_NAME.ACCOUNT.REGION.redshift-serverless.amazonaws.com","login":"USER","port":5439,"schema":"DATABASE"}

Then use this connection for any SQLExecuteQueryOperator. The crash should occur when establishing the connection.

Operating System

Docker, amazonlinux:2023 base

Versions of Apache Airflow Providers

This report applies to apache-airflow-providers-amazon==8.7.1, and the relevant code appears unchange in the master branch. The code I'm using worked for Airflow 2.5.2 and version 7.1.0 of the provider.

Deployment

Amazon (AWS) MWAA

Deployment details

Local MWAA runner

Anything else

The break seems to occur because the RedshiftSQLHook integrates the IAM -> credential conversion, which used to occur inside redshift_connector.connect. The logic is not as robust and assumes that the connection refers to a Redshift cluster rather than a serverless workgroup. It's not clear to me why this logic was pulled up and out of redshift_connector, but it seems like the easiest solution is just to let redshift_connector handle IAM authentication and not attempt to duplicate that logic in the airflow provider.

Are you willing to submit PR?

  • Yes I am willing to submit a PR!

Code of Conduct

@scnerd scnerd added area:core kind:bug This is a clearly a bug needs-triage label for new issues that we didn't triage yet labels Nov 22, 2023
Copy link

boring-cyborg bot commented Nov 22, 2023

Thanks for opening your first issue here! Be sure to follow the issue template! If you are willing to raise PR to address this issue please do so, no need to wait for approval.

@Taragolis
Copy link
Contributor

Taragolis commented Nov 23, 2023

@scnerd would you change issue title to less broad and which could describe your problem in short


In addition I have a look boto3 documentation for RedshiftServerless.Client and I guess nothing could be done until same method as Redshift.Client.get_cluster_credentials would add to RedshiftServerless.Client

I was a blind, maybe RedshiftServerless.Client.get_credentials is analogue to Redshift.Client.get_cluster_credentials

@Taragolis Taragolis added kind:bug This is a clearly a bug provider:amazon-aws AWS/Amazon - related issues area:providers pending-response and removed kind:bug This is a clearly a bug area:core labels Nov 23, 2023
@eladkal eladkal changed the title apache-airflow-providers-amazon RedshiftSQLHook does not work with iam=True Nov 25, 2023
@hussein-awala hussein-awala self-assigned this Nov 27, 2023
@hussein-awala
Copy link
Member

@scnerd I just created #35897 to fix this issue; it would be great if you could test it (if you have a created cluster).

@hussein-awala hussein-awala added kind:feature Feature Requests and removed pending-response needs-triage label for new issues that we didn't triage yet kind:bug This is a clearly a bug labels Nov 27, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area:providers kind:feature Feature Requests provider:amazon-aws AWS/Amazon - related issues
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants