Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix: Implement support for fetchone() in the ODBCHook and the Databricks SQL Hook #36161

Merged

Conversation

Joffreybvn
Copy link
Contributor

This PR implements support for fetchone() in the ODBCHook and the Databricks SQL Hook. Fixing the bug described in #32319 (comment). It also refactors and improves a bit the docstring.


^ Add meaningful description above
Read the Pull Request Guidelines for more information.
In case of fundamental code changes, an Airflow Improvement Proposal (AIP) is needed.
In case of a new dependency, check compliance with the ASF 3rd Party License Policy.
In case of backwards incompatible changes please leave a note in a newsfragment file, named {pr_number}.significant.rst or {issue_number}.significant.rst, in newsfragments.

@potiuk
Copy link
Member

potiuk commented Dec 11, 2023

The static check handled in #36162

@potiuk potiuk merged commit 36010f6 into apache:main Dec 11, 2023
49 of 50 checks passed
"""Transform the databricks Row objects into a JSON-serializable list of rows."""
if result is not None:
"""Transform the databricks Row objects into JSON-serializable lists."""
if isinstance(result, list):
Copy link
Contributor

@utkarsharma2 utkarsharma2 Dec 13, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

statement = f"DESCRIBE TABLE {table.name};"
hook.run(statement, parameters=parameters, handler=lambda x: x.fetchall())

with this code hook.run() will return [[val1, val2, val3],[val1, val2, val3], [val1, val2, val3]] instead of [row(), row(), row()] isn't this a breaking change??

Copy link
Contributor

@utkarsharma2 utkarsharma2 Dec 13, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One possible solution is to inherit from row class and introduce deserialize and serialize method that way we can deal with this in a similar way as dataframes.

@potiuk
Copy link
Member

potiuk commented Dec 13, 2023

@Joffreybvn -> what's your take ?

@Joffreybvn
Copy link
Contributor Author

Joffreybvn commented Dec 13, 2023

Just to clarify, the support for fetchall - with the breaking change - was implemented in #32319 .

But yes, it is a breaking change. For ODBC, the Row became a NamedTuple - which is a very 'soft' breaking change. For Databricks, there is no NamedTuple because the 'make_serializable' method from #31780, which was turning Row into tuples already, was simply moved down from the Operator into the Hook.

Thus, I can make it a NamedTuple too. Adding serialize and deserialize was also discussed in the first PR, and led to this ADR. If Databricks wants to comply with it, it should return a tuple / namedtuple.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants