Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support decimal, date, time, timestamp with time zone and timestamp python types in prepared statements and result sets #160

Merged
merged 1 commit into from
Mar 14, 2022

Conversation

mdesmet
Copy link
Contributor

@mdesmet mdesmet commented Feb 13, 2022

Fixes #150 , #32

@mdesmet
Copy link
Contributor Author

mdesmet commented Feb 14, 2022

As mentioned in #32, this change will impact the behavior on existing codebases. Should we provide a feature flag for this improved type detection?

@mdesmet
Copy link
Contributor Author

mdesmet commented Feb 15, 2022

Time with time zone (utc offset) doesn't really exist as a Python type. It's possible to have time with a timezone, but however without an actual moment in time, it's impossible to know the UTC offset. So I don't think this is something we can support for now.

@mdesmet mdesmet force-pushed the feature/improve_types branch 2 times, most recently from 67c104d to a02bc6d Compare February 17, 2022 20:22
tests/integration/test_dbapi_integration.py Outdated Show resolved Hide resolved
tests/integration/test_dbapi_integration.py Outdated Show resolved Hide resolved
tests/integration/test_dbapi_integration.py Outdated Show resolved Hide resolved
Copy link
Member

@hashhar hashhar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This introduces regression regarding range of values that can be represented and hence needs a toggle somewhere - maybe add an alternative Cursor implementation?

Left some other comments regarding tests.

tests/integration/test_dbapi_integration.py Show resolved Hide resolved
tests/integration/test_dbapi_integration.py Show resolved Hide resolved
tests/integration/test_dbapi_integration.py Outdated Show resolved Hide resolved
tests/integration/test_dbapi_integration.py Outdated Show resolved Hide resolved
tests/integration/test_dbapi_integration.py Show resolved Hide resolved
@mdesmet mdesmet force-pushed the feature/improve_types branch 2 times, most recently from 6d3f47b to a804244 Compare February 27, 2022 17:05
@mdesmet
Copy link
Contributor Author

mdesmet commented Feb 27, 2022

To convert named timezones i have added pytz as dependency. As we support older python versions, it seems easier to just use pytz, which was already a test dependency. Maybe we should make this an optional dependency as it's only required for when the flag is activated. Please advice.

Starting from python 3.9 we could replace this with the standard zoneinfo (however seems still need an additional library on Windows: https://tzdata.readthedocs.io/en/latest/).

Copy link
Member

@hovaesco hovaesco left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we should make this an optional dependency as it's only required for when the flag is activated.

Let's leave it as a required dependency since e.g I want to continue to use old python types and then switch to the new one.

setup.py Show resolved Hide resolved
trino/client.py Show resolved Hide resolved
trino/client.py Show resolved Hide resolved
@mdesmet mdesmet force-pushed the feature/improve_types branch 4 times, most recently from 48a19eb to bf99688 Compare February 28, 2022 16:30
@hashhar hashhar self-requested a review February 28, 2022 19:43
README.md Outdated Show resolved Hide resolved
trino/client.py Outdated
elif "timestamp" in raw_type:
return datetime.strptime(value, "%Y-%m-%d %H:%M:%S.%f")
elif "time with time zone" in raw_type:
return datetime.strptime(value, "%H:%M:%S.%f").time()
Copy link
Contributor

@john-bodley john-bodley Mar 7, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't this include timezone information?, i,.e., lines 537 and 539 are the same.

Note the dateutil.parser.parse method could be useful and replace the logic in lines 527–539, somewhat ignoring the specific raw_type, i.e., value should contain the timezone information if the raw_type is timestamp with time zone or time with time zone, i.e.,

from dateutil.parser import parse

if raw_type in ("time",  "time with time zone", "timestamp", "timestamp with time zone"):
    result = parse(value)
    
    if raw_type in ("time",  "time with time zone"):
        result = result.time().replace(tzinfo=result.tzinfo)
        
    return result

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's indeed a gap in the testing as well. On the library: currently we use pytz, and the format we will get is predictable so maybe no need to swap out the library.

trino/client.py Outdated
elif "timestamp" in raw_type:
return datetime.strptime(value, "%Y-%m-%d %H:%M:%S.%f")
elif "time with time zone" in raw_type:
return datetime.strptime(value, "%H:%M:%S.%f").time()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't this include timezone information?

Note the dateutil.parser.parse method could be useful and replace the logic in lines 527–539, somewhat ignoring the specific raw_type, i.e., value should contain the timezone information if the raw_type is timestamp with time zone or time with time zone, i.e.,

from dateutil.parser import parse

if raw_type in ("time",  "time with time zone", "timestamp", "timestamp with time zone"):
    result = parse(value)
    
    if raw_type in ("time",  "time with time zone"):
        result = result.time().replace(tzinfo=result.tzinfo)
        
    return result

trino/dbapi.py Show resolved Hide resolved
trino/dbapi.py Show resolved Hide resolved
Copy link
Member

@hashhar hashhar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few remaining comments. Thanks for the work on this @mdesmet.

Looks almost good to go.

@hashhar hashhar requested review from ebyhr and removed request for findepi March 8, 2022 07:26
Copy link
Member

@hashhar hashhar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM % last batch of comments.

trino/client.py Outdated
Comment on lines 537 to 544
matches = re.match(r'^(.*)([\+\-])(\d{2}):(\d{2})$', value)
assert matches is not None
if matches.group(2) == '+':
tz = timedelta(hours=int(matches.group(3)), minutes=int(matches.group(4)))
else:
tz = -timedelta(hours=int(matches.group(3)), minutes=int(matches.group(4)))
return datetime.strptime(matches.group(1), "%H:%M:%S.%f").time().replace(tzinfo=timezone(tz))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@nineinchnick / @hovaesco Can you please double check this? It looks correct to me - not sure of simpler alternatives.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, we could avoid checking if matches.group(2) == '+' and just add +/- and don't have a conditional here. But I don't have a strong opinion about either way.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point. It seems for values like 2022-01-01 01:01:01 05:30 we'd convert this to negative offset which seems wrong.

@mdesmet can you passthrough the sign here? Or explicitly compare to - as well. the default should be positive offset AFAIK.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point about the default. Will make that change now.

Copy link
Member

@hashhar hashhar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM % a few more comments @mdesmet

#160 (comment) and #160 (comment)

Copy link
Member

@hashhar hashhar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Development

Successfully merging this pull request may close these issues.

Decimal data type returned as strings
6 participants