Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(db_engine_specs): add a few missing time grains to Postgres spec #30325

Merged
merged 2 commits into from
Sep 20, 2024

Conversation

sfirke
Copy link
Member

@sfirke sfirke commented Sep 18, 2024

SUMMARY

I added to the Postgres engine spec definitions for some of the time grains that are defined in the base spec but had not been added to Postgres yet.

BEFORE/AFTER SCREENSHOTS OR ANIMATED GIF

image

TESTING INSTRUCTIONS

In an ephemeral environment, confirm that the grains are there and that the generated queries are correct.

@dosubot dosubot bot added the data:connect:postgres Related to Postgres label Sep 18, 2024
Copy link

codecov bot commented Sep 18, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 83.89%. Comparing base (76d897e) to head (0bcedac).
Report is 760 commits behind head on master.

Additional details and impacted files
@@             Coverage Diff             @@
##           master   #30325       +/-   ##
===========================================
+ Coverage   60.48%   83.89%   +23.40%     
===========================================
  Files        1931      533     -1398     
  Lines       76236    38489    -37747     
  Branches     8568        0     -8568     
===========================================
- Hits        46114    32289    -13825     
+ Misses      28017     6200    -21817     
+ Partials     2105        0     -2105     
Flag Coverage Δ
hive 49.00% <ø> (-0.16%) ⬇️
javascript ?
mysql 76.83% <ø> (?)
postgres 76.93% <ø> (?)
presto 53.51% <ø> (-0.30%) ⬇️
python 83.89% <ø> (+20.40%) ⬆️
sqlite 76.38% <ø> (?)
unit 60.61% <ø> (+2.98%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

Copy link
Member

@villebro villebro left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

NIce! Would you be able to add some tests for the Postgres time grains to tests/unit_tests/db_engine_specs/test_postgres.py? You can pretty much copy this test from

@pytest.mark.parametrize(
"time_grain,expected_result",
[
("PT1S", "CAST(DATE_TRUNC('second', CAST(col AS TIMESTAMP)) AS TIMESTAMP)"),
(
"PT5M",
"CAST(ROUND(DATE_TRUNC('minute', CAST(col AS TIMESTAMP)), 300000) AS TIMESTAMP)",
),
("P1W", "CAST(DATE_TRUNC('week', CAST(col AS TIMESTAMP)) AS TIMESTAMP)"),
("P1M", "CAST(DATE_TRUNC('month', CAST(col AS TIMESTAMP)) AS TIMESTAMP)"),
("P3M", "CAST(DATE_TRUNC('quarter', CAST(col AS TIMESTAMP)) AS TIMESTAMP)"),
("P1Y", "CAST(DATE_TRUNC('year', CAST(col AS TIMESTAMP)) AS TIMESTAMP)"),
],
)
def test_timegrain_expressions(time_grain: str, expected_result: str) -> None:
"""
DB Eng Specs (pinot): Test time grain expressions
"""
from superset.db_engine_specs.pinot import PinotEngineSpec as spec
actual = str(
spec.get_timestamp_expr(col=column("col"), pdf=None, time_grain=time_grain)
)
assert actual == expected_result

@sfirke
Copy link
Member Author

sfirke commented Sep 19, 2024

@villebro I can mimic those tests for Postgres. Though I wonder about the rigor of comparing the code in my spec to pasting the same code in a test. What if instead/also I had it test the transformation of a specific timestamp? Like that rounding ... 00:15:01 to 1 minute intervals gives ... 00:15:00 etc. Unless that's already happening here and I'm missing it.

@villebro
Copy link
Member

villebro commented Sep 19, 2024

@villebro I can mimic those tests for Postgres. Though I wonder about the rigor of comparing the code in my spec to pasting the same code in a test. What if instead/also I had it test the transformation of a specific timestamp? Like that rounding ... 00:15:01 to 1 minute intervals gives ... 00:15:00 etc. Unless that's already happening here and I'm missing it.

So the reason why these tests are valuable, despite looking like pure duplication, is because a change might happen that causes these time grain definitions to no longer be used. For example, I once did a big refactor of the db engine specs (#7676), and that refactor caused the time grains to be removed by mistake from the Hive engine spec (this was later fixed in #10084). If we would have had tests like this in place, we would have caught that error during the refactor. I know it may look funny to duplicate that functional logic to the unit tests, but having them there makes it less likely to introduce regressions, and it also makes it safer to refactor the implementation details.

Also one thing to note here, is that the unit test is in fact testing the function BaseEngineSpec.get_timestamp_expr, which is different from the map _time_grain_expressions that currently contains the expression definitions. It is highly likely that we will, at some point, change how time grains are defined, without changing the method that's used to retrieve the actual expression. So while they may look highly duplicated, they're actually slightly different.

Regarding actual functional verification that the expression actually rounds down the timestamp based on the time grain, that is something that we can't unfortunately test here, as that logic resides in Postgres. But correct me if I'm misunderstanding what you're saying.

@sfirke
Copy link
Member Author

sfirke commented Sep 19, 2024

Thank you for that thorough explanation! I now see the point of these tests. And whoops I missed that we can't actually test the output of the time grain because that happens in Postgres, good call. Okay, I will add tests like these.

Copy link
Member

@villebro villebro left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@sfirke sfirke merged commit ac66ae8 into apache:master Sep 20, 2024
38 of 39 checks passed
@sfirke sfirke deleted the postgres-time-grains branch September 20, 2024 17:34
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants