fix: Retroactively add granularity param to charts #12960
Conversation
- Find all charts without a granularity or granularity_sqla param.
- Get the dataset that backs the chart.
- If the dataset has the main dttm column set, use it.
- Otherwise, find all the dttm columns in the dataset and use the first one.
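A minimal sketch of those four steps, assuming Superset's `Slice` and `SqlaTable` models are importable (a real Alembic migration would typically redeclare lightweight copies of these tables); this is illustrative, not the merged migration code:

```python
import json

from superset import db
from superset.connectors.sqla.models import SqlaTable
from superset.models.slice import Slice

session = db.session

for slc in session.query(Slice).filter(Slice.datasource_type == "table"):
    params = json.loads(slc.params or "{}")
    # Step 1: skip charts that already have a time column set.
    if "granularity" in params or "granularity_sqla" in params:
        continue

    # Step 2: look up the dataset backing the chart.
    table = session.query(SqlaTable).get(slc.datasource_id)
    if table is None:
        continue

    # Step 3: prefer the dataset's main datetime column, mimicking
    # what the frontend does when a chart is first created.
    dttm_col = table.main_dttm_col
    if not dttm_col:
        # Step 4: otherwise fall back to the first datetime column.
        dttm_cols = [col.column_name for col in table.columns if col.is_dttm]
        dttm_col = dttm_cols[0] if dttm_cols else None

    if dttm_col:
        params["granularity_sqla"] = dttm_col
        slc.params = json.dumps(params, sort_keys=True)

session.commit()
```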
Should this mention that this mimics the behavior of the frontend?
Good call, done.
if "granularity" in params or "granularity_sqla" in params: | ||
continue | ||
|
||
table = session.query(SqlaTable).get(slc.datasource_id) |
Is this performant? I wonder if the join should be part of the slice query.
This ended up only altering 150 slices in our DB (and only got to this step for about 4k), so I'm not sure performance matters that much. It's a trade-off between doing the join across a much larger number of slices (200k+) vs. waiting until we get to this step.
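For reference, the join the reviewer is alluding to could look something like this (a sketch, not the merged code) — it pairs each slice with its backing table in a single query instead of a per-slice `get()`:

```python
# One query instead of N+1 lookups: join Slice to SqlaTable explicitly,
# since the chart-to-datasource link is polymorphic, not a plain FK.
for slc, table in (
    session.query(Slice, SqlaTable)
    .join(SqlaTable, Slice.datasource_id == SqlaTable.id)
    .filter(Slice.datasource_type == "table")
):
    ...  # same per-slice param logic as in the migration body
```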
Force-pushed from 05b1ea9 to 12210ca
Codecov Report

```diff
@@            Coverage Diff             @@
##           master   #12960      +/-   ##
==========================================
- Coverage   69.14%   66.70%    -2.44%
==========================================
  Files        1025      491      -534
  Lines       48767    28888    -19879
  Branches     5188        0     -5188
==========================================
- Hits        33718    19269    -14449
+ Misses      14915     9619     -5296
+ Partials      134        0      -134
```
@etr2460 I noticed a similar weirdness when I was fixing a regression in the table chart. For some reason the chart looked different in the dashboard compared to the Explore view. When looking at the metadata I noticed that the chart was missing a control value that had been added to the control panel after the chart had been created. Upon closer inspection it turned out Explore merges the chart metadata on top of the default control values, but Dashboard doesn't. I didn't yet have time to look into this more closely, but I believe making sure the metadata flow is the same in Dashboard and Explore view might solve this problem, potentially making the migration unnecessary.
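In dict-merge terms (a Python sketch with hypothetical names; the real code lives in the frontend), the asymmetry being described is:

```python
# Explore: stored chart params are merged on top of control defaults,
# so controls added after the chart was saved still get a default.
explore_form_data = {**control_defaults, **stored_chart_params}

# Dashboard (suspected): stored params are used alone, so newer
# controls are simply missing from the form data.
dashboard_form_data = dict(stored_chart_params)
```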
I think this is a safe migration if the query can be tuned to be more performant. Unifying the form_data and control-defaults merging logic between Dashboard and Explore might take a lot more time than this relatively straightforward migration, so I'll vote +1 on moving this forward.
```python
slices_changed = 0

for slc in session.query(Slice).filter(Slice.datasource_type == "table").all():
```
This query will fetch all chart slices with a SQLA datasource and run the JSON parse in Python. Could we filter out only those that don't have `granularity` or `granularity_sqla` instead?
```diff
-for slc in session.query(Slice).filter(Slice.datasource_type == "table").all():
+# needs: from sqlalchemy import and_
+for slc in (
+    session.query(Slice)
+    .filter(
+        and_(
+            Slice.datasource_type == "table",
+            # use ~ (SQL NOT), not Python's `not`, on column expressions
+            ~Slice.params.like('%"granularity%'),
+        )
+    )
+    .yield_per(500)
+):
```
Not that it matters much in practice, but I'd also try to stream results whenever fetching an unknown number of rows (`.yield_per(500)`).
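For context, `yield_per` asks SQLAlchemy to fetch rows from the cursor in batches rather than materializing the whole result set up front; a standalone sketch (`handle` is a hypothetical per-slice handler):

```python
# Rows arrive in batches of 500 as the loop consumes them, keeping
# memory flat even if the table holds hundreds of thousands of slices.
for slc in session.query(Slice).yield_per(500):
    handle(slc)  # hypothetical per-slice handler
```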
So you're saying do a plain-text filter first to remove most of the slices that aren't eligible, and only run the JSON parse on what remains? It looks janky, but it should work; will update.
I think this makes sense - no point in pulling in slices that aren't applicable
Yeah, we don't need a complex regexp here, so a simple text match should work.
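Putting the thread together, the agreed approach is a coarse SQL `LIKE` prefilter followed by the precise JSON check in Python; a hedged sketch:

```python
import json

candidates = (
    session.query(Slice)
    .filter(
        Slice.datasource_type == "table",
        # Coarse text match: cheaply excludes slices that already
        # contain a granularity key, without parsing JSON in SQL.
        ~Slice.params.like('%"granularity%'),
    )
    .yield_per(500)
)
for slc in candidates:
    params = json.loads(slc.params or "{}")
    # Precise check in Python, in case the text match misses an edge case.
    if "granularity" in params or "granularity_sqla" in params:
        continue
    ...  # proceed with the dttm-column lookup
```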
Force-pushed from 12210ca to 9c6ba17
I looked into this, and it turns out default values are in fact applied to the chart form data on the Dashboard similarly to the Explore view. The control panel on the Table chart was just setting the value of the […] control. Going forward we should potentially make default values "smarter", by making it possible to introduce hooks that return defaults based on other context. In the case of […]
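One way the "smarter defaults" idea could look (entirely hypothetical; `default_granularity` and its signature are made up for illustration):

```python
def default_granularity(dataset):
    # A context-aware default: prefer the dataset's main datetime
    # column, else the first datetime column, else no default.
    if dataset.main_dttm_col:
        return dataset.main_dttm_col
    return next(
        (col.column_name for col in dataset.columns if col.is_dttm),
        None,
    )
```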
LGTM
FYI a migration was merged today, so I believe we need to update the downgrade revision id. #13052
Good call, updating now. And I agree, @villebro, there's some weirdness going on with how parameters are passed between Dashboard and Explore. This might not be a long-term fix, but it addresses the weirdness that's been cropping up recently.
* fix: Retroactively add granularity param to charts
* Update down revision
* master: (30 commits)
  refactor(native-filters): decouple params from filter config modal (first phase) (apache#13021)
  fix(native-filters): set currentValue null when empty (apache#13000)
  Custom superset_config.py + secret envs (apache#13096)
  Update http error code from 400 to 403 (apache#13061)
  feat(native-filters): add storybook entry for select filter (apache#13005)
  feat(native-filters): Time native filter (apache#12992)
  Force pod restart on config changes (apache#13056)
  feat(cross-filters): add cross filters (apache#12662)
  fix(explore): Enable selecting an option not included in suggestions (apache#13029)
  Improves RTL configuration (apache#13079)
  Added a note about the ! prefix for breaking changes to CONTRIBUTING.md (apache#13083)
  chore: lock down npm to v6 (apache#13069)
  fix: API tests, make them possible to run independently again (apache#13076)
  fix: add config to disable dataset ownership on the old api (apache#13051)
  add required * indicator to message content/notif method (apache#12931)
  fix: Retroactively add granularity param to charts (apache#12960)
  fix(ci): multiline regex in change detection (apache#13075)
  feat(style): hide dashboard header by url parameter (apache#12918)
  fix(explore): pie chart label bugs (apache#13052)
  fix: Disabled state button transition time (apache#13008)
  ...
SUMMARY
We ran into some data accuracy issues in our environment where time-range filters wouldn't apply to certain charts (like Big Number) when set on a dashboard. We determined that this was because these charts somehow didn't have a granularity param set on them, so the time range wasn't applied to the chart query. This was especially dangerous because the chart appeared to apply the filter even though it wasn't actually applied to the query.
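To illustrate the failure mode with hypothetical form_data (`created_on` is a made-up column name): without a granularity param, the dashboard's time range has no datetime column to bind to, so the filter silently drops out of the query.

```python
# Affected chart: time_range is present but cannot be applied.
broken = {"viz_type": "big_number", "time_range": "Last week"}

# After the migration: the time range now binds to a datetime column.
fixed = {
    "viz_type": "big_number",
    "time_range": "Last week",
    "granularity_sqla": "created_on",  # hypothetical dttm column
}
```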
The fix is this db migration that adds the granularity param onto charts that should have one but don't.
TEST PLAN
Before the migration, observe that Big Number charts without the granularity or granularity_sqla param don't filter by time range correctly.
Run the migration, and see the dashboard apply the filter appropriately.
ADDITIONAL INFORMATION
to: @john-bodley @ktmud @graceguo-supercat @villebro
cc: @junlincc