fix: Retroactively add granularity param to charts #12960
Conversation
- Find all charts without a granularity or granularity_sqla param.
- Get the dataset that backs the chart.
- If the dataset has the main dttm column set, use it.
- Otherwise, find all the dttm columns in the dataset and use the first one.
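A minimal sketch of those four steps, assuming Superset's `Slice` and `SqlaTable` models are importable (a real Alembic migration would typically redeclare lightweight copies of these tables); this is illustrative, not the merged migration code:

```python
import json

from superset import db
from superset.connectors.sqla.models import SqlaTable
from superset.models.slice import Slice

session = db.session

for slc in session.query(Slice).filter(Slice.datasource_type == "table"):
    params = json.loads(slc.params or "{}")
    # Step 1: skip charts that already have a time column set.
    if "granularity" in params or "granularity_sqla" in params:
        continue

    # Step 2: look up the dataset backing the chart.
    table = session.query(SqlaTable).get(slc.datasource_id)
    if table is None:
        continue

    # Step 3: prefer the dataset's main datetime column, mimicking
    # what the frontend does when a chart is first created.
    dttm_col = table.main_dttm_col
    if not dttm_col:
        # Step 4: otherwise fall back to the first datetime column.
        dttm_cols = [col.column_name for col in table.columns if col.is_dttm]
        dttm_col = dttm_cols[0] if dttm_cols else None

    if dttm_col:
        params["granularity_sqla"] = dttm_col
        slc.params = json.dumps(params, sort_keys=True)

session.commit()
```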
Should this mention that this mimics the behavior of the frontend?
Good call, done.
if "granularity" in params or "granularity_sqla" in params: | ||
continue | ||
|
||
table = session.query(SqlaTable).get(slc.datasource_id) |
Is this performant? I wonder if the join should be part of the slice query.
This ended up only altering 150 slices in our DB (and only got to this step for about 4k), so I'm not sure performance matters that much. It's a trade-off between doing the join across a much larger number of slices (200k+) vs. waiting until we get to this step.
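For reference, the join the reviewer is alluding to could look something like this (a sketch, not the merged code) — it pairs each slice with its backing table in a single query instead of a per-slice `get()`:

```python
# One query instead of N+1 lookups: join Slice to SqlaTable explicitly,
# since the chart-to-datasource link is polymorphic, not a plain FK.
for slc, table in (
    session.query(Slice, SqlaTable)
    .join(SqlaTable, Slice.datasource_id == SqlaTable.id)
    .filter(Slice.datasource_type == "table")
):
    ...  # same per-slice param logic as in the migration body
```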
Force-pushed from 05b1ea9 to 12210ca
Codecov Report

```diff
@@            Coverage Diff             @@
##           master   #12960      +/-   ##
==========================================
- Coverage   69.14%   66.70%    -2.44%
==========================================
  Files        1025      491      -534
  Lines       48767    28888    -19879
  Branches     5188        0     -5188
==========================================
- Hits        33718    19269    -14449
+ Misses      14915     9619     -5296
+ Partials      134        0      -134
```
@etr2460 I noticed a similar weirdness when I was fixing a regression in the table chart. For some reason the chart looked different in the dashboard compared to the Explore view. When looking at the metadata I noticed that the chart was missing a control value that had been added to the control panel after the chart had been created. Upon closer inspection it turned out Explore merges the chart metadata on top of the default control values, but Dashboard doesn't. I didn't yet have time to look into this more closely, but I believe making sure the metadata flow is the same in Dashboard and Explore view might solve this problem, potentially making the migration unnecessary.
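In dict-merge terms (a Python sketch with hypothetical names; the real code lives in the frontend), the asymmetry being described is:

```python
# Explore: stored chart params are merged on top of control defaults,
# so controls added after the chart was saved still get a default.
explore_form_data = {**control_defaults, **stored_chart_params}

# Dashboard (suspected): stored params are used alone, so newer
# controls are simply missing from the form data.
dashboard_form_data = dict(stored_chart_params)
```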
I think this is a safe migration if the query can be tuned to be more performant. Unifying the form_data and control-defaults merging logic between Dashboard and Explore might take a lot more time than this relatively straightforward migration, so I'll vote +1 on moving this forward.
```python
slices_changed = 0

for slc in session.query(Slice).filter(Slice.datasource_type == "table").all():
```
This query will fetch all chart slices with a SQLA datasource and run the JSON parse in Python. Could we filter out only those that don't have `granularity` or `granularity_sqla` instead?
```diff
-for slc in session.query(Slice).filter(Slice.datasource_type == "table").all():
+# needs: from sqlalchemy import and_
+for slc in (
+    session.query(Slice)
+    .filter(
+        and_(
+            Slice.datasource_type == "table",
+            # use ~ (SQL NOT), not Python's `not`, on column expressions
+            ~Slice.params.like('%"granularity%'),
+        )
+    )
+    .yield_per(500)
+):
```
Not that it matters much in practice, but I'd also try to stream results whenever fetching an unknown number of rows (`.yield_per(500)`).
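For context, `yield_per` asks SQLAlchemy to fetch rows from the cursor in batches rather than materializing the whole result set up front; a standalone sketch (`handle` is a hypothetical per-slice handler):

```python
# Rows arrive in batches of 500 as the loop consumes them, keeping
# memory flat even if the table holds hundreds of thousands of slices.
for slc in session.query(Slice).yield_per(500):
    handle(slc)  # hypothetical per-slice handler
```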
So you're saying do a plain-text filter first to remove most of the slices that aren't eligible, and only run the JSON parse on what remains? It looks janky, but it should work; will update.
I think this makes sense - no point in pulling in slices that aren't applicable
Yeah, we don't need a complex regexp here, so a simple text match should work.
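Putting the thread together, the agreed approach is a coarse SQL `LIKE` prefilter followed by the precise JSON check in Python; a hedged sketch:

```python
import json

candidates = (
    session.query(Slice)
    .filter(
        Slice.datasource_type == "table",
        # Coarse text match: cheaply excludes slices that already
        # contain a granularity key, without parsing JSON in SQL.
        ~Slice.params.like('%"granularity%'),
    )
    .yield_per(500)
)
for slc in candidates:
    params = json.loads(slc.params or "{}")
    # Precise check in Python, in case the text match misses an edge case.
    if "granularity" in params or "granularity_sqla" in params:
        continue
    ...  # proceed with the dttm-column lookup
```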
Force-pushed from 12210ca to 9c6ba17
I looked into this, and it turns out default values are in fact applied to the chart form data on the Dashboard similarly to the Explore view. The control panel on the Table chart was just setting the value of the […] control. Going forward we should potentially make default values "smarter", by making it possible to introduce hooks that return defaults based on other context. In the case of […]
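One way the "smarter defaults" idea could look (entirely hypothetical; `default_granularity` and its signature are made up for illustration):

```python
def default_granularity(dataset):
    # A context-aware default: prefer the dataset's main datetime
    # column, else the first datetime column, else no default.
    if dataset.main_dttm_col:
        return dataset.main_dttm_col
    return next(
        (col.column_name for col in dataset.columns if col.is_dttm),
        None,
    )
```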
LGTM
FYI a migration was merged today, so I believe we need to update the downgrade revision id. #13052
Good call, updating now. And I agree, @villebro, there's some weirdness going on with how parameters are passed between Dashboard and Explore. This might not be a long-term fix, but it addresses the weirdness that's been cropping up recently.
* fix: Retroactively add granularity param to charts
* Update down revision
* master: (30 commits)
  refactor(native-filters): decouple params from filter config modal (first phase) (apache#13021)
  fix(native-filters): set currentValue null when empty (apache#13000)
  Custom superset_config.py + secret envs (apache#13096)
  Update http error code from 400 to 403 (apache#13061)
  feat(native-filters): add storybook entry for select filter (apache#13005)
  feat(native-filters): Time native filter (apache#12992)
  Force pod restart on config changes (apache#13056)
  feat(cross-filters): add cross filters (apache#12662)
  fix(explore): Enable selecting an option not included in suggestions (apache#13029)
  Improves RTL configuration (apache#13079)
  Added a note about the ! prefix for breaking changes to CONTRIBUTING.md (apache#13083)
  chore: lock down npm to v6 (apache#13069)
  fix: API tests, make them possible to run independently again (apache#13076)
  fix: add config to disable dataset ownership on the old api (apache#13051)
  add required * indicator to message content/notif method (apache#12931)
  fix: Retroactively add granularity param to charts (apache#12960)
  fix(ci): multiline regex in change detection (apache#13075)
  feat(style): hide dashboard header by url parameter (apache#12918)
  fix(explore): pie chart label bugs (apache#13052)
  fix: Disabled state button transition time (apache#13008)
  ...
SUMMARY
We ran into some data accuracy issues in our environment where time-range filters wouldn't apply to certain charts (like Big Number) when set on a dashboard. We determined that this was because these charts somehow didn't have a granularity param set on them, so the time range wasn't applied to the chart query. This was especially dangerous because the chart appeared to apply the filter even though it wasn't actually applied to the query.
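To illustrate the failure mode with hypothetical form_data (`created_on` is a made-up column name): without a granularity param, the dashboard's time range has no datetime column to bind to, so the filter silently drops out of the query.

```python
# Affected chart: time_range is present but cannot be applied.
broken = {"viz_type": "big_number", "time_range": "Last week"}

# After the migration: the time range now binds to a datetime column.
fixed = {
    "viz_type": "big_number",
    "time_range": "Last week",
    "granularity_sqla": "created_on",  # hypothetical dttm column
}
```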
The fix is this db migration that adds the granularity param onto charts that should have one but don't.
TEST PLAN
Before the migration, observe that Big Number charts without the granularity or granularity_sqla param don't filter by time range correctly.
Run the migration, and see the dashboard apply the filter appropriately.
ADDITIONAL INFORMATION
to: @john-bodley @ktmud @graceguo-supercat @villebro
cc: @junlincc