Fix caching related issues #4316

mistercrunch · 2018-01-31T06:33:21Z

This moves caching to the dataframe rather than the output of get_data. Caching the output of get_data doesn't work when the cache key is derived from the "query object". Changing a control such as the moving average configuration doesn't affect the query object (same cache key), but it does affect the post-processing that happens in get_data. If we use the query object as the basis for the cache key, we need to cache the dataframe itself and re-execute get_data against the cached dataframe.

This may not be 100% ready for merging as:

- I'm unclear on whether the serialization of the dataframe (using df.to_dict) is 100% clean; there could be bugs where some things get lost through the serde process.
- FilterBox doesn't get cached anymore, as it doesn't follow the common pattern of using `get_df`.

@john-bodley

Also, the introduction of a rendered chartState seems to have broken the cache tag and the row label in the UI.

mistercrunch · 2018-01-31T06:35:57Z

It seems like we may want to re-think/cleanup the caching logic a bit.

mistercrunch · 2018-02-01T19:18:24Z

For the record, I'm doing more work on this. I'm moving to pickling dataframes, as other serialization methods seem incomplete/risky.
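As a rough illustration of why a `to_dict` round trip can be lossy while pickling is not, here is a small sketch using pandas and the standard `pickle` module. The example frame and its column names are made up for the demonstration; it does not reproduce the serialization code in this PR.

```python
import pickle

import pandas as pd

# A made-up frame with dtypes and index metadata that plain dicts cannot carry.
df = pd.DataFrame({
    "region": pd.Categorical(["east", "west", "east"]),
    "count": pd.Series([1, 2, 3], dtype="int8"),
})
df.index.name = "row_id"

# Round trip 1: through plain Python dicts.
via_dict = pd.DataFrame(df.to_dict())

# Round trip 2: through pickle bytes.
via_pickle = pickle.loads(pickle.dumps(df))

print("original:", [str(t) for t in df.dtypes], df.index.name)
print("to_dict :", [str(t) for t in via_dict.dtypes], via_dict.index.name)
print("pickle  :", [str(t) for t in via_pickle.dtypes], via_pickle.index.name)
```

The dict round trip keeps the values but drops the categorical dtype, the narrow integer width, and the index name, whereas the pickled copy compares equal to the original. One caveat that applies to any pickle-based cache: unpickling executes arbitrary code, so the cache backend has to be trusted.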

graceguo-supercat · 2018-02-01T21:18:17Z

@mistercrunch Hi Max, do you mind if I create another PR just to fix the JS part, so this PR can focus on the cache issue?

mistercrunch · 2018-02-02T22:23:47Z

@john-bodley I think I've got it mostly nailed down at this point. This turned out to be a much more complex headache than I originally thought, especially in the area of properly caching multi-query visualizations.

mistercrunch · 2018-02-02T22:24:07Z

superset/assets/package.json

@@ -93,8 +93,8 @@
    "react-sortable-hoc": "^0.6.7",
    "react-split-pane": "^0.1.66",
    "react-syntax-highlighter": "^5.7.0",
-    "react-virtualized": "^9.3.0",
-    "react-virtualized-select": "^2.4.0",
+    "react-virtualized": "9.3.0",


Unrelated but that fixed the build

john-bodley · 2018-02-05T19:49:23Z

@mistercrunch overall this LGTM. We'll probably have to test it extensively in our staging environment. Note there are a few flake8 and pylint issues.

michellethomas · 2018-03-16T06:23:16Z

superset/viz.py

+        self._any_cache_key = None
+        self._any_cached_dttm = None
+
+        self.run_extra_queries()


@mistercrunch @john-bodley I'm finding that when a dashboard gets loaded with a filter box viz or a line chart with a time comparison, queries get executed before the dashboard template even gets rendered (resulting in a blank page that is slow to load because it's executing queries). I think this change is causing the issue.

When getting the dashboard data, we get the data for each slice, which gets the viz.data; that calls run_extra_queries on init, which executes the queries. So the dashboard template doesn't get rendered until all of the queries have run.
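The behavior described here can be sketched in a few lines. This is a hypothetical illustration, not the actual `superset/viz.py` classes and not necessarily how the issue was eventually fixed: `EagerViz`, `LazyViz`, and `get_payload` are names chosen for the example. It contrasts running extra queries in the constructor with deferring them until a payload is requested.

```python
class EagerViz:
    """Runs its extra query as a side effect of construction."""

    def __init__(self, run_query):
        self.run_query = run_query
        # Anything that merely instantiates the viz pays for this query.
        self.extra = self.run_query("extra")

    def get_payload(self):
        return {"extra": self.extra}


class LazyViz:
    """Defers the extra query until a payload is actually requested."""

    def __init__(self, run_query):
        self.run_query = run_query
        self._extra = None

    def get_payload(self):
        if self._extra is None:
            self._extra = self.run_query("extra")
        return {"extra": self._extra}


calls = []


def fake_query(name):
    """Stand-in for a database query; records that it was executed."""
    calls.append(name)
    return [name]


eager = EagerViz(fake_query)
calls_after_eager_init = len(calls)

lazy = LazyViz(fake_query)
calls_after_lazy_init = len(calls)

lazy_payload = lazy.get_payload()
calls_after_lazy_payload = len(calls)
```

In the eager variant, simply building the object to read its metadata triggers a query, which matches the symptom of a dashboard page that stays blank until every query finishes. In the lazy variant, construction is free and the query runs only when the payload is requested.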

https://github.com/apache/incubator-superset/blob/master/superset/models/core.py#L173

I'm not quite sure how to resolve this because it sounds like this was specifically added here because it needs to run in the init. Any suggestions?

Oh wow right. This should not be here. Let me take a shot at it.

This should do it: #4627. Mind trying it on your side?

Thanks for the quick fix! I'll test it.

This was referenced Jan 31, 2018

Add frontend logging functions and log loading time for Dashboard and explore view #4226

Merged

Add caching at df level #4280

Closed

mistercrunch changed the title ~~Fix caching issues~~ [WiP] fix caching issues Feb 2, 2018

mistercrunch force-pushed the cache_df branch 2 times, most recently from 78815f9 to 8b5757c Compare February 2, 2018 22:21

mistercrunch commented Feb 2, 2018


mistercrunch force-pushed the cache_df branch 3 times, most recently from 75dddcd to af11c76 Compare February 6, 2018 18:05

mistercrunch changed the title ~~[WiP] fix caching issues~~ Fix caching related issues Feb 6, 2018

mistercrunch force-pushed the cache_df branch from af11c76 to 4ae883d Compare February 6, 2018 18:22

Fix caching issues

a5ba965

mistercrunch force-pushed the cache_df branch from 444e441 to a5ba965 Compare February 7, 2018 22:14

mistercrunch merged commit 2e172d7 into apache:master Feb 7, 2018

mistercrunch deleted the cache_df branch February 7, 2018 22:49

raffas mentioned this pull request Feb 9, 2018

[BugFix] #4391 FilterBox: An error occurred while rendering the visualization: TypeError: Cannot read property 'xxx' of undefined #4395

Closed

michellethomas reviewed Mar 16, 2018


michellethomas pushed a commit to michellethomas/panoramix that referenced this pull request May 24, 2018

Fix caching issues (apache#4316)

e6a888a

wenchma pushed a commit to wenchma/incubator-superset that referenced this pull request Nov 16, 2018

Fix caching issues (apache#4316)

fa97204

mistercrunch added the 🏷️ bot and 🚢 0.23.0 labels Feb 27, 2024
