feat(thumbnails): add support for user specific thumbs #22328

villebro · 2022-12-05T11:38:21Z

SUMMARY

This PR adds support for generating user-specific thumbnails. This is a typical requirement in environments where some form of user impersonation is being used, and sharing thumbnails across all users with access to the same dashboards/charts could leak sensitive data.

This PR does the following:

- Renames ReportScheduleExecutor to ExecutorType so that it can be reused for thumbnails. Also moves the utils from the reports package to the tasks package, as they are now shared with thumbnails.
- Adds a new package, thumbnails, to contain the thumbnail-specific types etc. Also moves the thumbnail task module there and deprecates the old one (it will still work, but will now emit a deprecation warning).
- Adds a new executor type called CURRENT_USER, which corresponds to the logged-in user that initiated the request. This user will be undefined for Alerts & Reports (which are initiated by Celery), but for thumbnails, this is the user that requested the thumbnail.
- Adds the following config options:
  - THUMBNAIL_EXECUTE_AS: similar config as for Alerts & Reports
  - THUMBNAIL_DASHBOARD_DIGEST_FUNC: callback for generating custom digests for dashboards. This is handy if a deployment wants to use a different hashing function or more advanced logic for deciding whether a thumbnail can be shared across a larger user pool.
  - THUMBNAIL_CHART_DIGEST_FUNC: callback for generating custom digests for charts

By default the digests will stay unchanged, as the default value of THUMBNAIL_EXECUTE_AS is [ExecutorType.SELENIUM]. However, when it is set to [ExecutorType.CURRENT_USER], the username is added to the unique_string prior to hashing, making the digest unique per user.
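The digest behavior described above can be illustrated with a minimal, self-contained sketch. It is not the PR's implementation: the enum is a stand-in whose member values are assumptions, the function names are hypothetical, and plain MD5 from hashlib stands in for whatever hashing helper Superset uses.

```python
import hashlib
from enum import Enum


class ExecutorType(str, Enum):
    # Stand-in for the PR's enum; the member values are assumptions.
    SELENIUM = "selenium"
    CURRENT_USER = "current_user"


def adjust_string_for_executor(
    unique_string: str, executor_type: ExecutorType, executor: str
) -> str:
    """Append the username only when thumbnails are rendered per user."""
    if executor_type == ExecutorType.CURRENT_USER:
        return f"{unique_string}\n{executor}"
    # For every other executor type the string, and so the digest, is unchanged.
    return unique_string


def digest(unique_string: str, executor_type: ExecutorType, executor: str) -> str:
    """Hash the (possibly user-specific) string into a hex digest."""
    adjusted = adjust_string_for_executor(unique_string, executor_type, executor)
    return hashlib.md5(adjusted.encode("utf-8")).hexdigest()
```

With SELENIUM, every user maps to the same digest and therefore shares one cached thumbnail; with CURRENT_USER, two users viewing the same chart get different digests.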

AFTER

This chart thumbnail was cached from a Trino database connection with user impersonation enabled, using the following virtual dataset:

select concat('Database: ', current_user) as user, 1 as num union all
select 'Jinja: {{ current_username() }}' as user, 2 as num

As can be seen, both current_user (rendered by Trino) and {{ current_username() }} (rendered by Superset) show the user as v_brofeldt, i.e. not a service account.

BEFORE

When I changed the chart to reference a Postgres database with basic auth using the username postgres, with THUMBNAIL_EXECUTE_AS = [ExecutorType.SELENIUM] (the same as the current behavior), the result was as follows:

TESTING INSTRUCTIONS

ADDITIONAL INFORMATION

Has associated issue:
Required feature flags:
Changes UI
Includes DB Migration (follow approval process in SIP-59)
- Migration is atomic, supports rollback & is backwards-compatible
- Confirm DB migration upgrade and downgrade tested
- Runtime estimates and downtime expectations provided
Introduces new feature or API
Removes existing feature or API

codecov · 2022-12-05T12:03:17Z

Codecov Report

Merging #22328 (74981e5) into master (1014a32) will increase coverage by 0.01%.
The diff coverage is 81.36%.

@@            Coverage Diff             @@
##           master   #22328      +/-   ##
==========================================
+ Coverage   66.89%   66.90%   +0.01%     
==========================================
  Files        1847     1850       +3     
  Lines       70611    70677      +66     
  Branches     7749     7749              
==========================================
+ Hits        47233    47285      +52     
- Misses      21362    21376      +14     
  Partials     2016     2016

Flag	Coverage Δ
hive	`52.47% <36.64%> (-0.03%)`	⬇️
mysql	`77.97% <71.42%> (-0.01%)`	⬇️
postgres	`78.03% <71.42%> (-0.01%)`	⬇️
presto	`52.37% <36.64%> (-0.03%)`	⬇️
python	`81.23% <81.36%> (-0.01%)`	⬇️
sqlite	`76.50% <71.42%> (-0.01%)`	⬇️
unit	`50.92% <66.45%> (+0.04%)`	⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files	Coverage Δ
superset/reports/commands/exceptions.py	`98.36% <ø> (-0.03%)`	⬇️
superset/reports/types.py	`100.00% <ø> (ø)`
superset/tasks/thumbnails.py	`39.02% <22.22%> (-6.14%)`	⬇️
superset/charts/api.py	`86.19% <62.50%> (+0.42%)`	⬆️
superset/models/slice.py	`85.85% <62.50%> (-0.29%)`	⬇️
superset/config.py	`91.46% <80.00%> (-0.47%)`	⬇️
superset/models/dashboard.py	`76.61% <80.00%> (+0.11%)`	⬆️
superset/dashboards/api.py	`92.57% <83.33%> (+0.04%)`	⬆️
superset/tasks/utils.py	`91.48% <91.48%> (ø)`
superset/thumbnails/digest.py	`93.54% <93.54%> (ø)`
... and 4 more

villebro · 2022-12-05T14:38:36Z

superset/models/slice.py

+    @classmethod
+    def get(cls, id_: int) -> Slice:
+        session = db.session()
+        qry = session.query(Slice).filter_by(id=id_)
+        return qry.one_or_none()


There is a similar method in Dashboard that bypasses the base filter.

villebro · 2022-12-05T14:42:00Z

superset/tasks/utils.py

+def get_executor(
+    executor_types: List[ExecutorType],
+    model: Union[Dashboard, ReportSchedule, Slice],
+    initiator: Optional[str] = None,
+) -> Tuple[ExecutorType, str]:


This function returns the username as a string rather than a User object because it will be called frequently (e.g. when listing all charts or dashboards). Since only the Selenium username is known from the config, returning a User object would require fetching it from the metastore, causing unnecessary round trips.
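The resolution logic behind that signature can be sketched roughly as follows. This is a simplified illustration, not the code in superset/tasks/utils.py: the model argument is omitted, only the two executor types named in the PR description are handled, and SELENIUM_USERNAME, NoExecutorFoundError and the enum values are hypothetical stand-ins.

```python
from enum import Enum
from typing import List, Optional, Tuple


class ExecutorType(str, Enum):
    # Stand-in for the PR's enum; the member values are assumptions.
    SELENIUM = "selenium"
    CURRENT_USER = "current_user"


class NoExecutorFoundError(Exception):
    """Hypothetical error raised when no executor type can be resolved."""


# Hypothetical config value: the service account used for Selenium rendering.
SELENIUM_USERNAME = "admin"


def get_executor(
    executor_types: List[ExecutorType],
    initiator: Optional[str] = None,
) -> Tuple[ExecutorType, str]:
    """Return the first executor type that resolves to a username."""
    for executor_type in executor_types:
        if executor_type == ExecutorType.SELENIUM:
            # Known from config, so no metastore lookup is needed.
            return executor_type, SELENIUM_USERNAME
        if executor_type == ExecutorType.CURRENT_USER and initiator:
            # Only available when a logged-in user triggered the request.
            return executor_type, initiator
    raise NoExecutorFoundError("no executor could be resolved")
```

Because only strings are returned, a caller that lists many charts or dashboards can compute digests without loading a User row per item.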

villebro · 2022-12-05T14:43:25Z

superset/thumbnails/tasks.py

@@ -0,0 +1,101 @@
+# Licensed to the Apache Software Foundation (ASF) under one


This file is essentially the previous superset/tasks/thumbnails.py, updated with the new logic (fetching the executor, overriding the username, etc.).

villebro · 2022-12-05T14:45:10Z

tests/integration_tests/reports/alert_tests.py

            AlertQueryError(),
        ),
+        (["gamma"], None, [ExecutorType.INITIATOR], AlertQueryError()),


This is really just one added test for the new INITIATOR case; other than that, it just updates the test cases to conform to the new signature of get_executor.

villebro · 2022-12-05T14:46:18Z

tests/unit_tests/tasks/test_utils.py

@@ -0,0 +1,323 @@
+# Licensed to the Apache Software Foundation (ASF) under one


This was mostly moved from the previous tests/unit_tests/reports/test_utils.py file, updated for the new return types, with new relevant test cases added.

kamalkeshavani-aiinside · 2022-12-06T01:24:50Z

@villebro Thank you for implementing this feature. Just sharing our use case.
In our Superset, we keep the dashboard thumbnails cached every day for all users for a better user experience. We do plan to create a dashboard with user-specific content, where we can use this feature so that each user generates their own thumbnails.

But this implementation will force thumbnail generation for all dashboards for each user, which is not ideal for us.
An ideal solution for our use case would be per-dashboard/chart control of the executor, so that the owner of a dashboard/chart can decide whether it needs INITIATOR as the executor for its thumbnail. Whether that is actually feasible, I will let you decide.

villebro · 2022-12-07T06:57:31Z

> Just sharing our use case. In our Superset, we keep the dashboard thumbnails cached every day for all users for a better user experience. We do plan to create a dashboard with user-specific content, where we can use this feature so that each user generates their own thumbnails.
>
> But this implementation will force thumbnail generation for all dashboards for each user, which is not ideal for us. An ideal solution for our use case would be per-dashboard/chart control of the executor, so that the owner of a dashboard/chart can decide whether it needs INITIATOR as the executor for its thumbnail. Whether that is actually feasible, I will let you decide.

@kamalkeshavani-aiinside we can certainly consider adding this in a future PR. All it would really require is adding a thumbnail_executor field in the Dashboard and Slice models and then adding a dropdown in their respective modals (if undefined, it would default to the global config).

Would you be open to working on this feature? I'm happy to provide guidance and review help if needed.

villebro · 2022-12-07T08:46:02Z

superset/tasks/types.py

+from enum import Enum
+
+
+class ExecutorType(str, Enum):


This is the same type as the previous ReportScheduleExecutor, but with the added INITIATOR member.
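As a small illustration of why mixing str into the Enum is convenient, the sketch below shows that members compare equal to plain strings, can be looked up by value, and serialize to JSON without a custom encoder. The member values are assumptions, serialize_executor is a hypothetical helper, and the sketch uses the name CURRENT_USER from the PR description, which appears to be the same member this comment calls INITIATOR.

```python
import json
from enum import Enum


class ExecutorType(str, Enum):
    # Stand-in for the PR's enum; the member values are assumptions.
    SELENIUM = "selenium"
    CURRENT_USER = "current_user"


def serialize_executor(executor_type: ExecutorType, username: str) -> str:
    """Serialize an executor pair to JSON; str-based members need no custom encoder."""
    return json.dumps({"type": executor_type, "username": username})
```

A config value read as a plain string, such as "selenium", can be turned back into a member with ExecutorType("selenium").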

villebro · 2022-12-08T07:43:58Z

tests/integration_tests/thumbnails_tests.py

-        dashboard = db.session.query(Dashboard).all()[0]
        self.login(username="admin")
-        uri = f"api/v1/dashboard/{dashboard.id}/thumbnail/{dashboard.digest}/"
-        rv = self.client.get(uri)
+        _, thumbnail_url = self._get_id_and_thumbnail_url(DASHBOARD_URL)
+        rv = self.client.get(thumbnail_url)


Since the digest may now be affected by who is logged in, all tests are updated to fetch the thumbnail URL via the API after login.

kamalkeshavani-aiinside · 2022-12-09T01:49:02Z

> @kamalkeshavani-aiinside we can certainly consider adding this in a future PR. All it would really require is adding a thumbnail_executor field in the Dashboard and Slice models and then adding a dropdown in their respective modals (if undefined, it would default to the global config).
>
> Would you be open to working on this feature? I'm happy to provide guidance and review help if needed.

@villebro Thank you for the suggestion. Sure, I can try to work on this with your help.

dpgaspar · 2022-12-09T11:23:20Z

superset/thumbnails/digest.py

+    )
+
+    unique_string = _adjust_string_for_executor(unique_string, executor_type, executor)
+    return md5_sha_from_str(unique_string)


Our current dashboard digest is:

    @property
    def digest(self) -> str:
        """
        Returns a MD5 HEX digest that makes this dashboard unique
        """
        unique_string = f"{self.position_json}.{self.css}.{self.json_metadata}"
        return md5_sha_from_str(unique_string)

Adding an executor will invalidate all currently computed dashboard thumbnails.

I would also prefer to bring these changes back to /superset/tasks/thumbnails and avoid introducing the deprecation and underlying breaking change on the task structure.

villebro · 2022-12-09

> Adding an executor will invalidate all currently computed dashboard thumbnails.

Oof, that was an unintended mistake; the executor was only supposed to be added in the case of CURRENT_USER.

> I would also prefer to bring these changes back to /superset/tasks/thumbnails and avoid introducing the deprecation and underlying breaking change on the task structure.

Makes sense - I'll revert the move.

michael-s-molina

Code LGTM.

@villebro If you don't mind, it would be a good idea to also get @dpgaspar's approval before merging it. Thank you for the improvement!

villebro · 2022-12-09T21:03:24Z

> Code LGTM.
>
> @villebro If you don't mind, it would be a good idea to also get @dpgaspar's approval before merging it. Thank you for the improvement!

Absolutely @michael-s-molina 👍 Since I also found a bug in the PR today, I'm going to let it simmer over the weekend, as I feel there's room for improvement in the tests.

dpgaspar

Looks really good. Since this change will invalidate the thumbnail cache, it would be nice to add a note to UPDATING.md.

dpgaspar · 2022-12-14T09:16:08Z

superset/thumbnails/digest.py

+        return func(dashboard, executor_type, executor)
+
+    unique_string = (
+        f"{dashboard.id}\n{dashboard.charts}\n{dashboard.position_json}\n"


Will dashboard.charts generate an N+1 issue?

villebro · 2022-12-14

This PR shouldn't change current performance, as this property is already present on the current request payload.
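The diff excerpt in this thread shows a custom digest callback being invoked as func(dashboard, executor_type, executor). A minimal sketch of how such a callback could let a deployment share thumbnails across a pool of users follows. Everything except that call shape is an assumption: the Dashboard dataclass is a stand-in for the real model, TEAM_BY_USER and team_digest are hypothetical, the default unique string is simplified, and plain MD5 stands in for Superset's hashing helper.

```python
import hashlib
from dataclasses import dataclass
from enum import Enum
from typing import Callable, Optional


class ExecutorType(str, Enum):
    # Stand-in for the PR's enum; the member values are assumptions.
    SELENIUM = "selenium"
    CURRENT_USER = "current_user"


@dataclass
class Dashboard:
    # Stand-in for the real model, reduced to two fields.
    id: int
    position_json: str


DigestFunc = Callable[[Dashboard, ExecutorType, str], str]


def md5_hex(value: str) -> str:
    return hashlib.md5(value.encode("utf-8")).hexdigest()


def get_dashboard_digest(
    dashboard: Dashboard,
    executor_type: ExecutorType,
    executor: str,
    custom_func: Optional[DigestFunc] = None,
) -> str:
    """Use the custom callback when configured, else a default digest."""
    if custom_func is not None:
        return custom_func(dashboard, executor_type, executor)
    unique_string = f"{dashboard.id}\n{dashboard.position_json}"
    if executor_type == ExecutorType.CURRENT_USER:
        # Per-user thumbnails: make the digest depend on the username.
        unique_string = f"{unique_string}\n{executor}"
    return md5_hex(unique_string)


# Hypothetical mapping: users on the same team see identical data.
TEAM_BY_USER = {"alice": "finance", "bob": "finance", "carol": "sales"}


def team_digest(
    dashboard: Dashboard, executor_type: ExecutorType, executor: str
) -> str:
    """Custom callback: one thumbnail per team instead of one per user."""
    team = TEAM_BY_USER.get(executor, executor)
    return md5_hex(f"{dashboard.id}\n{dashboard.position_json}\n{team}")
```

Here alice and bob would share one cached thumbnail while carol gets her own, which is the kind of "larger user pool" sharing the PR description mentions for THUMBNAIL_DASHBOARD_DIGEST_FUNC.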

pull-request-size bot added the size/XXL label Dec 5, 2022

villebro changed the title ~~Villebro/thumb selenium~~ feat(thumbnails): add support for user specific thumbs Dec 5, 2022

villebro force-pushed the villebro/thumb-selenium branch 2 times, most recently from 8636996 to 48e90c3 Compare December 5, 2022 14:36

villebro commented Dec 5, 2022

villebro force-pushed the villebro/thumb-selenium branch 3 times, most recently from f419302 to 8a67d59 Compare December 5, 2022 17:07

villebro force-pushed the villebro/thumb-selenium branch 2 times, most recently from 47abf41 to 14de647 Compare December 6, 2022 09:10

rusackas requested review from dpgaspar and michael-s-molina December 6, 2022 16:41

villebro commented Dec 7, 2022

villebro force-pushed the villebro/thumb-selenium branch from c7aa129 to 10a7ce0 Compare December 7, 2022 09:01

dpgaspar reviewed Dec 7, 2022

villebro commented Dec 7, 2022

villebro force-pushed the villebro/thumb-selenium branch from feb651d to 298705b Compare December 7, 2022 11:44

villebro commented Dec 8, 2022

villebro requested a review from dpgaspar December 8, 2022 08:52

michael-s-molina reviewed Dec 8, 2022

michael-s-molina requested a review from eschutho December 8, 2022 17:33

villebro added 2 commits December 8, 2022 20:53

feat(thumbnails): add support for user specific thumbs

176c1df

fixes

a9f37e6

villebro added 5 commits December 8, 2022 20:53

fix tests and add docs

9fcd416

lint

dc6246b

address review comments

28cc56e

streamline tests

94c5e3f

fix config descriptions

5d2ce4e

villebro force-pushed the villebro/thumb-selenium branch from f6bdf66 to 5d2ce4e Compare December 8, 2022 19:00

dpgaspar reviewed Dec 9, 2022

move tasks back and fix digest func

fdc2698

villebro force-pushed the villebro/thumb-selenium branch from b9ecbb7 to 67759ed Compare December 9, 2022 16:22

fix alert executor

77d32e5

villebro force-pushed the villebro/thumb-selenium branch from 67759ed to 77d32e5 Compare December 9, 2022 16:31

AAfghahi approved these changes Dec 9, 2022

michael-s-molina approved these changes Dec 9, 2022

fix digest func and update tests

f0d7099

dpgaspar approved these changes Dec 14, 2022

villebro added 2 commits December 14, 2022 14:04

Merge branch 'master' into villebro/thumb-selenium

4c04d61

reorder test and add UPDATING comment

74981e5

villebro merged commit aa0cae9 into apache:master Dec 14, 2022

villebro deleted the villebro/thumb-selenium branch December 14, 2022 13:02

This was referenced Jan 3, 2023

chore: Migrate /superset/search_queries to API v1 #22579

Merged

chore: Migrate /superset/queries/<last_updated_ms> to API v1 #22611

Merged

villebro mentioned this pull request Jan 20, 2023

chore(thumbnails): change default executor to logged in user #22801

Merged

mrmooon mentioned this pull request Mar 9, 2023

docs(alerts and reports): Update ExecutorType class #23323

Merged

mistercrunch added the 🚢 2.1.3 label Feb 18, 2024

mistercrunch added the 🏷️ bot and 🚢 2.1.0 labels and removed the 🚢 2.1.3 label Mar 13, 2024
