fix: Use RLS clause instead of ID for cache key #25229
superset/security/manager.py
Outdated
@@ -2083,28 +2083,27 @@ def get_rls_filters(self, table: "BaseDatasource") -> list[SqlaQuery]:
         )
         return query.all()

-    def get_rls_ids(self, table: "BaseDatasource") -> list[int]:
+    def get_rls_filter_clauses(self, table: "BaseDatasource") -> list[str]:
Could we write a small unit test for this?
Added.
superset/security/manager.py
Outdated
-        return ids
+        clauses = [f.clause for f in self.get_rls_filters(table)]
+        clauses.sort()  # Combinations rather than permutations
+        return clauses
I think you can still sort by ID but return a list of clauses. Is there any case where the clause can be None?
The clause is not nullable at the DB level, but I think sorting by ID is still a good idea, since there could in theory be an edge case where different RLS filters have the same clause but different IDs/types (base vs. regular).
Updated to sort by ID and concatenate the ID with the clause.
LGTM, thanks for fixing
superset/security/manager.py
Outdated
-        return ids
+        filters = self.get_rls_filters(table)
+        filters.sort(key=lambda f: f.id)  # Combinations rather than permutations
+        str_reps = [f"{f.id}-{f.clause}" for f in filters]
Probably a totally unnecessary perf optimization, but do we need the IDs here? I think just using the clauses and sorting them should be sufficient, and would avoid a cache miss if two filters are swapped (i.e. the IDs stay the same, but the clauses are interchanged).
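To make the swapped-filters case concrete, here is a minimal sketch comparing the two key strategies discussed in this thread. `RLSFilter`, `key_with_ids`, and `key_clauses_only` are hypothetical names invented for illustration; they are not Superset's model or API, and the clauses are made-up examples.

```python
from dataclasses import dataclass


@dataclass
class RLSFilter:
    """Hypothetical stand-in for an RLS filter row (not Superset's model)."""

    id: int
    clause: str


def key_with_ids(filters):
    # Variant from the diff above: sort by id, then join id and clause.
    ordered = sorted(filters, key=lambda f: f.id)
    return [f"{f.id}-{f.clause}" for f in ordered]


def key_clauses_only(filters):
    # Suggested variant: ignore the ids and sort the clauses themselves.
    return sorted(f.clause for f in filters)


before = [RLSFilter(1, "region = 'EU'"), RLSFilter(2, "active = 1")]
# Same two clauses, but interchanged between the two filter ids.
after = [RLSFilter(1, "active = 1"), RLSFilter(2, "region = 'EU'")]

# The id-based key changes even though the effective filtering is the same.
print(key_with_ids(before) == key_with_ids(after))
# The clause-only key is unchanged, so the cached result can be reused.
print(key_clauses_only(before) == key_clauses_only(after))
```

Under these assumptions the first comparison is `False` (a cache miss) and the second is `True` (a cache hit).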
I included the ID here because I was thinking about an edge case where two different RLS filters have the same clause but a different type (regular/base), and we'd want to avoid a false-positive cache hit.
However, I just realized I was thinking about this wrong: the filter type only affects who the filter applies to, not how the clause is applied. I'll remove the ID and just sort by clause.
Ah, but now I've realized that the group key actually does affect how the clause is applied, so I think it does need to be considered for the cache key.
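One way to fold the group key into the key material is sketched below. This is only an illustration of the idea in this comment, not necessarily the code that was merged; `RLSFilter` and `rls_cache_key_parts` are hypothetical names, and pairing each clause with its group key as a tuple is an assumption made here to keep the two values unambiguous.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RLSFilter:
    """Hypothetical stand-in for an RLS filter row (not Superset's model)."""

    clause: str
    group_key: Optional[str] = None


def rls_cache_key_parts(filters):
    # Pair each clause with its group key (empty string when unset), then
    # sort so the result does not depend on the order the filters arrive in.
    return sorted((f.group_key or "", f.clause) for f in filters)


same_group = [RLSFilter("x = 1", "g1"), RLSFilter("y = 2", "g1")]
split_groups = [RLSFilter("x = 1", "g1"), RLSFilter("y = 2", "g2")]

# Identical clauses but different grouping produce different key material.
print(rls_cache_key_parts(same_group) != rls_cache_key_parts(split_groups))
# Reordering the same filters leaves the key material unchanged.
print(rls_cache_key_parts(same_group) == rls_cache_key_parts(same_group[::-1]))
```

With a clause-only key, `same_group` and `split_groups` would collide even though the group key changes how the clauses are applied.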
LGTM
(cherry picked from commit fba66c6)
* fix: is_select with UNION (apache#25290) (cherry picked from commit bb002d6)
* fix: Add explicit ON DELETE CASCADE for dashboard_roles (apache#25320) (cherry picked from commit d54e827)
* fix(chart): Supporting custom SQL as temporal x-axis column with filter (apache#25126) Co-authored-by: Kamil Gabryjelski <kamil.gabryjelski@gmail.com>
* fix: Use RLS clause instead of ID for cache key (apache#25229) (cherry picked from commit fba66c6)
* fix: Improve the reliability of alerts & reports (apache#25239) (cherry picked from commit f672d5d)
* fix: DashboardRoles cascade operation (apache#25349) (cherry picked from commit a971a28)
* fix: datetime with timezone excel export (apache#25318) Co-authored-by: Michael S. Molina <70410625+michael-s-molina@users.noreply.github.com> (cherry picked from commit 5ebcd2a)
* fix: Workaround for Cypress ECONNRESET error (apache#25399) (cherry picked from commit d76ff39)
* fix(sqllab): invalid persisted tab state (apache#25308) (apache#25398)
* fix: Rename on_delete parameter to ondelete (apache#25424) (cherry picked from commit 893b45f)
* fix: preventing save button from flickering in SQL Lab (apache#25106) (cherry picked from commit 296ff17)
* fix: chart import (apache#25425) (cherry picked from commit a4d8f36)
* fix: swagger UI CSP error (apache#25368) (cherry picked from commit 1716b9f)
* fix: smarter date formatter (apache#25404) (cherry picked from commit f0080f9)
* fix(sqllab): invalid start date (apache#25437)
* fix(nativeFilters): Speed up native filters by removing unnecessary rerenders (apache#25282) Co-authored-by: JUST.in DO IT <justin.park@airbnb.com> (cherry picked from commit a0eeb4d)
* fix(SqlLab): make icon placement even (apache#25372) (cherry picked from commit 11b49a6)
* fix: Duplicate items when pasting into Select (apache#25447) (cherry picked from commit 7cf96cd)
* fix: update the SQLAlchemy model definition at json column for Log table (apache#25445) (cherry picked from commit e83a76a)
* fix(helm chart): set chart appVersion to 3.0.0 (apache#25373)
* fix(mysql): handle string typed decimal results (apache#24241) (cherry picked from commit 7eab59a)
* fix: Styles not loading because of faulty CSP setting (apache#25468) (cherry picked from commit 0cebffd)
* fix(sqllab): error with lazy_gettext for tab titles (apache#25469) (cherry picked from commit ddde178)
* fix: Address Mypy issue which is causing CI to fail (apache#25494) (cherry picked from commit 36ed617)
* chore: Adds 3.0.1 CHANGELOG
* fix: Unable to sync columns when database or dataset name contains `+` (apache#25390) (cherry picked from commit dbe0838)
* fix(sqllab): Broken query containing 'children' (apache#25490) (cherry picked from commit b92957e)
* chore: Expand error detail on screencapture (apache#25519) (cherry picked from commit ba541e8)
* fix: tags permissions error message (apache#25516) (cherry picked from commit 50b0816)
* fix: Apply normalization to all dttm columns (apache#25147) (cherry picked from commit 58fcd29)
* fix: REST API CSRF exempt list (apache#25590) (cherry picked from commit 549abb5)
* fix(RLS): Fix Info Tooltip + Button Alignment on RLS Modal (apache#25400) (cherry picked from commit a6d0e6f)
* fix: thubmnails loading - Talisman default config (apache#25486) (cherry picked from commit 52f631a)
* fix(Presto): catch DatabaseError when testing Presto views (apache#25559) Co-authored-by: Rui Zhao <zhaorui@dropbox.com> (cherry picked from commit be3714e)
* fix(Charts): Set max row limit + removed the option to use an empty row limit value (apache#25579) (cherry picked from commit f556ef5)
* fix(window): unavailable localStorage and sessionStorage (apache#25599)
* fix: finestTemporalGrainFormatter (apache#25618) (cherry picked from commit 62bffaf)
* fix: revert fix(sqllab): Force trino client async execution (apache#24859) (apache#25541) (cherry picked from commit e56e0de)
* chore: Updates 3.0.1 CHANGELOG
* fix(sqllab): Mistitled for new tab after rename (apache#25523) (cherry picked from commit a520124)
* fix(sqllab): template validation error within comments (apache#25626) (cherry picked from commit b370c66)
* fix: avoid 500 errors with SQLLAB_BACKEND_PERSISTENCE (apache#25553) (cherry picked from commit 99f79f5)
* fix(import): Make sure query context is overwritten for overwriting imports (apache#25493) (cherry picked from commit a0a0d80)
* fix: permalink save/overwrites in explore (apache#25112) Co-authored-by: Elizabeth Thompson <eschutho@gmail.com> (cherry picked from commit e58a3ab)
* fix(header navlinks): link navlinks to path prefix (apache#25495) (cherry picked from commit 51c56dd)
* fix: improve upload ZIP file validation (apache#25658)
* fix: warning of nth-child (apache#23638) (cherry picked from commit 16cc089)
* fix(dremio): Fixes issue with Dremio SQL generation for Charts with Series Limit (apache#25657) (cherry picked from commit be82657)
---------
Co-authored-by: Beto Dealmeida <roberto@dealmeida.net>
Co-authored-by: John Bodley <4567245+john-bodley@users.noreply.github.com>
Co-authored-by: Zef Lin <zef@preset.io>
Co-authored-by: Kamil Gabryjelski <kamil.gabryjelski@gmail.com>
Co-authored-by: Jack Fragassi <jfragassi98@gmail.com>
Co-authored-by: Michael S. Molina <70410625+michael-s-molina@users.noreply.github.com>
Co-authored-by: JUST.in DO IT <justin.park@airbnb.com>
Co-authored-by: Jack <41238731+fisjac@users.noreply.github.com>
Co-authored-by: Daniel Vaz Gaspar <danielvazgaspar@gmail.com>
Co-authored-by: Stepan <66589759+Always-prog@users.noreply.github.com>
Co-authored-by: Corbin Bullard <corbindbullard@gmail.com>
Co-authored-by: Gyuil Han <cnabro91@gmail.com>
Co-authored-by: Celalettin Calis <celalettin1286@gmail.com>
Co-authored-by: Ville Brofeldt <33317356+villebro@users.noreply.github.com>
Co-authored-by: ʈᵃᵢ <tdupreetan@gmail.com>
Co-authored-by: Michael S. Molina <michael.s.molina@gmail.com>
Co-authored-by: mapledan <mapledan829@gmail.com>
Co-authored-by: Igor Khrol <khroliz@gmail.com>
Co-authored-by: Rui Zhao <105950525+zhaorui2022@users.noreply.github.com>
Co-authored-by: Fabien <18534166+frassinier@users.noreply.github.com>
Co-authored-by: Hugh A. Miles II <hughmil3s@gmail.com>
Co-authored-by: OskarNS <soerensen.oskar@gmail.com>
SUMMARY
Currently, the cache key depends only on the IDs of the RLS filters applied for the current user/table. This means that if an existing RLS filter's clause is updated, a user with that RLS filter can still retrieve results that were cached before the update: their RLS filter IDs are unchanged, so the cache key doesn't change. The user therefore sees results based on the old RLS clause, ignoring the new one.
This PR changes the behavior so that cache keys depend on the actual clauses of the RLS filters, so that RLS rules apply correctly after an RLS clause is updated.
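The stale-cache problem described above can be reproduced with a small, self-contained sketch. The `cache_key` helper, the query string, and the filter values below are assumptions made for illustration; Superset's real cache key is built differently and from more inputs.

```python
import hashlib
import json


def cache_key(query, rls_parts):
    # Hypothetical helper: hash the query together with the RLS-derived
    # parts. This is an illustration, not Superset's key function.
    payload = json.dumps({"query": query, "rls": rls_parts}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


query = "SELECT * FROM sales"

# One RLS filter whose clause is edited in place; its id stays the same.
old_filter = {"id": 7, "clause": "region = 'EU'"}
new_filter = {"id": 7, "clause": "region = 'US'"}

# Keyed on ids: the key is identical before and after the edit, so results
# cached under the old clause would still be served.
id_key_old = cache_key(query, [old_filter["id"]])
id_key_new = cache_key(query, [new_filter["id"]])
print(id_key_old == id_key_new)

# Keyed on clauses: the edit changes the key, so the stale entry is skipped.
clause_key_old = cache_key(query, [old_filter["clause"]])
clause_key_new = cache_key(query, [new_filter["clause"]])
print(clause_key_old == clause_key_new)
```

In this sketch the ID-based keys compare equal while the clause-based keys do not, which is the behavior change this PR is after.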
BEFORE/AFTER SCREENSHOTS OR ANIMATED GIF
TESTING INSTRUCTIONS
ADDITIONAL INFORMATION