
Support shard request cache for queries with DLS and FLS #70191

Merged
merged 56 commits into elastic:master on Jun 28, 2021

Conversation

ywangd
Member

@ywangd ywangd commented Mar 10, 2021

Today the shard request cache is disabled by RequestInterceptor when it detects that any DLS/FLS is involved in the query. We'd like to enable the cache with the changes proposed by this PR. This is labelled as "draft" because we would like to validate the overall approach before polishing it. In a nutshell, the changes are as follows:

  1. Add a new cacheModifier field to ShardSearchRequest.
  2. The ShardSearchRequestInterceptor sets the value of cacheModifier by essentially serialising the DLS queries and FLS field definitions.
  3. The cacheModifier field becomes part of the ShardSearchRequest#cacheKey method. Hence two otherwise identical queries will be cached under different keys if their cacheModifiers differ.
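The key-derivation idea in steps 1–3 can be sketched in isolation. This is a minimal standalone illustration, not the actual ShardSearchRequest code; the class and method names and the SHA-256 choice are assumptions for the sketch.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Arrays;

// Standalone sketch only: not the real ShardSearchRequest#cacheKey code.
// Two otherwise identical requests get different cache keys when their
// serialised DLS/FLS "cache modifier" bytes differ.
public class CacheKeySketch {
    public static byte[] cacheKey(String requestBytes, String cacheModifier) {
        try {
            MessageDigest digest = MessageDigest.getInstance("SHA-256");
            digest.update(requestBytes.getBytes(StandardCharsets.UTF_8));
            if (cacheModifier != null) {
                // The serialised DLS queries / FLS field definitions are folded in here
                digest.update(cacheModifier.getBytes(StandardCharsets.UTF_8));
            }
            return digest.digest();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String query = "{\"match_all\":{}}";
        byte[] plain = cacheKey(query, null);
        byte[] withDls = cacheKey(query, "{\"term\":{\"dept\":\"eng\"}}");
        // Same query, different DLS context: distinct cache entries
        System.out.println(Arrays.equals(plain, withDls)); // prints "false"
    }
}
```

Because identical modifier bytes always hash to the same key, two requests from users with byte-for-byte identical DLS/FLS permissions can still share a cache entry.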

@tvernum came up with the overall idea and did most of the code changes. I have only done some tweaks so far. But I'll keep working on it and see it through (if we agree on the proposed changes).

Since this is just a draft, there are many things we glossed over. Other than tests and docs, here are a few things that we talked about but didn't have time to work on:

  • How does this work with aliases? For example, two aliases pointing to the same concrete index but with different DLS/FLS applied to them.
  • We didn't attempt to normalise the DLS queries and FLS field definitions. Hence it is possible that two functionally identical queries would result in different cache entries. The same thing can happen with FLS field definitions. We generally consider this acceptable because the overall cache strategy favours safety over efficiency.
  • The computation of CacheKey itself may need to be cached if it proves expensive enough.
  • We haven't evaluated whether increasing the CacheKey size would negatively impact the existing cache.
  • Cache efficiency - Queries with DLS/FLS will increase the variety of queries, which could lead to a higher cache eviction rate or even cache thrashing (think templated DLS queries). But we are not sure whether the concern is justified, because a similar issue can already happen today if users send in a large number of different queries.

Resolves: #44581

tvernum and others added 4 commits February 23, 2021 18:15
Historically we have disabled the shard request cache when FLS/DLS is
enabled.

This commit changes the behaviour so that caching is enabled, but
takes into account the FLS/DLS context of the request.
Contributor

@jimczi jimczi left a comment


Thanks for tackling this @ywangd. I left a comment on how we plug in the cache modifier. I think we can make it cleaner with a clear extension point in our plugins.

Contributor

@mayya-sharipova mayya-sharipova left a comment


@ywangd Thanks for the PR! Exciting development! Nice to see progress on this! Your approach looks simple and understandable to me, but I wonder if what Jim suggested is a cleaner way.

How does this work with aliases? For example, two aliases pointing to the same concrete index but with different DLS/FLS applied to them.

Wouldn't bytes in ShardSearchRequest.cacheModifier be different in this case for each different alias?

@ywangd
Member Author

ywangd commented Mar 17, 2021

@elasticmachine update branch

@ywangd
Member Author

ywangd commented Mar 21, 2021

@jimczi I updated the PR based on your suggestion of adding a new SearchPlugin extension point. A new class, DlsFlsRequestCacheDifferentiator, is added to support this new extension point. Its code logic is largely a copy/paste from FieldAndDocumentLevelSecurityRequestInterceptor, and most other changes remain the same. Please let me know whether the new changes look right to you. Once we agree on the high level approach, I'll work on this PR to make it formal. Thanks!

@ywangd
Member Author

ywangd commented Mar 22, 2021

I just realised a BWC implication with the approach of the SearchPlugin extension point. The extension works by hooking into the cacheKey computation process without changing the ShardSearchRequest itself. This means a node where the change is available will send the ShardSearchRequest as is to a remote node. On the remote node, we have three possible scenarios:

  1. The remote node also has changes of this PR. This is the ideal case and everything should just work.
  2. The remote node does not have changes of this PR, but does have changes of Improve shard level request cache efficiency #69505 (v6.8.15 or v7.11.2+). Things will still work because the remote node will intercept the ShardSearchRequest to disable the request cache.
  3. The remote node does not have any of the above changes (e.g. v7.11.0). This is where things will break, because the remote node does not intercept the ShardSearchRequest and caching will behave as if there were no DLS/FLS.

Scenario 3 basically says that we cannot backport the changes of this PR into 7.x, because we support 7.x talking to any v7.1+ node, which means we would run into scenario 3. Since we support wire compatibility between the current major and the last minor of the previous major, these changes should be safe in v8.0, because it should only communicate with 7.last nodes, which is the safe scenario 2.

@tvernum
Contributor

tvernum commented Mar 22, 2021

Scenario 3 basically says that we cannot backport changes of this PR into 7.x

For non-CCS cases you should be able to work around this with a simple check on the ClusterState (DiscoveryNodes.getMinNodeVersion): this new behaviour should simply be disabled if the cluster has not completely upgraded to a supported version.

I'm not sure how easy it is to do the same for CCS, but we can work something out.
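The non-CCS workaround can be sketched as follows. This is a standalone illustration: the Version record and the threshold constant are simplified stand-ins for the real cluster-state check via DiscoveryNodes.getMinNodeVersion() and the actual release version.

```java
import java.util.Collections;
import java.util.List;

// Standalone sketch of the suggested gate: only use the new DLS/FLS-aware cache
// key once every node in the cluster understands it. The Version record and the
// threshold constant are simplified stand-ins for the real check via
// DiscoveryNodes.getMinNodeVersion().
public class VersionGateSketch {
    record Version(int major, int minor, int patch) implements Comparable<Version> {
        @Override
        public int compareTo(Version o) {
            int c = Integer.compare(major, o.major);
            if (c != 0) return c;
            c = Integer.compare(minor, o.minor);
            return c != 0 ? c : Integer.compare(patch, o.patch);
        }
    }

    // Illustrative threshold: the first version that intercepts DLS/FLS requests
    static final Version SUPPORTED_SINCE = new Version(7, 11, 2);

    static boolean dlsFlsCachingAllowed(List<Version> nodeVersions) {
        Version minNodeVersion = Collections.min(nodeVersions);
        return minNodeVersion.compareTo(SUPPORTED_SINCE) >= 0;
    }

    public static void main(String[] args) {
        List<Version> mixed = List.of(new Version(7, 14, 0), new Version(7, 11, 0));
        List<Version> upgraded = List.of(new Version(7, 14, 0), new Version(7, 11, 2));
        System.out.println(dlsFlsCachingAllowed(mixed));    // prints "false"
        System.out.println(dlsFlsCachingAllowed(upgraded)); // prints "true"
    }
}
```

Gating on the minimum node version means the behaviour flips on automatically once the last old node leaves the cluster, with no per-request negotiation needed.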

@ywangd
Member Author

ywangd commented Mar 22, 2021

Scenario 3 basically says that we cannot backport changes of this PR into 7.x

For non-CCS cases you should be able to work around this with a simple check on the ClusterState (DiscoveryNodes.getMinNodeVersion): this new behaviour should simply be disabled if the cluster has not completely upgraded to a supported version.

I'm not sure how easy it is to do the same for CCS, but we can work something out.

Right. I just had another thought on it and found that scenario 3 is not specific to the search plugin extension approach. The same thing can happen with the original approach of adding a cacheModifier to ShardSearchRequest. We need to handle the BWC issue regardless, which means we cannot delete the interceptor code 😞 because it needs to keep working depending on the minNodeVersion.

I'll comment on the CCS in a separate channel.

@ywangd
Member Author

ywangd commented Jun 28, 2021

@tvernum I have updated the PR as discussed, i.e. disable caching for stored scripts, but otherwise build the cache key using the evaluated script result. The main changes are in DocumentPermissions and ShardSearchRequestInterceptor. Also added some more tests to cover the new scenarios.
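The "disable for stored scripts, otherwise use the evaluated result" rule can be sketched in isolation. Everything here is an illustrative stand-in, not the real DocumentPermissions or ShardSearchRequestInterceptor code, and the stored-script detection is a simplified assumption for the sketch.

```java
import java.util.Optional;

// Standalone sketch of the behaviour described above; the method, the template
// format, and the stored-script detection are illustrative stand-ins, not the
// real DocumentPermissions or ShardSearchRequestInterceptor code.
public class DlsCacheDecisionSketch {
    // Returns the cache-key component for a DLS query template, or empty if the
    // request cache must stay disabled for this query.
    static Optional<String> cacheKeyComponent(String dlsQueryTemplate, String username) {
        if (dlsQueryTemplate.contains("\"id\":")) {
            // A stored script (referenced by id) can change server-side without the
            // request changing, so its evaluated result is not safe to cache against.
            return Optional.empty();
        }
        // Inline templates are evaluated eagerly; the evaluated result is what
        // contributes to the cache key.
        return Optional.of(dlsQueryTemplate.replace("{{username}}", username));
    }

    public static void main(String[] args) {
        System.out.println(cacheKeyComponent("{\"term\":{\"owner\":\"{{username}}\"}}", "alice"));
        // prints Optional[{"term":{"owner":"alice"}}]
        System.out.println(cacheKeyComponent("{\"template\":{\"id\":\"my_script\"}}", "alice").isPresent());
        // prints "false"
    }
}
```

Keying on the evaluated result rather than the raw template means two users whose templates evaluate to the same query can share a cache entry, while different evaluations stay isolated.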

@ywangd ywangd requested a review from tvernum June 28, 2021 05:18
public void buildCacheKey(StreamOutput out) throws IOException {
assert false == (queries == null && limitedByQueries == null) : "one of queries and limited-by queries must be non-null";
if (queries != null) {
assert evaluatedQueries != null : "queries are not evaluated";
Contributor


This feels wrong. It means that it's only possible to call this method if you have called some other method first.

But in fact there's no reason for the caller to know that they should do that, or to even expect them to be linked.

It might be what we need to do right now, but I'm very reluctant to leave it this way.

The obvious answers are to either:

  1. Inject enough services into the constructor (or a builder) to evaluate the queries immediately
  2. Split this class into 2, so that there's a separate EvaluatedDocumentPermissions class that implements CacheKey.

I don't know if either of those is feasible.

Member Author


It's certainly non-ideal, but it is not really feasible to go with either of your options. I thought about the first one, but it cascades changes all over the place, and it is questionable to inject ScriptService everywhere. The second option does not completely get rid of the "you must call some other method first" problem; it just moves it to a different place.

I had another idea initially but didn't go with it, though maybe it is a better approach after all: we can modify the buildCacheKey signature to take User and ScriptService. They are not really needed at the code level in most places, which is why I didn't go with it: they seemed superfluous, and I considered the problem you raised less of an issue (e.g. some objects need to be initialised after instantiation to be really usable). It's a balancing act; right now I am OK with either one.

Member Author


As discussed, we agree to change the signature of buildCacheKey to pass in a DlsQueryEvaluationContext. Overall this change is the least intrusive.
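The agreed shape can be sketched as follows: buildCacheKey receives the evaluation context as a parameter, so the hidden "evaluate first" ordering requirement disappears. Only the name DlsQueryEvaluationContext comes from the discussion above; its fields and everything else in this sketch are assumptions.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.List;

// Sketch of the agreed direction: buildCacheKey takes an evaluation context, so
// callers never have to remember to "evaluate the queries first". Only the name
// DlsQueryEvaluationContext appears in the discussion; its shape and the rest
// of this class are illustrative assumptions.
public class BuildCacheKeySketch {
    // Stand-in for the real evaluation context (user, script service, etc.)
    record DlsQueryEvaluationContext(String username) {}

    record DocumentPermissions(List<String> queryTemplates) {
        void buildCacheKey(ByteArrayOutputStream out, DlsQueryEvaluationContext context) {
            for (String template : queryTemplates) {
                // Evaluation happens on demand with the supplied context: no hidden
                // "must call evaluate() before this" ordering requirement remains.
                String evaluated = template.replace("{{username}}", context.username());
                out.writeBytes(evaluated.getBytes(StandardCharsets.UTF_8));
            }
        }
    }

    public static void main(String[] args) {
        var permissions = new DocumentPermissions(List.of("{\"term\":{\"owner\":\"{{username}}\"}}"));
        var out = new ByteArrayOutputStream();
        permissions.buildCacheKey(out, new DlsQueryEvaluationContext("alice"));
        System.out.println(out.toString(StandardCharsets.UTF_8));
        // prints {"term":{"owner":"alice"}}
    }
}
```

Passing the context as a parameter keeps DocumentPermissions free of injected services while still making the method safe to call in any order.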

@ywangd ywangd merged commit 4a8ff0f into elastic:master Jun 28, 2021
ywangd added a commit to ywangd/elasticsearch that referenced this pull request Jun 28, 2021
Shard level request cache is now generally supported for queries with DLS
and/or FLS. The request cache is now enabled following the same rules as a
regular search without DLS/FLS, except in the following scenarios where the
request cache will still be disabled:
1. The DLS query uses a stored script
2. The search targets any remote indices
3. The cluster has any nodes older than v7.11.2

It is worth noting that the caching behaviour overall favours safety over
efficiency. This means two functionally equivalent sets of DLS or FLS
permissions can possibly result in different cache entries. We consider this a
better tradeoff given the higher priority of correctness and security.
ywangd added a commit that referenced this pull request Jun 28, 2021
…4618)

Shard level request cache is now generally supported for queries with DLS
and/or FLS. The request cache is now enabled following the same rules as a
regular search without DLS/FLS, except in the following scenarios where the
request cache will still be disabled:
1. The DLS query uses a stored script
2. The search targets any remote indices
3. The cluster has any nodes older than v7.11.2

It is worth noting that the caching behaviour overall favours safety over
efficiency. This means two functionally equivalent sets of DLS or FLS
permissions can possibly result in different cache entries. We consider this a
better tradeoff given the higher priority of correctness and security.
ywangd added a commit to ywangd/elasticsearch that referenced this pull request Jun 29, 2021
The test can now be re-enabled since the changes from elastic#70191 are
now backported.
ywangd added a commit to ywangd/elasticsearch that referenced this pull request Jun 30, 2021
Shard level request cache is now generally supported after elastic#70191
ywangd added a commit that referenced this pull request Jul 6, 2021
Shard level request cache is now generally supported after #70191
This PR updates the security limitation to state the effect.
elasticsearchmachine pushed a commit to elasticsearchmachine/elasticsearch that referenced this pull request Jul 6, 2021
Shard level request cache is now generally supported after elastic#70191
This PR updates the security limitation to state the effect.
ywangd added a commit to ywangd/elasticsearch that referenced this pull request Jul 6, 2021
Shard level request cache is now generally supported after elastic#70191
This PR updates the security limitation to state the effect.
elasticsearchmachine added a commit that referenced this pull request Jul 6, 2021
Shard level request cache is now generally supported after #70191
This PR updates the security limitation to state the effect.

Co-authored-by: Yang Wang <yang.wang@elastic.co>
Labels
>enhancement :Security/Authorization Roles, Privileges, DLS/FLS, RBAC/ABAC Team:Security Meta label for security team v7.14.0 v8.0.0-alpha1
Development

Successfully merging this pull request may close these issues.

Enable shard request cache with DLS and FLS
7 participants