[Cloud Posture] Deprecate Elasticsearch transform in Cloud security posture plugin #153875
Comments
Pinging @elastic/kibana-cloud-security-posture (Team:Cloud Security)
Hey @opauloh, we had a similar discussion recently, please go over the following issues:
Thanks @CohenIdo. After reading each issue carefully, I identified one solution that hasn't been explored yet: using one hash field for rule.id + resource.id combined with the collapse query option. Back in the AWP team, we used collapse for the session viewer plugin, as can be seen here, to aggregate the Linux events by unique sessions, and it was a performant solution that worked properly even with the index having millions of records. The only requirement is that collapse works as desired with a single field only, which is why we would need a new field for the grouping (in this case a hash of rule.id + resource.id).
Very interesting Paulo! Can't wait to see the results!
Depending on when we ship this, it could solve a problem we have with ILMs on serverless |
@opauloh can we also track backporting event.code creation to previous packages? |
I'm closing this ticket since we conducted a POC making use of collapse to query data directly from the data stream index and concluded that there are a few issues with this approach.
Summary of our learnings:
The collapse API works great for tables, as it can collapse data by an identifier key:
(Screenshots: before collapse, after collapse.)
collapse: {
field: 'event.code',
inner_hits: {
name: 'latest_result_evaluation',
size: 1,
sort: [{ '@timestamp': 'desc' }],
},
},
However, two problems were found:
Issue 1: Limit of aggregated data for dashboards and grouped table
In order to have our dashboard show the correct information, we need to perform an aggregation on the identifier key, and then a sub-aggregation with top_hits on the latest event:
unique_event_code: {
terms: {
field: 'event.code',
size: 65000,
},
aggs: {
latest_result_evaluation: {
top_hits: {
_source: ['result.evaluation'],
size: 1,
sort: [{ '@timestamp': 'desc' }],
},
},
},
},
That query with a time range filter of This means that when attempting to insert 70k findings records (with 51k unique findings), the ungrouped table worked as expected using collapse, but the dashboard and the grouped-by-resource table didn't work, throwing the following error in the logs:
Issue 2: Filtering by result.evaluation would not guarantee showing the most up-to-date data
If there are multiple findings that were remediated, or that moved from a passed state to a failed state, adding a filter could show outdated data.
Example: The most up-to-date finding for this unique key is But when filtering for
Conclusion
These two issues are a showstopper for moving forward with the collapse approach. Even if we can think of a solution for problem number 2, telemetry data already tells us that the limit of 65k findings in a time range of 26 hours won't work for some users, and it would prevent future enhancements such as adding more "Grouped by" visualizations to the findings. The final conclusion is that, with the current model, we don't have a way of querying directly from the data stream index without hitting memory limits for a large data set. This means that the use of
The code used during the attempt is on this PR, which is now closed.
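Issue 2 above can be illustrated with a small in-memory sketch. It is not the plugin's code: the `Finding` type, the `latestPerKey` helper, and the sample records are all made up for illustration, and the helper only mimics what collapse does (keep the most recent record per key). It assumes a query filter is applied before the collapse step, which is the behaviour the issue describes.

```typescript
// In-memory illustration only; the records and names below are hypothetical.
interface Finding {
  eventCode: string; // stands in for event.code
  timestamp: number; // stands in for @timestamp (epoch millis)
  evaluation: 'passed' | 'failed'; // stands in for result.evaluation
}

// Mimics collapse: keep the most recent finding per eventCode.
function latestPerKey(findings: Finding[]): Finding[] {
  const latest = new Map<string, Finding>();
  for (const f of findings) {
    const seen = latest.get(f.eventCode);
    if (!seen || f.timestamp > seen.timestamp) {
      latest.set(f.eventCode, f);
    }
  }
  return Array.from(latest.values());
}

const sampleFindings: Finding[] = [
  { eventCode: 'k1', timestamp: 1, evaluation: 'failed' },
  { eventCode: 'k1', timestamp: 2, evaluation: 'passed' }, // remediated later
];

// Filter first, then collapse: the order a query filter plus collapse applies.
const filterThenCollapse = latestPerKey(
  sampleFindings.filter((f) => f.evaluation === 'failed')
);

// Collapse first, then filter: what the findings table actually needs.
const collapseThenFilter = latestPerKey(sampleFindings).filter(
  (f) => f.evaluation === 'failed'
);

console.log(filterThenCollapse.length); // 1: the stale failed finding is shown
console.log(collapseThenFilter.length); // 0: the latest state is passed
```

The two orders disagree as soon as a finding changes state inside the time window, which is why a plain filter on result.evaluation cannot be trusted on top of collapse.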
Thank you @opauloh for taking the time to summarize your conclusions!
Summary
This is a proposal to deprecate the use of the Elasticsearch transform in the Cloud security posture plugin. Currently, the transform is used to generate the latest findings for each resource.id + rule.id, which are then stored in the logs-cloud_security_posture.findings_latest-* index.
However, the use of transforms adds a layer of complexity to testing and maintenance. In addition, we have been facing issues where the transform doesn't recover by itself when upgrading Elastic Stack versions.
Our transform has a max_age of 26h, with resource.id and rule.id as unique keys. However, we can achieve the same results using Elasticsearch queries directly in the logs-cloud_security_posture.findings-* index by using an @timestamp filter with an aggregation to group the findings by resource.id and rule.id and retrieve the latest finding for each group.
Benefits:
Approach
There's one solution that wasn't explored yet: using one hash field for rule.id + resource.id combined with the collapse query option in the Elasticsearch queries. The benefit of using collapse is that, unlike grouping with aggregations, it doesn't affect sorting or aggregations, so we don't have any regression in the experience we provide in the dashboard or the findings table.
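As a rough sketch of what such a request could look like, the helper below builds a search body that collapses on a single field and sorts by @timestamp descending, so the top document per key is the latest one. The function name, the `CollapsedSearchBody` type, and the default values are assumptions made for illustration; only the `collapse`, `sort`, and `range` request fields come from the Elasticsearch search API.

```typescript
// Hypothetical helper; names, types and defaults are illustrative assumptions.
interface CollapsedSearchBody {
  query: { bool: { filter: Array<Record<string, unknown>> } };
  collapse: { field: string };
  sort: Array<Record<string, 'asc' | 'desc'>>;
  size: number;
}

function buildLatestFindingsQuery(
  uniqueField: string,
  maxAge = '26h',
  size = 500
): CollapsedSearchBody {
  return {
    query: {
      bool: {
        // Only consider findings inside the time window.
        filter: [{ range: { '@timestamp': { gte: `now-${maxAge}` } } }],
      },
    },
    // Collapse keeps the top-sorted document for each value of uniqueField.
    collapse: { field: uniqueField },
    // Newest first, so the kept document is the latest finding per key.
    sort: [{ '@timestamp': 'desc' }],
    size,
  };
}

const requestBody = buildLatestFindingsQuery('event.code');
console.log(JSON.stringify(requestBody, null, 2));
```

The body would be passed to a search against the findings data stream; how that call is made depends on the client in use and is left out here.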
Back in the AWP team, we used collapse for the session viewer plugin, as can be seen here, to aggregate the Linux events by unique sessions, and it was a performant solution that worked properly even with the index having millions of records.
The only requirement is that collapse works as desired with a single field only, which is why we would need a new field for the grouping (in this case a hash of rule.id + resource.id).
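One possible way to derive such a single field is sketched below. The `findingKey` name, the choice of SHA-256, and the JSON encoding of the pair are assumptions for illustration, not what the integration actually ships; the sketch uses Node's built-in `crypto` module.

```typescript
import { createHash } from 'node:crypto';

// Hypothetical helper: derives one collapse key from rule.id and resource.id.
// SHA-256 and the JSON encoding are illustrative choices only.
function findingKey(ruleId: string, resourceId: string): string {
  // Encoding the pair as a JSON array keeps ("ab", "c") distinct from ("a", "bc"),
  // which plain string concatenation would not.
  return createHash('sha256')
    .update(JSON.stringify([ruleId, resourceId]))
    .digest('hex');
}

const exampleKey = findingKey('rule-1', 'resource-1');
console.log(exampleKey.length); // 64 hex characters
```

The same pair always maps to the same key, so every finding for a given rule and resource collapses into one group.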
Suggestion: We can use event.code as the unique field.
Tasks