[SIEM] Meta issue for saved object needs for large lists #64715
Comments
Pinging @elastic/siem (Team:SIEM)
Pinging @elastic/kibana-platform (Team:Platform)
Pinging @elastic/endpoint-response (Team:Endpoint Response)
Do we need these lists to be separate documents/saved-objects? Are we generally consuming these lists in their entirety?
The structure for lists is different from the structure for list items. If we want to de-normalize it, we would have ...or... we would give list and list items a single superset mapping, where we branch off of a "type" field within the same saved object collection, store both list and list item together within the superset, and query against "type" to figure out when a document is a list vs. a list item that belongs to a list, all within a single saved object type/index. Either way we still need "delete by query" and "search after", but I can see either of those two options as a way to reduce down to a superset mapping, or to reduce our existing data index implementation from the two data indexes we have right now to a single data index.
If the question is whether we are going to iterate over all of the list items all the time, the answer is no. The user will have multiple lists, and each list contains multiple list items. The user can do CRUD against any individual list, against any individual list item within a list, or against subsets of the data. Example CRUD operations against a single list are things like:
Example CRUD operations against a single list item are things like:
All of those operations work against the merged data index code we have right now, which is behind a feature flag; if you want to play around with it, see these curl scripts
"Reviewed by Frank Hassanabad on 7/29/2020, still valid as of this date." We have implemented several workarounds and TODOs on our side, but would still like these things added for us to utilize so we can remove tech debt.
Updated the issue now that we have search_after support #86301
This is a meta ticket around ad-hoc requirements and feature requests from Elastic Security for saved object support of large lists, such as large list values.
Meta issue for saved objects improvements, which has a lot of these requests from other teams:
#61716
Data index implementation we have merged right now that does not use saved objects, kept behind a feature flag to keep teams moving and not blocked:
#62552
Support for > 10k (Nk) objects (done in #86301)
Use case:
As a list user, I will be uploading different large list values that contain IP, host names, etc... These list values can contain > 10k items and this could be even larger such as >200k. As a list user I will be uploading and appending/changing list values as well as exporting > 10k items at a time using the provided REST streaming API.
Possible technical solutions:
Support "search after" within the find API, or expose "search after" directly as a new complementary API next to the existing find API.
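For illustration, a minimal sketch of the "search after" pagination pattern being asked for, simulated here against an in-memory collection rather than a real cluster (all names are hypothetical, not the actual find API):

```python
# Hypothetical sketch (not the Kibana saved objects API) of how "search
# after" pagination gets past a 10k from/size window: each page is
# requested with the sort value of the last hit of the previous page.

def search_page(items, after=None, size=1000):
    """Simulate one Elasticsearch page, sorted on a unique tiebreaker.

    `after` plays the role of the `search_after` request parameter:
    the sort value of the last hit on the previous page.
    """
    ordered = sorted(items, key=lambda doc: doc["id"])
    if after is not None:
        ordered = [doc for doc in ordered if doc["id"] > after]
    return ordered[:size]


def iterate_all(items, size=1000):
    """Drain every document page by page, the way a find API built on
    search_after could, with no upper bound on total results."""
    results, after = [], None
    while True:
        page = search_page(items, after=after, size=size)
        if not page:
            return results
        results.extend(page)
        after = page[-1]["id"]  # cursor for the next page


# Large list values stream out without ever using from/size offsets.
docs = [{"id": i, "value": f"10.0.0.{i % 256}"} for i in range(25)]
assert [d["id"] for d in iterate_all(docs, size=7)] == list(range(25))
```

The same loop shape works whether the cursor comes from an Elasticsearch `search_after` response or from a new complementary saved objects API.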
Support for delete by query
Use case:
As a list user, I will be uploading multiple lists, where each list can contain large list values, using a list_id to disambiguate between the lists. From time to time I will be deleting entire large list values by their list_id.
Possible technical solutions:
Support delete by query. We will have user list items from different lists mixed together within one saved object type and these will be distinguishable from which list they belong to using their "list_id".
These list items will need to be deleted by their key of "list_id", and we would like to delete them all at once rather than calling back and forth to get each list item id and deleting them in batches. Batching causes a lot of network traffic and possible bugs/issues if Kibana is rebooted or errors out halfway through the process. It would be preferable to delete by query all at once and let Elasticsearch do its thing.
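As a sketch of what this would look like, below is the kind of request body the Elasticsearch `_delete_by_query` API takes, which is what we would want an equivalent of for saved objects. The field name "list_id" comes from the description above; everything else is illustrative, not the real schema:

```python
# Illustrative sketch of an Elasticsearch _delete_by_query request body:
# a single term query on "list_id" removes every item of one list
# server-side, instead of Kibana paging ids back and bulk-deleting
# them in batches.

import json

def delete_list_items_body(list_id: str) -> dict:
    # Term query matching every list item that belongs to one list.
    return {"query": {"term": {"list_id": list_id}}}

# e.g. POST /<hypothetical-items-index>/_delete_by_query
print(json.dumps(delete_list_items_body("list-c")))
```

One server-side operation replaces the fetch-ids/bulk-delete round trips, so a Kibana restart mid-way no longer leaves a half-deleted list behind on our side.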
Support update by query
Technically we do not use this yet within our data index implementation, but we would need it if we used a de-normalized format. We currently use a normalized format.
Use case:
As a list user, I will need to update list items selectively in bulk using their list_id, such as the names and descriptions of all of the individual list items that collectively belong to a particular list.
Possible technical solutions:
Support update by query.
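For the bulk edit described above, a hedged sketch of the Elasticsearch `_update_by_query` body we would want a saved objects equivalent of: a term query selects every item of one list by "list_id", and a painless script applies the field change. The "description" field is hypothetical, not the actual list item schema:

```python
# Illustrative sketch of an _update_by_query request body: select all
# items of one list by "list_id" and rewrite a field on each of them
# with a painless script, in one server-side operation.

def update_list_items_body(list_id: str, description: str) -> dict:
    return {
        "query": {"term": {"list_id": list_id}},
        "script": {
            "lang": "painless",
            "source": "ctx._source.description = params.description",
            "params": {"description": description},
        },
    }

body = update_list_items_body("list-c", "hosts seen in the last 30 days")
```

Passing the new value through `params` rather than interpolating it into the script source keeps the script cacheable and avoids injection issues.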