[SIEM] Meta issue for saved object needs for large lists #64715
Comments
Pinging @elastic/siem (Team:SIEM)
Pinging @elastic/kibana-platform (Team:Platform)
Pinging @elastic/endpoint-response (Team:Endpoint Response)
Do we need these lists to be separate documents/saved-objects? Are we generally consuming these lists in their entirety?
The structure for lists is different from the structure for list items. If we want to de-normalize it, we would have ...or... we would give list and list items a single superset mapping, where we branch off of a "type" field within the same saved object collection, store both list and list item together within the superset, and query against "type" to figure out when a document is a list vs. a list item that belongs to a list, all within a single saved object type/index. Either way we still need "delete by query" and "search after", but I can see either of those two options as a way to reduce down to a superset mapping, or to reduce our existing data index implementation from the two data indexes we have right now to a single data index.
If the question is whether we are going to iterate over all of the list items all the time, the answer is no. The user will have multiple lists, and each list contains multiple list items. The user can do CRUD against any individual list, against any individual list item within a list, or against subsets of the data. Example CRUD operations against a single list are things like:
Example CRUD operations against a single list item are things like:
All of those operations work against the merged data index code we have right now, which is behind a feature flag; if you want to play around with it, see these curl scripts
"Reviewed by Frank Hassanabad on 7/29/2020, still valid as of this date." We have implemented several workarounds and TODOs on our side, but would still like these things added for us to utilize so we can remove tech debt.
Updated the issue now that we have search_after support #86301
This is a meta ticket around ad-hoc requirements and feature requests from Elastic Security for saved object support of large lists, such as large list values.
Meta issue for saved objects improvements, which has a lot of these requests from other teams:
#61716
Data index implementation we have merged right now that does not use saved objects, kept behind a feature flag to keep teams moving and not blocked:
#62552
Support for > 10k (Nk) objects (done in #86301)
Use case:
As a list user, I will be uploading different large list values that contain IP, host names, etc... These list values can contain > 10k items and this could be even larger such as >200k. As a list user I will be uploading and appending/changing list values as well as exporting > 10k items at a time using the provided REST streaming API.
Possible technical solutions:
Support "search after" within the find API, or expose "search after" directly as a new complementary API next to the existing find API.
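For illustration, a minimal sketch of the "search after" pagination pattern being asked for, simulated here against an in-memory collection rather than a real cluster (all names are hypothetical, not the actual find API):

```python
# Hypothetical sketch (not the Kibana saved objects API) of how "search
# after" pagination gets past a 10k from/size window: each page is
# requested with the sort value of the last hit of the previous page.

def search_page(items, after=None, size=1000):
    """Simulate one Elasticsearch page, sorted on a unique tiebreaker.

    `after` plays the role of the `search_after` request parameter:
    the sort value of the last hit on the previous page.
    """
    ordered = sorted(items, key=lambda doc: doc["id"])
    if after is not None:
        ordered = [doc for doc in ordered if doc["id"] > after]
    return ordered[:size]


def iterate_all(items, size=1000):
    """Drain every document page by page, the way a find API built on
    search_after could, with no upper bound on total results."""
    results, after = [], None
    while True:
        page = search_page(items, after=after, size=size)
        if not page:
            return results
        results.extend(page)
        after = page[-1]["id"]  # cursor for the next page


# Large list values stream out without ever using from/size offsets.
docs = [{"id": i, "value": f"10.0.0.{i % 256}"} for i in range(25)]
assert [d["id"] for d in iterate_all(docs, size=7)] == list(range(25))
```

The same loop shape works whether the cursor comes from an Elasticsearch `search_after` response or from a new complementary saved objects API.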
Support for delete by query
Use case:
As a list user, I will be uploading multiple lists, where each list can contain large list values, using a list_id to disambiguate between the lists. From time to time I will be deleting entire large list values by their list_id.
Possible technical solutions:
Support delete by query. We will have user list items from different lists mixed together within one saved object type and these will be distinguishable from which list they belong to using their "list_id".
These list items will need to be deleted by their key of "list_id", and we would like to delete them all at once rather than calling back and forth to get each list item id and deleting them in batches. Batching causes a lot of network traffic and possible bugs/issues if Kibana is rebooted or errors out halfway through the process. It would be preferable to delete by query all at once and let Elasticsearch do its thing.
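As a sketch of what this would look like, below is the kind of request body the Elasticsearch `_delete_by_query` API takes, which is what we would want an equivalent of for saved objects. The field name "list_id" comes from the description above; everything else is illustrative, not the real schema:

```python
# Illustrative sketch of an Elasticsearch _delete_by_query request body:
# a single term query on "list_id" removes every item of one list
# server-side, instead of Kibana paging ids back and bulk-deleting
# them in batches.

import json

def delete_list_items_body(list_id: str) -> dict:
    # Term query matching every list item that belongs to one list.
    return {"query": {"term": {"list_id": list_id}}}

# e.g. POST /<hypothetical-items-index>/_delete_by_query
print(json.dumps(delete_list_items_body("list-c")))
```

One server-side operation replaces the fetch-ids/bulk-delete round trips, so a Kibana restart mid-way no longer leaves a half-deleted list behind on our side.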
Support update by query
Technically we do not use this yet within our data index implementation, but we would need it if we used a de-normalized format. We currently use a normalized format.
Use case:
As a list user, I will need to update list items selectively in bulk using their list_id, such as the names and descriptions of all of the individual list items that collectively belong to a particular list.
Possible technical solutions:
Support update by query.
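For the bulk edit described above, a hedged sketch of the Elasticsearch `_update_by_query` body we would want a saved objects equivalent of: a term query selects every item of one list by "list_id", and a painless script applies the field change. The "description" field is hypothetical, not the actual list item schema:

```python
# Illustrative sketch of an _update_by_query request body: select all
# items of one list by "list_id" and rewrite a field on each of them
# with a painless script, in one server-side operation.

def update_list_items_body(list_id: str, description: str) -> dict:
    return {
        "query": {"term": {"list_id": list_id}},
        "script": {
            "lang": "painless",
            "source": "ctx._source.description = params.description",
            "params": {"description": description},
        },
    }

body = update_list_items_body("list-c", "hosts seen in the last 30 days")
```

Passing the new value through `params` rather than interpolating it into the script source keeps the script cacheable and avoids injection issues.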