Document how to set data retention / ILM policies for integrations #986

mostlyjason · 2021-08-13T17:50:27Z

Description

When users create integrations, they control the retention of their data through ILM policies set on index templates. It's not obvious how to do this because we abstract away these lower level concepts in the UI. We should document how these concepts connect to integrations and how users can set their data retention.

Collaboration

The docs team will lead producing the content
The product team will provide the initial content and the docs team will edit / review
The docs team will define with product team the structure and location, and the product team will provide the initial content
Other (please describe)

Contact Person:

(We need to have a contact person in the product/development team to provide information about how the item to be documented works.)
@hop-dev

Suggested Target Release

7.16

Stakeholders

(Please list any stakeholders for this issue.)
@mostlyjason @leehinman

Related: elastic/kibana#108554

mostlyjason · 2021-08-13T17:52:48Z

FYI @jethr0null since we discussed this in the data lifecycle working group

hop-dev · 2021-09-07T11:07:03Z

On elastic/kibana#108554 there has been some discussion about a more basic approach for customising data retention across all namespaces using the supplied @custom component templates.

Currently, by default, each data stream that is part of a package has the metrics index lifecycle policy specified in the matching index template. This means that the policy cannot be overridden by specifying it in the *@custom component template. One exception to this default behaviour is a package that defines its own ILM policy as part of installation (for example, the APM package).

Nicholas raised moving the ILM policy to a component template on this PR, which would enable a user to override it with their own policy. In that case, we would be able to consider documenting this option.

The more comprehensive (and complicated!) approach will be to use index templates that are scoped to the namespace. I'll follow up with more detailed steps.

hop-dev · 2021-09-07T19:54:41Z

hop-dev · 2021-09-07T20:00:02Z

@jen-huang on step 4 (iv.) above, I recommend removing managed and managed_by when a user is overriding the index template for an integration's data stream (as the template is now managed by the user). Do you agree with this?

andresrc · 2021-09-09T08:37:45Z

@mostlyjason can you describe the expected collaboration here? We recommend using the provided issue template.

mostlyjason · 2021-09-13T14:43:33Z

@andresrc I updated the description to include the new template format.

@leehinman I tagged you as a stakeholder in the data lifecycle mgmt working group. The goal for this is to document the current process for setting ILM policies for integrations, and we'll refine the UX for this in a later release. Would be nice to get your review on the content above.

hop-dev · 2021-09-22T10:16:12Z

@mostlyjason @joshdover I have laid out rough instructions for the two options we have here:

https://gist.github.com/hop-dev/3e7798fd06c13a9acf36594aa797d4ba

Before I create draft documentation, we need to decide whether we are going to document option 1 or 2 (or both). I'll include a high-level description here, but more detail can be found in the gist above:

Option 1 - use *@custom component templates to set an ILM policy for a data stream across all namespaces. This option is only available for integrations installed by Kibana 7.16.
Option 2 - use index templates to set an ILM policy for one (or all) namespaces. Slightly more complicated, but version agnostic, I believe.

Both options involve the user making changes per data stream, which isn't ideal, but I don't think it can be avoided.
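To make option 1 concrete, here is a minimal sketch of the request a user might send to set an ILM policy through a *@custom component template. It only builds the request path and body; nothing is sent. The template name shape (<type>-<dataset>@custom, without a namespace), the helper name, and the policy name my-metrics-policy are assumptions for illustration and do not come from the gist.

```python
import json


def custom_component_template_request(data_type: str, dataset: str, policy_name: str):
    """Return (path, body) for a PUT that sets an ILM policy via an @custom template."""
    # Assumed name shape: <type>-<dataset>@custom. There is no namespace in the
    # name, so the setting would apply to the data stream in every namespace.
    path = f"_component_template/{data_type}-{dataset}@custom"
    # index.lifecycle.name is the index setting that attaches an ILM policy.
    body = {"template": {"settings": {"index": {"lifecycle": {"name": policy_name}}}}}
    return path, body


# "my-metrics-policy" is a hypothetical policy name.
path, body = custom_component_template_request("metrics", "system.network", "my-metrics-policy")
print("PUT", path)
print(json.dumps(body, indent=2))
```

The printed path and body could then be pasted into Kibana Dev Tools, assuming the named ILM policy already exists.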

mostlyjason · 2021-09-27T15:34:41Z

@hop-dev @joshdover Offering users two different ways seems like it makes the decision process more complex for them. Is there a way we can recommend a single solution that solves most users' needs? When we implement our medium-term solution in the UI, which of these approaches will we implement? That might be best from a long-term perspective.

From a user needs perspective, I think we need the ability to have different ILM policies per namespace. The whole point of using namespaces is to partition different use cases or teams. For example, I want to retain my production logs for a week but my test environment logs for 24 hours. Another example is that team A wants to retain data for a week and team B wants it for 24 hours.
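As a rough illustration of the production/test example above, the sketch below builds one ILM policy body per namespace, each with a delete phase at a different age. The policy names (logs-<namespace>-retention) and the helper are hypothetical; only the retention periods come from the example, and a real policy would likely also define a hot phase with rollover.

```python
import json


def retention_policy(min_age: str) -> dict:
    """Build an ILM policy body that deletes an index once it reaches min_age."""
    return {
        "policy": {
            "phases": {
                "delete": {"min_age": min_age, "actions": {"delete": {}}},
            }
        }
    }


# Retention periods from the example: one week for production, 24 hours for test.
RETENTION_BY_NAMESPACE = {"production": "7d", "test": "24h"}

# Hypothetical policy names, one per namespace.
policy_requests = {
    f"_ilm/policy/logs-{namespace}-retention": retention_policy(min_age)
    for namespace, min_age in RETENTION_BY_NAMESPACE.items()
}

for path, body in policy_requests.items():
    print("PUT", path)
    print(json.dumps(body, indent=2))
```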

I'd advocate either going the index template route or changing how we set up the component templates to allow namespace support. What is the UX impact of this decision or is it purely technical? Would it impact the ability to upgrade templates over time? Would it help to set up some time to discuss the options live?

joshdover · 2021-10-04T13:02:38Z

> I'd advocate either going the index template route or changing how we set up the component templates to allow namespace support. What is the UX impact of this decision or is it purely technical? Would it impact the ability to upgrade templates over time? Would it help to set up some time to discuss the options live?

In order to support namespace-specific customizations, we'll need to create duplicate index templates and component templates for each namespace x data stream. Where we are right now, I doubt we want to be creating these by default since it will result in a very large number of templates for each integration that is installed. There are both technical and UX issues with creating such a large number of templates out of the box:

It creates UX issues in the Index Management UI that are likely solvable (e.g., filtering out managed templates by default); however, it also adds quite a bit of noise to the /_cat/templates API and other template APIs. We could also consider filtering out managed templates from such APIs by default, but it would be considered a breaking change in Elasticsearch and would need to be considered very carefully.
From a technical standpoint, I believe there is an upper bound to the amount of data that Elasticsearch's "cluster state" can reliably handle. This is where all cluster-wide data is stored, including index templates. That said, I think we need to benchmark this or get some assistance from the Elasticsearch team here to determine the actual limitations.

The workaround would be for Fleet to create the namespace-specific templates on-demand as users opt-in to customizations that require them. This is essentially what option (2) is documenting how to do manually.

All that said, I think we need to test these assumptions and really see at which point these UX and Elasticsearch performance problems begin to arise. We can then determine how to proceed with the long-term path forward. Our implementation would be far simpler, less likely to break, and easier to support and document if we created all the namespace-specific templates in all scenarios. IMO, we should try to find a long-term path forward that allows that option.

> When we implement our medium term solution in the UI, which of these approaches will we implement? That might be best from a long term perspective.

Back to this question, I think the answer right now is we'll almost certainly need to either have namespace-specific templates or modify how templates work in some way to avoid that need but still be able to deliver this feature (which feels unlikely to me at this point).

I think it'd be ok to proceed with option (2), even though it is more complicated. I suspect that we'll need to have logic that detects if a user has a template that overrides what they set in our UI (once we have it) and we can handle any manually created templates based on these docs in the same way as other user-created templates.

joshdover · 2021-10-05T14:57:39Z

Yesterday we discussed the different options and the need for further investigation into the long-term solution here, but we didn't make a decision on what to do for phase 0 for 7.16. @mostlyjason any opposition to moving forward with option (2)? I believe it's likely to be compatible with any long-term option we end up choosing and gives users on 7.16 the most flexibility to use namespace-specific ILM policies.

hop-dev · 2021-10-11T14:12:53Z

@joshdover @mostlyjason and I have just met and discussed the way forward:

Decisions

We will document option 2 as it offers namespace specific settings
We will instruct the user to create a component template using the <type>-<dataset>-<namespace>@custom naming convention, e.g. metrics-system.network-production@custom
When duplicating the index template, we will instruct the user to keep it set as "managed"
Currently our index templates are priority 200, we will instruct users to set the namespace specific index template to 250
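The decisions above can be sketched as follows. This is a minimal illustration, not the final documented procedure: it derives the namespace-specific @custom component template name and a duplicated index template body with priority 250. The index pattern shape, the placement of the managed flag under _meta, the helper name, and the base component template names passed in are all assumptions.

```python
import json


def namespace_templates(data_type: str, dataset: str, namespace: str, base_components: list):
    """Return the @custom component template name and a namespace-scoped index template body."""
    name = f"{data_type}-{dataset}-{namespace}"
    custom_component = f"{name}@custom"  # <type>-<dataset>-<namespace>@custom
    index_template = {
        # Assumption: a pattern that matches only this namespace's data stream.
        "index_patterns": [f"{name}*"],
        # Higher than the default priority of 200 so this template wins.
        "priority": 250,
        # The user's @custom template goes last so its settings are applied last.
        "composed_of": [*base_components, custom_component],
        "data_stream": {},
        # Assumption: the "managed" flag lives under _meta and is kept as-is.
        "_meta": {"managed": True},
    }
    return custom_component, index_template


# The base component template names below are hypothetical placeholders.
custom_name, template = namespace_templates(
    "metrics",
    "system.network",
    "production",
    ["metrics-system.network@mappings", "metrics-system.network@settings"],
)
print(custom_name)
print(json.dumps(template, indent=2))
```

Because the namespace-specific template has the higher priority, it would take precedence over the package's default template for data streams that match both patterns.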

Notes for future implementation

We are relying on the user applying their custom settings using the @custom component template; however, we may have to detect and warn the user if they have applied settings to the index template itself.

hop-dev · 2021-10-15T13:05:00Z

@mostlyjason @joshdover here is my first draft of the guide: https://docs.google.com/document/d/1t29yHm5rHHUNTiI4DjgowUiI48LoK3X33SkpssP6ocY/edit?usp=sharing

joshdover · 2021-10-18T15:43:13Z

@dedemorton I believe these docs are ready for you to take from here ^. Let us know if you need anything more from us.

One thing we need to know before merging the related PR for adding the link to the UI is where in the Fleet Guide we'll be linking to. Would it be possible to determine the page URL & anchor id before we write the docs so we can add the link in the UI?

mostlyjason · 2021-10-21T14:04:31Z

Just dropping a link to a doc page that should probably be updated: https://www.elastic.co/guide/en/fleet/7.15/data-streams.html#data-streams-ilm

bmorelli25 · 2021-10-21T16:46:25Z

Hey everyone! I'll be the technical writer helping out with this issue. Thanks for all of the work so far on this.

@hop-dev Thanks for the draft! I'll take a look and let you know if I need anything else.

@joshdover The link Jason provided is probably a safe bet. If we need to move the content later I can set up redirects.

mostlyjason added the Team:Fleet label (Label for the Fleet team) Aug 13, 2021

mostlyjason mentioned this issue Aug 13, 2021

[Integrations] Add a link to ILM policies in the integration policy editor elastic/kibana#108554

Open

3 tasks

jen-huang mentioned this issue Aug 26, 2021

[Fleet] Link to documentation on how to set data retention/ILM policies elastic/kibana#110342

Closed

jen-huang added the v7.16.0 label Aug 26, 2021

hop-dev mentioned this issue Sep 7, 2021

[Fleet] Set default settings in component template instead of the index template elastic/kibana#111197

Merged

andresrc added the needs-input label Sep 9, 2021

mostlyjason assigned hop-dev Sep 13, 2021

bmorelli25 self-assigned this Oct 21, 2021

joshdover mentioned this issue Oct 25, 2021

[Fleet] Add link to integration data retention documentation elastic/kibana#115353

Merged

bmorelli25 mentioned this issue Oct 28, 2021

docs: How to set data retention (ILM) policies for integrations #1194

Merged

bmorelli25 closed this as completed in #1194 Oct 29, 2021

bmorelli25 mentioned this issue Nov 9, 2021

Document data streams and custom index lifecycle policies elastic/apm-server#6553

Merged

hop-dev mentioned this issue Nov 18, 2021

[Fleet] Revert custom ILM policy documentation elastic/kibana#119013

Closed
