
[Dataset Quality] Provide field limit mitigation option with rollover #190330

Closed
3 tasks done
achyutjhunjhunwala opened this issue Aug 12, 2024 · 14 comments · Fixed by #195561
Assignees
Labels
Feature:Dataset Health Team:obs-ux-logs Observability Logs User Experience Team

Comments

@achyutjhunjhunwala
Contributor

achyutjhunjhunwala commented Aug 12, 2024

📓 Summary

Add a UI for updating the field limit. Things to consider:

  • Validations around how high the limit can be set, and whether a new limit should be proposed to the user.
  • Proper messaging around how users can otherwise fix this, e.g. by reducing the number of fields sent.
  • Rolling over the data stream so the changes take effect.

Acceptance Criteria

  • A section to take user input to increase the field limit, as per the wireframes
  • The Degraded Fields table on the main page should start referring to the last backing index in the data stream
  • Show the user a message to roll over once they have applied a new limit. (P.S. - It's not possible to preserve this state, meaning that if the user updated the limit, did not roll over, and comes back later, we will not be able to show this message.)

🎨 Design

Figma

@achyutjhunjhunwala achyutjhunjhunwala added Team:obs-ux-logs Observability Logs User Experience Team Feature:Dataset Health labels Aug 12, 2024
@elasticmachine
Contributor

Pinging @elastic/obs-ux-logs-team (Team:obs-ux-logs)

@achyutjhunjhunwala achyutjhunjhunwala changed the title [Dataset Quality] Add field limit increase mitigation [Dataset Quality] Determine field limit heuristics and provide increase mitigation Sep 2, 2024
@achyutjhunjhunwala achyutjhunjhunwala self-assigned this Sep 4, 2024
@achyutjhunjhunwala achyutjhunjhunwala changed the title [Dataset Quality] Determine field limit heuristics and provide increase mitigation [Dataset Quality] Provide field limit mitigation option with rollover Sep 10, 2024
@achyutjhunjhunwala
Contributor Author

achyutjhunjhunwala commented Sep 25, 2024

cc: @gbamparop @LucaWintergerst @flash1293

As discussed during the weekly sync, updating the discussions here

  1. The option to apply a new limit will only be available for integrations. Non-integration datasets will only show the text about the field limit, without the option to apply it.
  2. After careful consideration, I propose we suggest Current Limit + 30% to the user as the starting point for the new field limit.
  3. If the user selects a new limit and clicks Apply, the following will happen:
  • Update the custom component template associated with the integration's index template. If the custom component template does not exist, it will be created.
  • Update the last backing index with the new setting as well.
    • If this part succeeds, do nothing.
    • If updating the last backing index fails, show the user an option to trigger a rollover. @patpscal Can you help me here with a mockup for triggering the rollover? This would basically be a second action.
      @mdbirnstiehl I would also need copy for the text here, as this would be considered a partial success: the component template was successfully updated but not the last backing index, hence a manual rollover is required.
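The "Current Limit + 30%" heuristic from point 2 can be sketched roughly as follows. This is a hypothetical illustration in Python, not the actual Kibana implementation; the function name and rounding choice are my own.

```python
def propose_new_field_limit(current_limit: int, increase_pct: float = 0.30) -> int:
    """Suggest a new total-fields limit: the current limit plus ~30%,
    rounded to the nearest integer (rounding policy is an assumption)."""
    return round(current_limit * (1 + increase_pct))

# e.g. a data stream currently capped at 1000 fields would be offered 1300
print(propose_new_field_limit(1000))
```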

Image

The text I used, for which I need copy, is:

Image

  4. One of the mitigations is the link to update the pipeline for integrations. This should ideally take users to a page where they can either edit or create a @custom pipeline. This solution does not currently exist, and a feature request has been created.

An alternate solution would be to display the name of the custom pipeline the user should create, with a copy icon next to it.
There should be a link that takes users to the current create-pipeline page.

Text must be presented telling them they can copy the pipeline name and create a custom pipeline to handle this situation.
@mdbirnstiehl @patpscal Can you both help me with this point 4 in terms of design and copy?

Image

achyutjhunjhunwala added a commit that referenced this issue Sep 26, 2024
## Summary

As part of this [issue](#190330), we need to add a link to the Elasticsearch Mapping Settings Limit page here -
https://www.elastic.co/guide/en/elasticsearch/reference/current/mapping-settings-limit.html

To avoid mixing this change with my original PR, I'm creating a separate PR here.
kibanamachine pushed a commit to kibanamachine/kibana that referenced this issue Sep 26, 2024
(cherry picked from commit 7d667f3)
kibanamachine added a commit that referenced this issue Sep 26, 2024
…#194178)

# Backport

This will backport the following commits from `main` to `8.x`:
- [[Dataset Quality] Add new doc link for mapping limits
(#194156)](#194156)


### Questions ?
Please refer to the [Backport tool
documentation](https://github.com/sqren/backport)


Co-authored-by: Achyut Jhunjhunwala <achyut.jhunjhunwala@elastic.co>
@mdbirnstiehl
Contributor

mdbirnstiehl commented Oct 8, 2024

@mdbirnstiehl I would also need copy for the text here, as this would be considered a partial success: the component template was successfully updated but not the last backing index, hence a manual rollover is required.

How about:

Changes not applied to new data

The component template was successfully updated with the new field limit, but the changes were not applied to the most recent backing index.
Perform a rollover to apply your changes to new data.

  4. One of the mitigations is the link to update the pipeline for integrations. This should ideally take users to a page where they can either edit or create a @custom pipeline. This solution does not currently exist, and a feature request has been created.

An alternate solution would be to display the name of the custom pipeline the user should create, with a copy icon next to it. There should be a link that takes users to the current create-pipeline page.

Do we have a design for this one? Not sure exactly what it will look like and what text we will need here.

@achyutjhunjhunwala
Contributor Author

@mdbirnstiehl Thanks for the copy.

I would need one more copy from you based on the discussion we had today.

When a user fixes a field limit issue, we apply the fix on the current backing index as well. This means the issue still exists, and when the user opens the popup they would see our fallback potential cause, which is called ignore_malformed.

Hence, in the call today @flash1293 had a great suggestion: we can display text similar to this when ignore_malformed happened.

Image

@patpscal Can you help me and Mike here with a change in design as well.

Previously we were displaying this

Image

But now, for the pipeline part, we won't display a link; as mentioned above, we will display the pipeline name with a copy icon next to it, along with text telling the user to go there and update the pipeline.

@mdbirnstiehl
Contributor

mdbirnstiehl commented Oct 9, 2024

I would need one more copy from you based on the discussion we had today.

When a user fixes a field limit issue, we apply the fix on the current backing index as well. This means the issue still exists, and when the user opens the popup they would see our fallback potential cause, which is called ignore_malformed.

@achyutjhunjhunwala Will this only apply to field limit changes? Can we be specific, or do we want to avoid mentioning the specific settings?

How about something like this:

If you've recently updated your field limit settings, this quality issue may not be relevant. Rollover the data stream to verify.

@achyutjhunjhunwala
Contributor Author

This makes more sense

@achyutjhunjhunwala
Contributor Author

achyutjhunjhunwala commented Oct 10, 2024

@mdbirnstiehl As Patri is on PTO, I came up with a rough design to explain what I meant.

For non-integrations, we would recommend a very generic custom pipeline based on datastreamType@custom

Image

For integrations, we would recommend an explicit custom pipeline name, which is datastreamType-datastreamDataset@custom

Image
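The naming convention described above could be sketched as a small helper. This is a hypothetical function (name and signature are my own); the `@custom` suffix convention itself is the one discussed in the thread.

```python
from typing import Optional

def custom_pipeline_name(ds_type: str, ds_dataset: Optional[str] = None) -> str:
    """Build the @custom pipeline name to suggest to the user.

    Integrations get a dataset-specific pipeline (e.g. logs-nginx.access@custom);
    non-integration data streams get the generic type-level one (e.g. logs@custom).
    """
    if ds_dataset:
        return f"{ds_type}-{ds_dataset}@custom"
    return f"{ds_type}@custom"
```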

cc: @flash1293 @LucaWintergerst

@mdbirnstiehl
Contributor

@achyutjhunjhunwala Would the user be taken to the page to create a pipeline when they copy the name, or would there be another link for them to follow?

In this scenario, would users always be creating the custom pipeline? If they are always creating it, it would make sense to title it "Add custom ingest pipeline" and remove the "edit".

Do we want to give a little summary of what/why the user would add a custom pipeline? I'm not sure that would be obvious to the user.

We could possibly organize it like this:

Add a custom ingest pipeline to...(I'm not sure of the mechanics of it, but we could give a reason here as to why they need to create the pipeline):

  1. Copy the following pipeline name:
    logs@custom
  2. [Create a custom pipeline](link to create pipeline page) using the name you copied.

@achyutjhunjhunwala
Contributor Author

@mdbirnstiehl
When they click on that text with the copy icon, at the moment I am only copying the pipeline name.
The title is a link to the pipeline page, where they can search for the pipeline if it exists and edit it.
If not, they can create it. We are only aiding them with the name of the pipeline here, as it is important to follow this particular naming convention.

Why does the user need to modify a pipeline?

At the moment we only offer a fix-it flow for field limit issues, and only for integrations. Other issues, like the character limit being reached or malformed values, still need to be fixed manually by the user. Editing or adding a pipeline processor is one of the ways to do that.

@achyutjhunjhunwala
Contributor Author

@patpscal As you are back, can you help me with the UX for this pipeline panel? I just quickly put this together after @LucaWintergerst's comment.

@LucaWintergerst
Contributor

How hard would it be for us to check if the pipeline already exists?

GET _ingest/pipeline/<name>

if it exists, we display

[link icon] Edit ingest pipeline

if it does not exist, we do

[link icon] Create ingest pipeline
---
1. Copy the following pipeline name:
    [copy icon] logs@custom
2. [link icon] [Create a custom pipeline](link to create pipeline page) using the name you copied.

This would make for a cleaner UX, as we'd have a nicer solution in the case where the pipeline already exists.

If we can't easily check for the presence of the pipeline, I'd suggest:

[link icon] Create or edit ingest pipeline
---
1. Copy the following pipeline name:
    [copy icon] logs@custom
2. [link icon] [Create or edit the pipeline](link to create pipeline page) using the name you copied.
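The branching Luca describes could be sketched like this. Hypothetical helper in Python (the real UI would be Kibana TypeScript); it assumes a boolean result from a `GET _ingest/pipeline/<name>` existence check, with `None` meaning the check was skipped or failed.

```python
from typing import Optional

def pipeline_action_label(pipeline_exists: Optional[bool]) -> str:
    """Pick the card title based on an (assumed) GET _ingest/pipeline/<name> check.

    None means we could not, or chose not to, check whether the pipeline exists.
    """
    if pipeline_exists is None:
        return "Create or edit ingest pipeline"
    return "Edit ingest pipeline" if pipeline_exists else "Create ingest pipeline"
```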

@achyutjhunjhunwala
Contributor Author

@LucaWintergerst In favour of this issue - #183992 - I propose we go with the last option you provided, where we don't check and simply display the copy and navigate options.

The reason is that once the above ticket is implemented, we can always point to that URL, and Management will automatically take care of opening the edit or create page.

@LucaWintergerst
Contributor

Ok, that makes sense to me

@patpscal

patpscal commented Oct 17, 2024

It seems the original design did not account for the intermediate "copy" step, so it links directly to the create pipeline page, hence the external link icon. We can disregard the link for this card title and use an accordion instead for the content, as in the Increase field mapping limit card.

Image
A variant to highlight the action/link:
Image

Figma link

achyutjhunjhunwala added a commit that referenced this issue Oct 25, 2024
## Summary

Closes - #190330

This PR implements the logic to support 

- One-click increase of the field limit for field limit issues (applicable
only for integrations). For non-integrations, only text is displayed explaining
how they can do it.
- The one-click increase updates the linked custom component template as
well as the last backing index.
- If the last backing index update fails for any reason, the user is offered
an option to trigger a rollover manually.

## Demo

Not possible, too many things to display 😆

## What's Pending?

Tests

- [x] API tests
    - [x] Settings API
    - [x] Rollover API
    - [x] Apply New limit API
- [x] FTR tests
- [x] Displaying of various issues for integrations and non integrations
    - [x] Fix it Flow Good case, without Rollover
    - [x] Fix it Flow Good case, with Rollover
- [x] Manual Mitigation - Clicking on Component Template should navigate to
the proper destination based on Integration / Non-Integration
    - [x] Manual Mitigation - Ingest Pipeline
    - [x] Link for official Documentation
    
## How to set up a local environment

We will set up 2 different data streams, one with an integration and one
without. Please follow the steps in the exact order.
 
1. Start local ES and local Kibana
2. Install the Nginx integration first
3. Ingest data as per the script here -
https://gist.github.com/achyutjhunjhunwala/03ea29190c6594544f584d2f0efa71e5
4. Set the limit for the 2 datasets

```
PUT logs-synth.3-default/_settings
{
    "mapping.total_fields.limit": 36
}

// Set the limit for Nginx
PUT logs-nginx.access-default/_settings
{
    "mapping.total_fields.limit": 52
}
```

5. Now uncomment line 59 of the synthtrace script to enable the
cloud.project.id field and run the scenario again
6. Do a rollover

```
POST logs-synth.3-default/_rollover
POST logs-nginx.access-default/_rollover
```

7. Get the last backing index for both data streams

```
GET _data_stream/logs-synth.3-default/
GET _data_stream/logs-nginx.access-default
```
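To pick the last backing index out of the `GET _data_stream/<name>` response programmatically, a sketch (hypothetical helper; the response shape assumed here is the documented Elasticsearch one, where the last entry in `indices` is the current write index):

```python
from typing import Any, Dict

def last_backing_index(data_stream_response: Dict[str, Any], name: str) -> str:
    """Return the newest backing index of a data stream from a GET _data_stream response.

    Assumed response shape:
    {"data_streams": [{"name": ..., "indices": [{"index_name": ...}, ...]}]}
    """
    for ds in data_stream_response.get("data_streams", []):
        if ds["name"] == name:
            # Backing indices are listed oldest first; the last one is the write index.
            return ds["indices"][-1]["index_name"]
    raise KeyError(f"data stream not found: {name}")
```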

8. Increase the limit by 1, but only for the last backing index

```
PUT .ds-logs-synth.3-default-2024.10.10-000002/_settings
{
    "mapping.total_fields.limit": 37
}

PUT .ds-logs-nginx.access-default-2024.10.10-000002/_settings
{
    "mapping.total_fields.limit": 53
}
```

9. Run the same Synthtrace scenario again.

This setup will give you 3 fields for testing:

1. cloud.availability_zone - shows the character limit issue
2. cloud.project - shows an obsolete error which happened in the past and
no longer exists due to the field limit
3. cloud.project.id - a current field limit issue

---------

Co-authored-by: Marco Antonio Ghiani <marcoantonio.ghiani01@gmail.com>
Co-authored-by: kibanamachine <42973632+kibanamachine@users.noreply.github.com>
achyutjhunjhunwala added a commit to achyutjhunjhunwala/kibana that referenced this issue Oct 25, 2024
(cherry picked from commit 3ece950)

# Conflicts:
#	x-pack/test_serverless/functional/test_suites/observability/dataset_quality/degraded_field_flyout.ts