Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Obs AI Assistant] Attempt to resolve flaky knowledge based user instructions test #196026

Conversation

viduni94
Copy link
Contributor

@viduni94 viduni94 commented Oct 13, 2024

Closes #192222

Summary

Problem

The "when creating private and public user instructions" test has been marked as flaky and has been skipped.
Based on the error recorded in the ticket, 2 possible scenarios could be

  • Race Conditions: When multiple instructions are created asynchronously, the instructions might not be assigned to the right user or role. Data could be overwritten.
  • Data Fetching Issues: The API might return inconsistent data if the knowledge base is not properly cleared between tests, or if the instructions are not properly isolated per user.

Solution

When running the test locally, the actual output and expected outcome are the same, therefore the test passes. The flaky test runner didn't output anything meaningful either.

However, in order to resolve any missing entries, the before hook was updated to retry adding only the missing entries. Hopefully, this will help resolve the flakiness.

Checklist

@viduni94 viduni94 self-assigned this Oct 13, 2024
@viduni94 viduni94 requested a review from a team as a code owner October 13, 2024 14:44
@botelastic botelastic bot added ci:project-deploy-observability Create an Observability project Team:Obs AI Assistant Observability AI Assistant labels Oct 13, 2024
@elasticmachine
Copy link
Contributor

Pinging @elastic/obs-ai-assistant (Team:Obs AI Assistant)

@kibanamachine
Copy link
Contributor

Flaky Test Runner Stats

🎉 All tests passed! - kibana-flaky-test-suite-runner#7134

[✅] x-pack/test/observability_ai_assistant_api_integration/enterprise/config.ts: 25/25 tests passed.

see run history

@kibanamachine
Copy link
Contributor

Flaky Test Runner Stats

🎉 All tests passed! - kibana-flaky-test-suite-runner#7135

[✅] x-pack/test/observability_ai_assistant_api_integration/enterprise/config.ts: 200/200 tests passed.

see run history

@viduni94 viduni94 added release_note:fix backport:prev-minor Backport to (8.x) the previous minor version (i.e. one version back from main) labels Oct 14, 2024
@viduni94 viduni94 force-pushed the resolve-flaky-obs-ai-assistant-test-knowledge-based-user-instructions branch from fa6e12b to 7021ce1 Compare October 14, 2024 13:17
@viduni94 viduni94 changed the title [Obs AI Assistant] Attempt to resolve flaky knowledge based user instructions test (#192222) [Obs AI Assistant] Attempt to resolve flaky knowledge based user instructions test Oct 15, 2024
@viduni94 viduni94 force-pushed the resolve-flaky-obs-ai-assistant-test-knowledge-based-user-instructions branch from 7021ce1 to a0f9183 Compare October 16, 2024 22:01
@@ -88,6 +88,7 @@ export default function ApiTest({ getService }: FtrProviderContext) {
});

await Promise.all(promises);
await new Promise((resolve) => setTimeout(resolve, 500));
Copy link
Member

@dgieselaar dgieselaar Oct 17, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ideally we can avoid sleeps - can you reproduce the flakiness? usually things like this are related to Elasticsearch not having refreshed the shard yet. Can you figure out if we are waiting for a refresh in the endpoint? (Look for refresh: 'wait_for').

Copy link
Contributor Author

@viduni94 viduni94 Oct 17, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @dgieselaar
Thank you for the review.
I wasn't able to reproduce the flakiness.. I tried reproducing locally and via the flaky test runner, but all tests passed.

In the code for these tests, I don't see refresh: 'wait_for'

This is the output I get after locally running the tests:

image

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@viduni94 , Dario was referring to checking whether the actual endpoint we're using querys the knowledge base with refresh: wait_for. We can see that it is, here https://github.com/elastic/kibana/blob/main/x-pack/plugins/observability_solution/observability_ai_assistant/server/service/knowledge_base_service/index.ts#L662.

Looking at the docs , it says:

Never start multiple refresh=wait_for requests in a row. Instead batch them into a single bulk request with refresh=wait_for and Elasticsearch will start them all in parallel and return only when they have all finished.

Perhaps this could be causing some flakiness. Bulk indexing would be more reliable. I was thinking we can instead use POST /internal/observability_ai_assistant/kb/entries/import when we're wanting to index multiple documents for testing, but it looks like that this not actually bulk indexing but creating multiple requests with concurrency control.

Perhaps for now we can use the retry service but think about improving the API to bulk requests. CC @dgieselaar

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you @neptunian
I've updated the PR with the suggested changes.

Copy link
Contributor

@neptunian neptunian Oct 26, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for the misunderstanding but when I said to perhaps try a retry, I meant to retry fetching them if one of the entries was missing, not to try adding the entries again. The idea behind that was because they are not yet available due to the multiple requests potentially. Since we didn't get an error when indexing them to begin with, I'm not sure how it would help to add them again.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you @neptunian
I've updated the PR.

@viduni94 viduni94 force-pushed the resolve-flaky-obs-ai-assistant-test-knowledge-based-user-instructions branch 3 times, most recently from b3895c4 to 941dddd Compare October 21, 2024 11:45
@viduni94 viduni94 requested a review from dgieselaar October 21, 2024 20:44
@viduni94 viduni94 force-pushed the resolve-flaky-obs-ai-assistant-test-knowledge-based-user-instructions branch 2 times, most recently from 3efcc81 to 479ebea Compare October 25, 2024 20:01
@viduni94 viduni94 requested a review from neptunian October 26, 2024 01:08
@viduni94 viduni94 force-pushed the resolve-flaky-obs-ai-assistant-test-knowledge-based-user-instructions branch from f513303 to 7e6094f Compare October 30, 2024 12:36
Copy link
Member

@sorenlouv sorenlouv left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Retries are always a bandaid to another problem but if they do the job realiably I'm good with this approach for now


const instructions = res.body.userInstructions;

const sortByDocId = (data: any) => sortBy(data, 'doc_id');
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we avoid any here?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah I noticed this is not your change, but imho we can very easily not use any here, so hopefully that is a small change

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure @dgieselaar
I updated it.

@@ -91,59 +92,64 @@ export default function ApiTest({ getService }: FtrProviderContext) {
});

it('"editor" can retrieve their own private instructions and the public instruction', async () => {
const res = await observabilityAIAssistantAPIClient.editorUser({
endpoint: 'GET /internal/observability_ai_assistant/kb/user_instructions',
await retry.try(async () => {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we remove the retry call here and re-run the flaky tests? If we can do without it, that would be my preference, otherwise we'll end up chasing flakiness everywhere (there is already a retry at the test suite level, for instance).

Copy link
Contributor Author

@viduni94 viduni94 Oct 31, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the review @dgieselaar

Apologies if this is a dumb question.
How do I re-run the flaky tests? - do you mean to use the flaky test runner?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll send you a link!

@viduni94 viduni94 requested a review from dgieselaar October 31, 2024 16:41
@elasticmachine
Copy link
Contributor

elasticmachine commented Oct 31, 2024

💚 Build Succeeded

  • Buildkite Build
  • Commit: 833db6d
  • Kibana Serverless Image: docker.elastic.co/kibana-ci/kibana-serverless:pr-196026-833db6d25879

Metrics [docs]

✅ unchanged

History

cc @viduni94

@viduni94 viduni94 merged commit a7a09f7 into elastic:main Oct 31, 2024
26 checks passed
@kibanamachine
Copy link
Contributor

Starting backport for target branches: 8.x

https://github.com/elastic/kibana/actions/runs/11618429376

kibanamachine pushed a commit to kibanamachine/kibana that referenced this pull request Oct 31, 2024
…ructions test (elastic#196026)

Closes elastic#192222

## Summary

### Problem
The "when creating private and public user instructions" test has been
marked as flaky and has been skipped.
Based on the error recorded in the ticket, 2 possible scenarios could be

- **Race Conditions**: When multiple instructions are created
asynchronously, the instructions might not be assigned to the right user
or role. Data could be overwritten.
- **Data Fetching Issues**: The API might return inconsistent data if
the knowledge base is not properly cleared between tests, or if the
instructions are not properly isolated per user.

### Solution
When running the test locally, the actual output and expected outcome
are the same, therefore the test passes. The flaky test runner didn't
output anything meaningful either.

However, in order to resolve any missing entries, the before hook was
updated to retry adding only the missing entries. Hopefully, this will
help resolve the flakiness.

### Checklist

- [x] [Unit or functional
tests](https://www.elastic.co/guide/en/kibana/master/development-tests.html)
were updated to match the most common scenarios
- [x] [Flaky Test
Runner](https://ci-stats.kibana.dev/trigger_flaky_test_runner/1) was
used on any tests changed

(cherry picked from commit a7a09f7)
@kibanamachine
Copy link
Contributor

💚 All backports created successfully

Status Branch Result
8.x

Note: Successful backport PRs will be merged automatically after passing CI.

Questions ?

Please refer to the Backport tool documentation

kibanamachine added a commit that referenced this pull request Oct 31, 2024
…r instructions test (#196026) (#198612)

# Backport

This will backport the following commits from `main` to `8.x`:
- [[Obs AI Assistant] Attempt to resolve flaky knowledge based user
instructions test
(#196026)](#196026)

<!--- Backport version: 9.4.3 -->

### Questions ?
Please refer to the [Backport tool
documentation](https://github.com/sqren/backport)

<!--BACKPORT [{"author":{"name":"Viduni
Wickramarachchi","email":"viduni.wickramarachchi@elastic.co"},"sourceCommit":{"committedDate":"2024-10-31T19:17:08Z","message":"[Obs
AI Assistant] Attempt to resolve flaky knowledge based user instructions
test (#196026)\n\nCloses
https://github.com/elastic/kibana/issues/192222\r\n\r\n##
Summary\r\n\r\n### Problem\r\nThe \"when creating private and public
user instructions\" test has been\r\nmarked as flaky and has been
skipped.\r\nBased on the error recorded in the ticket, 2 possible
scenarios could be\r\n\r\n- **Race Conditions**: When multiple
instructions are created\r\nasynchronously, the instructions might not
be assigned to the right user\r\nor role. Data could be
overwritten.\r\n- **Data Fetching Issues**: The API might return
inconsistent data if\r\nthe knowledge base is not properly cleared
between tests, or if the\r\ninstructions are not properly isolated per
user.\r\n\r\n### Solution\r\nWhen running the test locally, the actual
output and expected outcome\r\nare the same, therefore the test passes.
The flaky test runner didn't\r\noutput anything meaningful
either.\r\n\r\nHowever, in order to resolve any missing entries, the
before hook was\r\nupdated to retry adding only the missing entries.
Hopefully, this will\r\nhelp resolve the flakiness.\r\n\r\n###
Checklist\r\n\r\n- [x] [Unit or
functional\r\ntests](https://www.elastic.co/guide/en/kibana/master/development-tests.html)\r\nwere
updated to match the most common scenarios\r\n- [x] [Flaky
Test\r\nRunner](https://ci-stats.kibana.dev/trigger_flaky_test_runner/1)
was\r\nused on any tests
changed","sha":"a7a09f798f8ab68373ee2e15dec01ef917ff4e07","branchLabelMapping":{"^v9.0.0$":"main","^v8.17.0$":"8.x","^v(\\d+).(\\d+).\\d+$":"$1.$2"}},"sourcePullRequest":{"labels":["release_note:fix","v9.0.0","backport:prev-minor","Team:Obs
AI Assistant","ci:project-deploy-observability"],"title":"[Obs AI
Assistant] Attempt to resolve flaky knowledge based user instructions
test","number":196026,"url":"https://github.com/elastic/kibana/pull/196026","mergeCommit":{"message":"[Obs
AI Assistant] Attempt to resolve flaky knowledge based user instructions
test (#196026)\n\nCloses
https://github.com/elastic/kibana/issues/192222\r\n\r\n##
Summary\r\n\r\n### Problem\r\nThe \"when creating private and public
user instructions\" test has been\r\nmarked as flaky and has been
skipped.\r\nBased on the error recorded in the ticket, 2 possible
scenarios could be\r\n\r\n- **Race Conditions**: When multiple
instructions are created\r\nasynchronously, the instructions might not
be assigned to the right user\r\nor role. Data could be
overwritten.\r\n- **Data Fetching Issues**: The API might return
inconsistent data if\r\nthe knowledge base is not properly cleared
between tests, or if the\r\ninstructions are not properly isolated per
user.\r\n\r\n### Solution\r\nWhen running the test locally, the actual
output and expected outcome\r\nare the same, therefore the test passes.
The flaky test runner didn't\r\noutput anything meaningful
either.\r\n\r\nHowever, in order to resolve any missing entries, the
before hook was\r\nupdated to retry adding only the missing entries.
Hopefully, this will\r\nhelp resolve the flakiness.\r\n\r\n###
Checklist\r\n\r\n- [x] [Unit or
functional\r\ntests](https://www.elastic.co/guide/en/kibana/master/development-tests.html)\r\nwere
updated to match the most common scenarios\r\n- [x] [Flaky
Test\r\nRunner](https://ci-stats.kibana.dev/trigger_flaky_test_runner/1)
was\r\nused on any tests
changed","sha":"a7a09f798f8ab68373ee2e15dec01ef917ff4e07"}},"sourceBranch":"main","suggestedTargetBranches":[],"targetPullRequestStates":[{"branch":"main","label":"v9.0.0","branchLabelMappingKey":"^v9.0.0$","isSourceBranch":true,"state":"MERGED","url":"https://github.com/elastic/kibana/pull/196026","number":196026,"mergeCommit":{"message":"[Obs
AI Assistant] Attempt to resolve flaky knowledge based user instructions
test (#196026)\n\nCloses
https://github.com/elastic/kibana/issues/192222\r\n\r\n##
Summary\r\n\r\n### Problem\r\nThe \"when creating private and public
user instructions\" test has been\r\nmarked as flaky and has been
skipped.\r\nBased on the error recorded in the ticket, 2 possible
scenarios could be\r\n\r\n- **Race Conditions**: When multiple
instructions are created\r\nasynchronously, the instructions might not
be assigned to the right user\r\nor role. Data could be
overwritten.\r\n- **Data Fetching Issues**: The API might return
inconsistent data if\r\nthe knowledge base is not properly cleared
between tests, or if the\r\ninstructions are not properly isolated per
user.\r\n\r\n### Solution\r\nWhen running the test locally, the actual
output and expected outcome\r\nare the same, therefore the test passes.
The flaky test runner didn't\r\noutput anything meaningful
either.\r\n\r\nHowever, in order to resolve any missing entries, the
before hook was\r\nupdated to retry adding only the missing entries.
Hopefully, this will\r\nhelp resolve the flakiness.\r\n\r\n###
Checklist\r\n\r\n- [x] [Unit or
functional\r\ntests](https://www.elastic.co/guide/en/kibana/master/development-tests.html)\r\nwere
updated to match the most common scenarios\r\n- [x] [Flaky
Test\r\nRunner](https://ci-stats.kibana.dev/trigger_flaky_test_runner/1)
was\r\nused on any tests
changed","sha":"a7a09f798f8ab68373ee2e15dec01ef917ff4e07"}}]}]
BACKPORT-->

Co-authored-by: Viduni Wickramarachchi <viduni.wickramarachchi@elastic.co>
nreese pushed a commit to nreese/kibana that referenced this pull request Nov 1, 2024
…ructions test (elastic#196026)

Closes elastic#192222

## Summary

### Problem
The "when creating private and public user instructions" test has been
marked as flaky and has been skipped.
Based on the error recorded in the ticket, 2 possible scenarios could be

- **Race Conditions**: When multiple instructions are created
asynchronously, the instructions might not be assigned to the right user
or role. Data could be overwritten.
- **Data Fetching Issues**: The API might return inconsistent data if
the knowledge base is not properly cleared between tests, or if the
instructions are not properly isolated per user.

### Solution
When running the test locally, the actual output and expected outcome
are the same, therefore the test passes. The flaky test runner didn't
output anything meaningful either.

However, in order to resolve any missing entries, the before hook was
updated to retry adding only the missing entries. Hopefully, this will
help resolve the flakiness.

### Checklist

- [x] [Unit or functional
tests](https://www.elastic.co/guide/en/kibana/master/development-tests.html)
were updated to match the most common scenarios
- [x] [Flaky Test
Runner](https://ci-stats.kibana.dev/trigger_flaky_test_runner/1) was
used on any tests changed
viduni94 added a commit that referenced this pull request Nov 14, 2024
…y tests (#200022)

Closes #192222

## Summary

### Problem
The test appears to be flaky, potentially because the entries are not
available at the time of retrieval. This cannot be reproduced locally or
via the flaky test runner. (more details
[here](#196026 (comment)))

### Solution
Add a retry when fetching the instructions and check whether the number
of instructions returned by the API endpoint is the same number of
instructions expected.

### Checklist

- [x] [Flaky Test
Runner](https://ci-stats.kibana.dev/trigger_flaky_test_runner/1) was
used on any tests changed
kibanamachine pushed a commit to kibanamachine/kibana that referenced this pull request Nov 14, 2024
…y tests (elastic#200022)

Closes elastic#192222

## Summary

### Problem
The test appears to be flaky, potentially because the entries are not
available at the time of retrieval. This cannot be reproduced locally or
via the flaky test runner. (more details
[here](elastic#196026 (comment)))

### Solution
Add a retry when fetching the instructions and check whether the number
of instructions returned by the API endpoint is the same number of
instructions expected.

### Checklist

- [x] [Flaky Test
Runner](https://ci-stats.kibana.dev/trigger_flaky_test_runner/1) was
used on any tests changed

(cherry picked from commit 53c05a3)
wayneseymour pushed a commit to wayneseymour/kibana that referenced this pull request Nov 18, 2024
…y tests (elastic#200022)

Closes elastic#192222

## Summary

### Problem
The test appears to be flaky, potentially because the entries are not
available at the time of retrieval. This cannot be reproduced locally or
via the flaky test runner. (more details
[here](elastic#196026 (comment)))

### Solution
Add a retry when fetching the instructions and check whether the number
of instructions returned by the API endpoint is the same number of
instructions expected.

### Checklist

- [x] [Flaky Test
Runner](https://ci-stats.kibana.dev/trigger_flaky_test_runner/1) was
used on any tests changed
CAWilson94 pushed a commit to CAWilson94/kibana that referenced this pull request Nov 18, 2024
…y tests (elastic#200022)

Closes elastic#192222

## Summary

### Problem
The test appears to be flaky, potentially because the entries are not
available at the time of retrieval. This cannot be reproduced locally or
via the flaky test runner. (more details
[here](elastic#196026 (comment)))

### Solution
Add a retry when fetching the instructions and check whether the number
of instructions returned by the API endpoint is the same number of
instructions expected.

### Checklist

- [x] [Flaky Test
Runner](https://ci-stats.kibana.dev/trigger_flaky_test_runner/1) was
used on any tests changed
CAWilson94 pushed a commit to CAWilson94/kibana that referenced this pull request Nov 18, 2024
…y tests (elastic#200022)

Closes elastic#192222

## Summary

### Problem
The test appears to be flaky, potentially because the entries are not
available at the time of retrieval. This cannot be reproduced locally or
via the flaky test runner. (more details
[here](elastic#196026 (comment)))

### Solution
Add a retry when fetching the instructions and check whether the number
of instructions returned by the API endpoint is the same number of
instructions expected.

### Checklist

- [x] [Flaky Test
Runner](https://ci-stats.kibana.dev/trigger_flaky_test_runner/1) was
used on any tests changed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment