-
Notifications
You must be signed in to change notification settings - Fork 8.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Obs AI Assistant] Attempt to resolve flaky knowledge based user instructions test #196026
[Obs AI Assistant] Attempt to resolve flaky knowledge based user instructions test #196026
Conversation
Pinging @elastic/obs-ai-assistant (Team:Obs AI Assistant) |
Flaky Test Runner Stats🎉 All tests passed! - kibana-flaky-test-suite-runner#7134[✅] x-pack/test/observability_ai_assistant_api_integration/enterprise/config.ts: 25/25 tests passed. |
Flaky Test Runner Stats🎉 All tests passed! - kibana-flaky-test-suite-runner#7135[✅] x-pack/test/observability_ai_assistant_api_integration/enterprise/config.ts: 200/200 tests passed. |
fa6e12b
to
7021ce1
Compare
7021ce1
to
a0f9183
Compare
@@ -88,6 +88,7 @@ export default function ApiTest({ getService }: FtrProviderContext) { | |||
}); | |||
|
|||
await Promise.all(promises); | |||
await new Promise((resolve) => setTimeout(resolve, 500)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ideally we can avoid sleeps - can you reproduce the flakiness? usually things like this are related to Elasticsearch not having refreshed the shard yet. Can you figure out if we are waiting for a refresh in the endpoint? (Look for refresh: 'wait_for'
).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @dgieselaar
Thank you for the review.
I wasn't able to reproduce the flakiness.. I tried reproducing locally and via the flaky test runner, but all tests passed.
In the code for these tests, I don't see refresh: 'wait_for'
This is the output I get after locally running the tests:
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@viduni94 , Dario was referring to checking whether the actual endpoint we're using querys the knowledge base with refresh: wait_for
. We can see that it is, here https://github.com/elastic/kibana/blob/main/x-pack/plugins/observability_solution/observability_ai_assistant/server/service/knowledge_base_service/index.ts#L662.
Looking at the docs , it says:
Never start multiple refresh=wait_for requests in a row. Instead batch them into a single bulk request with refresh=wait_for and Elasticsearch will start them all in parallel and return only when they have all finished.
Perhaps this could be causing some flakiness. Bulk indexing would be more reliable. I was thinking we can instead use POST /internal/observability_ai_assistant/kb/entries/import
when we're wanting to index multiple documents for testing, but it looks like that this not actually bulk indexing but creating multiple requests with concurrency control.
Perhaps for now we can use the retry
service but think about improving the API to bulk requests. CC @dgieselaar
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you @neptunian
I've updated the PR with the suggested changes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry for the misunderstanding but when I said to perhaps try a retry, I meant to retry fetching them if one of the entries was missing, not to try adding the entries again. The idea behind that was because they are not yet available due to the multiple requests potentially. Since we didn't get an error when indexing them to begin with, I'm not sure how it would help to add them again.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you @neptunian
I've updated the PR.
b3895c4
to
941dddd
Compare
3efcc81
to
479ebea
Compare
…the tests to avoid any data inconsistency (elastic#192222)
f513303
to
7e6094f
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Retries are always a bandaid to another problem but if they do the job realiably I'm good with this approach for now
|
||
const instructions = res.body.userInstructions; | ||
|
||
const sortByDocId = (data: any) => sortBy(data, 'doc_id'); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can we avoid any
here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ah I noticed this is not your change, but imho we can very easily not use any
here, so hopefully that is a small change
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure @dgieselaar
I updated it.
@@ -91,59 +92,64 @@ export default function ApiTest({ getService }: FtrProviderContext) { | |||
}); | |||
|
|||
it('"editor" can retrieve their own private instructions and the public instruction', async () => { | |||
const res = await observabilityAIAssistantAPIClient.editorUser({ | |||
endpoint: 'GET /internal/observability_ai_assistant/kb/user_instructions', | |||
await retry.try(async () => { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we remove the retry
call here and re-run the flaky tests? If we can do without it, that would be my preference, otherwise we'll end up chasing flakiness everywhere (there is already a retry at the test suite level, for instance).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the review @dgieselaar
Apologies if this is a dumb question.
How do I re-run the flaky tests? - do you mean to use the flaky test runner?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll send you a link!
💚 Build Succeeded
Metrics [docs]
History
cc @viduni94 |
Starting backport for target branches: 8.x https://github.com/elastic/kibana/actions/runs/11618429376 |
…ructions test (elastic#196026) Closes elastic#192222 ## Summary ### Problem The "when creating private and public user instructions" test has been marked as flaky and has been skipped. Based on the error recorded in the ticket, 2 possible scenarios could be - **Race Conditions**: When multiple instructions are created asynchronously, the instructions might not be assigned to the right user or role. Data could be overwritten. - **Data Fetching Issues**: The API might return inconsistent data if the knowledge base is not properly cleared between tests, or if the instructions are not properly isolated per user. ### Solution When running the test locally, the actual output and expected outcome are the same, therefore the test passes. The flaky test runner didn't output anything meaningful either. However, in order to resolve any missing entries, the before hook was updated to retry adding only the missing entries. Hopefully, this will help resolve the flakiness. ### Checklist - [x] [Unit or functional tests](https://www.elastic.co/guide/en/kibana/master/development-tests.html) were updated to match the most common scenarios - [x] [Flaky Test Runner](https://ci-stats.kibana.dev/trigger_flaky_test_runner/1) was used on any tests changed (cherry picked from commit a7a09f7)
💚 All backports created successfully
Note: Successful backport PRs will be merged automatically after passing CI. Questions ?Please refer to the Backport tool documentation |
…r instructions test (#196026) (#198612) # Backport This will backport the following commits from `main` to `8.x`: - [[Obs AI Assistant] Attempt to resolve flaky knowledge based user instructions test (#196026)](#196026) <!--- Backport version: 9.4.3 --> ### Questions ? Please refer to the [Backport tool documentation](https://github.com/sqren/backport) <!--BACKPORT [{"author":{"name":"Viduni Wickramarachchi","email":"viduni.wickramarachchi@elastic.co"},"sourceCommit":{"committedDate":"2024-10-31T19:17:08Z","message":"[Obs AI Assistant] Attempt to resolve flaky knowledge based user instructions test (#196026)\n\nCloses https://github.com/elastic/kibana/issues/192222\r\n\r\n## Summary\r\n\r\n### Problem\r\nThe \"when creating private and public user instructions\" test has been\r\nmarked as flaky and has been skipped.\r\nBased on the error recorded in the ticket, 2 possible scenarios could be\r\n\r\n- **Race Conditions**: When multiple instructions are created\r\nasynchronously, the instructions might not be assigned to the right user\r\nor role. Data could be overwritten.\r\n- **Data Fetching Issues**: The API might return inconsistent data if\r\nthe knowledge base is not properly cleared between tests, or if the\r\ninstructions are not properly isolated per user.\r\n\r\n### Solution\r\nWhen running the test locally, the actual output and expected outcome\r\nare the same, therefore the test passes. The flaky test runner didn't\r\noutput anything meaningful either.\r\n\r\nHowever, in order to resolve any missing entries, the before hook was\r\nupdated to retry adding only the missing entries. Hopefully, this will\r\nhelp resolve the flakiness.\r\n\r\n### Checklist\r\n\r\n- [x] [Unit or functional\r\ntests](https://www.elastic.co/guide/en/kibana/master/development-tests.html)\r\nwere updated to match the most common scenarios\r\n- [x] [Flaky Test\r\nRunner](https://ci-stats.kibana.dev/trigger_flaky_test_runner/1) was\r\nused on any tests changed","sha":"a7a09f798f8ab68373ee2e15dec01ef917ff4e07","branchLabelMapping":{"^v9.0.0$":"main","^v8.17.0$":"8.x","^v(\\d+).(\\d+).\\d+$":"$1.$2"}},"sourcePullRequest":{"labels":["release_note:fix","v9.0.0","backport:prev-minor","Team:Obs AI Assistant","ci:project-deploy-observability"],"title":"[Obs AI Assistant] Attempt to resolve flaky knowledge based user instructions test","number":196026,"url":"https://github.com/elastic/kibana/pull/196026","mergeCommit":{"message":"[Obs AI Assistant] Attempt to resolve flaky knowledge based user instructions test (#196026)\n\nCloses https://github.com/elastic/kibana/issues/192222\r\n\r\n## Summary\r\n\r\n### Problem\r\nThe \"when creating private and public user instructions\" test has been\r\nmarked as flaky and has been skipped.\r\nBased on the error recorded in the ticket, 2 possible scenarios could be\r\n\r\n- **Race Conditions**: When multiple instructions are created\r\nasynchronously, the instructions might not be assigned to the right user\r\nor role. Data could be overwritten.\r\n- **Data Fetching Issues**: The API might return inconsistent data if\r\nthe knowledge base is not properly cleared between tests, or if the\r\ninstructions are not properly isolated per user.\r\n\r\n### Solution\r\nWhen running the test locally, the actual output and expected outcome\r\nare the same, therefore the test passes. The flaky test runner didn't\r\noutput anything meaningful either.\r\n\r\nHowever, in order to resolve any missing entries, the before hook was\r\nupdated to retry adding only the missing entries. Hopefully, this will\r\nhelp resolve the flakiness.\r\n\r\n### Checklist\r\n\r\n- [x] [Unit or functional\r\ntests](https://www.elastic.co/guide/en/kibana/master/development-tests.html)\r\nwere updated to match the most common scenarios\r\n- [x] [Flaky Test\r\nRunner](https://ci-stats.kibana.dev/trigger_flaky_test_runner/1) was\r\nused on any tests changed","sha":"a7a09f798f8ab68373ee2e15dec01ef917ff4e07"}},"sourceBranch":"main","suggestedTargetBranches":[],"targetPullRequestStates":[{"branch":"main","label":"v9.0.0","branchLabelMappingKey":"^v9.0.0$","isSourceBranch":true,"state":"MERGED","url":"https://github.com/elastic/kibana/pull/196026","number":196026,"mergeCommit":{"message":"[Obs AI Assistant] Attempt to resolve flaky knowledge based user instructions test (#196026)\n\nCloses https://github.com/elastic/kibana/issues/192222\r\n\r\n## Summary\r\n\r\n### Problem\r\nThe \"when creating private and public user instructions\" test has been\r\nmarked as flaky and has been skipped.\r\nBased on the error recorded in the ticket, 2 possible scenarios could be\r\n\r\n- **Race Conditions**: When multiple instructions are created\r\nasynchronously, the instructions might not be assigned to the right user\r\nor role. Data could be overwritten.\r\n- **Data Fetching Issues**: The API might return inconsistent data if\r\nthe knowledge base is not properly cleared between tests, or if the\r\ninstructions are not properly isolated per user.\r\n\r\n### Solution\r\nWhen running the test locally, the actual output and expected outcome\r\nare the same, therefore the test passes. The flaky test runner didn't\r\noutput anything meaningful either.\r\n\r\nHowever, in order to resolve any missing entries, the before hook was\r\nupdated to retry adding only the missing entries. Hopefully, this will\r\nhelp resolve the flakiness.\r\n\r\n### Checklist\r\n\r\n- [x] [Unit or functional\r\ntests](https://www.elastic.co/guide/en/kibana/master/development-tests.html)\r\nwere updated to match the most common scenarios\r\n- [x] [Flaky Test\r\nRunner](https://ci-stats.kibana.dev/trigger_flaky_test_runner/1) was\r\nused on any tests changed","sha":"a7a09f798f8ab68373ee2e15dec01ef917ff4e07"}}]}] BACKPORT--> Co-authored-by: Viduni Wickramarachchi <viduni.wickramarachchi@elastic.co>
…ructions test (elastic#196026) Closes elastic#192222 ## Summary ### Problem The "when creating private and public user instructions" test has been marked as flaky and has been skipped. Based on the error recorded in the ticket, 2 possible scenarios could be - **Race Conditions**: When multiple instructions are created asynchronously, the instructions might not be assigned to the right user or role. Data could be overwritten. - **Data Fetching Issues**: The API might return inconsistent data if the knowledge base is not properly cleared between tests, or if the instructions are not properly isolated per user. ### Solution When running the test locally, the actual output and expected outcome are the same, therefore the test passes. The flaky test runner didn't output anything meaningful either. However, in order to resolve any missing entries, the before hook was updated to retry adding only the missing entries. Hopefully, this will help resolve the flakiness. ### Checklist - [x] [Unit or functional tests](https://www.elastic.co/guide/en/kibana/master/development-tests.html) were updated to match the most common scenarios - [x] [Flaky Test Runner](https://ci-stats.kibana.dev/trigger_flaky_test_runner/1) was used on any tests changed
…y tests (#200022) Closes #192222 ## Summary ### Problem The test appears to be flaky, potentially because the entries are not available at the time of retrieval. This cannot be reproduced locally or via the flaky test runner. (more details [here](#196026 (comment))) ### Solution Add a retry when fetching the instructions and check whether the number of instructions returned by the API endpoint is the same number of instructions expected. ### Checklist - [x] [Flaky Test Runner](https://ci-stats.kibana.dev/trigger_flaky_test_runner/1) was used on any tests changed
…y tests (elastic#200022) Closes elastic#192222 ## Summary ### Problem The test appears to be flaky, potentially because the entries are not available at the time of retrieval. This cannot be reproduced locally or via the flaky test runner. (more details [here](elastic#196026 (comment))) ### Solution Add a retry when fetching the instructions and check whether the number of instructions returned by the API endpoint is the same number of instructions expected. ### Checklist - [x] [Flaky Test Runner](https://ci-stats.kibana.dev/trigger_flaky_test_runner/1) was used on any tests changed (cherry picked from commit 53c05a3)
…y tests (elastic#200022) Closes elastic#192222 ## Summary ### Problem The test appears to be flaky, potentially because the entries are not available at the time of retrieval. This cannot be reproduced locally or via the flaky test runner. (more details [here](elastic#196026 (comment))) ### Solution Add a retry when fetching the instructions and check whether the number of instructions returned by the API endpoint is the same number of instructions expected. ### Checklist - [x] [Flaky Test Runner](https://ci-stats.kibana.dev/trigger_flaky_test_runner/1) was used on any tests changed
…y tests (elastic#200022) Closes elastic#192222 ## Summary ### Problem The test appears to be flaky, potentially because the entries are not available at the time of retrieval. This cannot be reproduced locally or via the flaky test runner. (more details [here](elastic#196026 (comment))) ### Solution Add a retry when fetching the instructions and check whether the number of instructions returned by the API endpoint is the same number of instructions expected. ### Checklist - [x] [Flaky Test Runner](https://ci-stats.kibana.dev/trigger_flaky_test_runner/1) was used on any tests changed
…y tests (elastic#200022) Closes elastic#192222 ## Summary ### Problem The test appears to be flaky, potentially because the entries are not available at the time of retrieval. This cannot be reproduced locally or via the flaky test runner. (more details [here](elastic#196026 (comment))) ### Solution Add a retry when fetching the instructions and check whether the number of instructions returned by the API endpoint is the same number of instructions expected. ### Checklist - [x] [Flaky Test Runner](https://ci-stats.kibana.dev/trigger_flaky_test_runner/1) was used on any tests changed
Closes #192222
Summary
Problem
The "when creating private and public user instructions" test has been marked as flaky and has been skipped.
Based on the error recorded in the ticket, 2 possible scenarios could be
Solution
When running the test locally, the actual output and expected outcome are the same, therefore the test passes. The flaky test runner didn't output anything meaningful either.
However, in order to resolve any missing entries, the before hook was updated to retry adding only the missing entries. Hopefully, this will help resolve the flakiness.
Checklist