-
Notifications
You must be signed in to change notification settings - Fork 8.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Failing test: Detection Engine API Integration Tests - Serverless - Rule Execution Logic.x-pack/test/security_solution_api_integration/test_suites/detections_response/default_license/rule_execution_logic/execution_logic/machine_learning·ts - Rule execution logic API Execution logic @ess @serverless Machine learning type rules "before all" hook for "should create 1 alert from ML rule when record meets anomaly_threshold" #171426
Comments
New failure: CI Build - main |
Pinging @elastic/security-detections-response (Team:Detections and Resp) |
New failure: CI Build - main |
Skipped. main: 27d2fb9 |
Pinging @elastic/security-solution (Team: SecuritySolution) |
Confirmed passing locally - flake not bug. |
The error in these builds is pretty useful: it looks like the
I didn't see any errors leading up to that failure indicating that the archive mappings (which explicitly include |
These were skipped previously due to es_archiver failing on a mapping error (CI flake), but upon unskipping it was discovered that there were a few mistakes in these tests (as they had been modified while skipped). The previous commit addressed an error related to error classification, this similarly fixes an assertion that would never have been true: the anomaly here only has a known `user.name`, and their `host.name` doesn't have a corresponding criticality record. Closes elastic#171426.
…ts (#181918) ## Summary These tests were skipped previously due to `es_archiver` [failing](#171426) on a mapping error, but upon unskipping it was discovered that there were a few mistakes in these tests, as they had been modified while skipped. There are three main changes here: * Fixes an incorrect assertion related to error classification * Fixes an incorrect assertion related to asset criticality enrichment * Adds additional `afterEach` hooks for housekeeping of generated data Closes #171426
FYI I'm continuing to investigate these failures on this PR |
## Summary The full chronicle of this endeavor can be found [here](#182183), but [this comment](#182183 (comment)) summarizes the identified issue: > I [finally found](https://buildkite.com/elastic/kibana-flaky-test-suite-runner/builds/6516#01909dde-a3e8-4e47-b255-b1ff7cac8f8d/6-2368) the cause of these failures in the response to our "setup modules" request to ML. Attaching here for posterity: > > <details> > <summary>Setup Modules Failure Response</summary> > > ```json > { > "jobs": [ > { "id": "v3_linux_anomalous_network_port_activity", "success": true }, > { > "id": "v3_linux_anomalous_network_activity", > "success": false, > "error": { > "error": { > "root_cause": [ > { > "type": "no_shard_available_action_exception", > "reason": "[ftr][127.0.0.1:9300][indices:data/read/search[phase/query]]" > } > ], > "type": "search_phase_execution_exception", > "reason": "all shards failed", > "phase": "query", > "grouped": true, > "failed_shards": [ > { > "shard": 0, > "index": ".ml-anomalies-custom-v3_linux_network_configuration_discovery", > "node": "dKzpvp06ScO0OxqHilETEA", > "reason": { > "type": "no_shard_available_action_exception", > "reason": "[ftr][127.0.0.1:9300][indices:data/read/search[phase/query]]" > } > } > ] > }, > "status": 503 > } > } > ], > "datafeeds": [ > { > "id": "datafeed-v3_linux_anomalous_network_port_activity", > "success": true, > "started": false, > "awaitingMlNodeAllocation": false > }, > { > "id": "datafeed-v3_linux_anomalous_network_activity", > "success": false, > "started": false, > "awaitingMlNodeAllocation": false, > "error": { > "error": { > "root_cause": [ > { > "type": "resource_not_found_exception", > "reason": "No known job with id 'v3_linux_anomalous_network_activity'" > } > ], > "type": "resource_not_found_exception", > "reason": "No known job with id 'v3_linux_anomalous_network_activity'" > }, > "status": 404 > } > } > ], > "kibana": {} > } > > ``` > </details> This branch, then, fixes said issue by (relatively simply) retrying the failed API call until it succeeds. ### Related Issues Addresses: - #171426 - #187478 - #187614 - #182009 - #171426 ### Checklist - [x] [Unit or functional tests](https://www.elastic.co/guide/en/kibana/master/development-tests.html) were updated or added to match the most common scenarios - [x] [Flaky Test Runner](https://ci-stats.kibana.dev/trigger_flaky_test_runner/1) was used on any tests changed - [x] [ESS Rule Execution FTR x 200](https://buildkite.com/elastic/kibana-flaky-test-suite-runner/builds/6528) - [x] [Serverless Rule Execution FTR x 200](https://buildkite.com/elastic/kibana-flaky-test-suite-runner/builds/6529) ### For maintainers - [x] This was checked for breaking API changes and was [labeled appropriately](https://www.elastic.co/guide/en/kibana/master/contributing.html#kibana-release-notes-process)
Closed by #188155. |
## Summary The full chronicle of this endeavor can be found [here](elastic#182183), but [this comment](elastic#182183 (comment)) summarizes the identified issue: > I [finally found](https://buildkite.com/elastic/kibana-flaky-test-suite-runner/builds/6516#01909dde-a3e8-4e47-b255-b1ff7cac8f8d/6-2368) the cause of these failures in the response to our "setup modules" request to ML. Attaching here for posterity: > > <details> > <summary>Setup Modules Failure Response</summary> > > ```json > { > "jobs": [ > { "id": "v3_linux_anomalous_network_port_activity", "success": true }, > { > "id": "v3_linux_anomalous_network_activity", > "success": false, > "error": { > "error": { > "root_cause": [ > { > "type": "no_shard_available_action_exception", > "reason": "[ftr][127.0.0.1:9300][indices:data/read/search[phase/query]]" > } > ], > "type": "search_phase_execution_exception", > "reason": "all shards failed", > "phase": "query", > "grouped": true, > "failed_shards": [ > { > "shard": 0, > "index": ".ml-anomalies-custom-v3_linux_network_configuration_discovery", > "node": "dKzpvp06ScO0OxqHilETEA", > "reason": { > "type": "no_shard_available_action_exception", > "reason": "[ftr][127.0.0.1:9300][indices:data/read/search[phase/query]]" > } > } > ] > }, > "status": 503 > } > } > ], > "datafeeds": [ > { > "id": "datafeed-v3_linux_anomalous_network_port_activity", > "success": true, > "started": false, > "awaitingMlNodeAllocation": false > }, > { > "id": "datafeed-v3_linux_anomalous_network_activity", > "success": false, > "started": false, > "awaitingMlNodeAllocation": false, > "error": { > "error": { > "root_cause": [ > { > "type": "resource_not_found_exception", > "reason": "No known job with id 'v3_linux_anomalous_network_activity'" > } > ], > "type": "resource_not_found_exception", > "reason": "No known job with id 'v3_linux_anomalous_network_activity'" > }, > "status": 404 > } > } > ], > "kibana": {} > } > > ``` > </details> This branch, then, fixes said issue by (relatively simply) retrying the failed API call until it succeeds. ### Related Issues Addresses: - elastic#171426 - elastic#187478 - elastic#187614 - elastic#182009 - elastic#171426 ### Checklist - [x] [Unit or functional tests](https://www.elastic.co/guide/en/kibana/master/development-tests.html) were updated or added to match the most common scenarios - [x] [Flaky Test Runner](https://ci-stats.kibana.dev/trigger_flaky_test_runner/1) was used on any tests changed - [x] [ESS Rule Execution FTR x 200](https://buildkite.com/elastic/kibana-flaky-test-suite-runner/builds/6528) - [x] [Serverless Rule Execution FTR x 200](https://buildkite.com/elastic/kibana-flaky-test-suite-runner/builds/6529) ### For maintainers - [x] This was checked for breaking API changes and was [labeled appropriately](https://www.elastic.co/guide/en/kibana/master/contributing.html#kibana-release-notes-process) (cherry picked from commit 3df635e)
… (#188259) # Backport This will backport the following commits from `main` to `8.15`: - [[Detection Engine] Addresses Flakiness in ML FTR tests (#188155)](#188155) <!--- Backport version: 8.9.8 --> ### Questions ? Please refer to the [Backport tool documentation](https://github.com/sqren/backport) <!--BACKPORT [{"author":{"name":"Ryland Herrick","email":"ryalnd@gmail.com"},"sourceCommit":{"committedDate":"2024-07-12T19:10:25Z","message":"[Detection Engine] Addresses Flakiness in ML FTR tests (#188155)\n\n## Summary\r\n\r\nThe full chronicle of this endeavor can be found\r\n[here](#182183), but [this\r\ncomment](https://github.com/elastic/kibana/pull/182183#issuecomment-2221517519)\r\nsummarizes the identified issue:\r\n\r\n> I [finally\r\nfound](https://buildkite.com/elastic/kibana-flaky-test-suite-runner/builds/6516#01909dde-a3e8-4e47-b255-b1ff7cac8f8d/6-2368)\r\nthe cause of these failures in the response to our \"setup modules\"\r\nrequest to ML. Attaching here for posterity:\r\n>\r\n> <details>\r\n> <summary>Setup Modules Failure Response</summary>\r\n> \r\n> ```json\r\n> {\r\n> \"jobs\": [\r\n> { \"id\": \"v3_linux_anomalous_network_port_activity\", \"success\": true },\r\n> {\r\n> \"id\": \"v3_linux_anomalous_network_activity\",\r\n> \"success\": false,\r\n> \"error\": {\r\n> \"error\": {\r\n> \"root_cause\": [\r\n> {\r\n> \"type\": \"no_shard_available_action_exception\",\r\n> \"reason\":\r\n\"[ftr][127.0.0.1:9300][indices:data/read/search[phase/query]]\"\r\n> }\r\n> ],\r\n> \"type\": \"search_phase_execution_exception\",\r\n> \"reason\": \"all shards failed\",\r\n> \"phase\": \"query\",\r\n> \"grouped\": true,\r\n> \"failed_shards\": [\r\n> {\r\n> \"shard\": 0,\r\n> \"index\":\r\n\".ml-anomalies-custom-v3_linux_network_configuration_discovery\",\r\n> \"node\": \"dKzpvp06ScO0OxqHilETEA\",\r\n> \"reason\": {\r\n> \"type\": \"no_shard_available_action_exception\",\r\n> \"reason\":\r\n\"[ftr][127.0.0.1:9300][indices:data/read/search[phase/query]]\"\r\n> }\r\n> }\r\n> ]\r\n> },\r\n> \"status\": 503\r\n> }\r\n> }\r\n> ],\r\n> \"datafeeds\": [\r\n> {\r\n> \"id\": \"datafeed-v3_linux_anomalous_network_port_activity\",\r\n> \"success\": true,\r\n> \"started\": false,\r\n> \"awaitingMlNodeAllocation\": false\r\n> },\r\n> {\r\n> \"id\": \"datafeed-v3_linux_anomalous_network_activity\",\r\n> \"success\": false,\r\n> \"started\": false,\r\n> \"awaitingMlNodeAllocation\": false,\r\n> \"error\": {\r\n> \"error\": {\r\n> \"root_cause\": [\r\n> {\r\n> \"type\": \"resource_not_found_exception\",\r\n> \"reason\": \"No known job with id 'v3_linux_anomalous_network_activity'\"\r\n> }\r\n> ],\r\n> \"type\": \"resource_not_found_exception\",\r\n> \"reason\": \"No known job with id 'v3_linux_anomalous_network_activity'\"\r\n> },\r\n> \"status\": 404\r\n> }\r\n> }\r\n> ],\r\n> \"kibana\": {}\r\n> }\r\n> \r\n> ```\r\n> </details>\r\n\r\nThis branch, then, fixes said issue by (relatively simply) retrying the\r\nfailed API call until it succeeds.\r\n\r\n### Related Issues\r\nAddresses:\r\n- https://github.com/elastic/kibana/issues/171426\r\n- https://github.com/elastic/kibana/issues/187478\r\n- https://github.com/elastic/kibana/issues/187614\r\n- https://github.com/elastic/kibana/issues/182009\r\n- https://github.com/elastic/kibana/issues/171426\r\n\r\n### Checklist\r\n\r\n- [x] [Unit or functional\r\ntests](https://www.elastic.co/guide/en/kibana/master/development-tests.html)\r\nwere updated or added to match the most common scenarios\r\n- [x] [Flaky Test\r\nRunner](https://ci-stats.kibana.dev/trigger_flaky_test_runner/1) was\r\nused on any tests changed\r\n- [x] [ESS Rule Execution FTR x\r\n200](https://buildkite.com/elastic/kibana-flaky-test-suite-runner/builds/6528)\r\n- [x] [Serverless Rule Execution FTR x\r\n200](https://buildkite.com/elastic/kibana-flaky-test-suite-runner/builds/6529)\r\n\r\n\r\n### For maintainers\r\n\r\n- [x] This was checked for breaking API changes and was [labeled\r\nappropriately](https://www.elastic.co/guide/en/kibana/master/contributing.html#kibana-release-notes-process)","sha":"3df635ef4a8c86c41c91ac5f59198a9b67d1dc8b","branchLabelMapping":{"^v8.16.0$":"main","^v(\\d+).(\\d+).\\d+$":"$1.$2"}},"sourcePullRequest":{"labels":["release_note:skip","backport:skip","Feature:Detection Rules","Feature:ML Rule","Feature:Security ML Jobs","Feature:Rule Creation","Team:Detection Engine","Feature:Rule Edit","v8.16.0"],"number":188155,"url":"https://github.com/elastic/kibana/pull/188155","mergeCommit":{"message":"[Detection Engine] Addresses Flakiness in ML FTR tests (#188155)\n\n## Summary\r\n\r\nThe full chronicle of this endeavor can be found\r\n[here](#182183), but [this\r\ncomment](https://github.com/elastic/kibana/pull/182183#issuecomment-2221517519)\r\nsummarizes the identified issue:\r\n\r\n> I [finally\r\nfound](https://buildkite.com/elastic/kibana-flaky-test-suite-runner/builds/6516#01909dde-a3e8-4e47-b255-b1ff7cac8f8d/6-2368)\r\nthe cause of these failures in the response to our \"setup modules\"\r\nrequest to ML. Attaching here for posterity:\r\n>\r\n> <details>\r\n> <summary>Setup Modules Failure Response</summary>\r\n> \r\n> ```json\r\n> {\r\n> \"jobs\": [\r\n> { \"id\": \"v3_linux_anomalous_network_port_activity\", \"success\": true },\r\n> {\r\n> \"id\": \"v3_linux_anomalous_network_activity\",\r\n> \"success\": false,\r\n> \"error\": {\r\n> \"error\": {\r\n> \"root_cause\": [\r\n> {\r\n> \"type\": \"no_shard_available_action_exception\",\r\n> \"reason\":\r\n\"[ftr][127.0.0.1:9300][indices:data/read/search[phase/query]]\"\r\n> }\r\n> ],\r\n> \"type\": \"search_phase_execution_exception\",\r\n> \"reason\": \"all shards failed\",\r\n> \"phase\": \"query\",\r\n> \"grouped\": true,\r\n> \"failed_shards\": [\r\n> {\r\n> \"shard\": 0,\r\n> \"index\":\r\n\".ml-anomalies-custom-v3_linux_network_configuration_discovery\",\r\n> \"node\": \"dKzpvp06ScO0OxqHilETEA\",\r\n> \"reason\": {\r\n> \"type\": \"no_shard_available_action_exception\",\r\n> \"reason\":\r\n\"[ftr][127.0.0.1:9300][indices:data/read/search[phase/query]]\"\r\n> }\r\n> }\r\n> ]\r\n> },\r\n> \"status\": 503\r\n> }\r\n> }\r\n> ],\r\n> \"datafeeds\": [\r\n> {\r\n> \"id\": \"datafeed-v3_linux_anomalous_network_port_activity\",\r\n> \"success\": true,\r\n> \"started\": false,\r\n> \"awaitingMlNodeAllocation\": false\r\n> },\r\n> {\r\n> \"id\": \"datafeed-v3_linux_anomalous_network_activity\",\r\n> \"success\": false,\r\n> \"started\": false,\r\n> \"awaitingMlNodeAllocation\": false,\r\n> \"error\": {\r\n> \"error\": {\r\n> \"root_cause\": [\r\n> {\r\n> \"type\": \"resource_not_found_exception\",\r\n> \"reason\": \"No known job with id 'v3_linux_anomalous_network_activity'\"\r\n> }\r\n> ],\r\n> \"type\": \"resource_not_found_exception\",\r\n> \"reason\": \"No known job with id 'v3_linux_anomalous_network_activity'\"\r\n> },\r\n> \"status\": 404\r\n> }\r\n> }\r\n> ],\r\n> \"kibana\": {}\r\n> }\r\n> \r\n> ```\r\n> </details>\r\n\r\nThis branch, then, fixes said issue by (relatively simply) retrying the\r\nfailed API call until it succeeds.\r\n\r\n### Related Issues\r\nAddresses:\r\n- https://github.com/elastic/kibana/issues/171426\r\n- https://github.com/elastic/kibana/issues/187478\r\n- https://github.com/elastic/kibana/issues/187614\r\n- https://github.com/elastic/kibana/issues/182009\r\n- https://github.com/elastic/kibana/issues/171426\r\n\r\n### Checklist\r\n\r\n- [x] [Unit or functional\r\ntests](https://www.elastic.co/guide/en/kibana/master/development-tests.html)\r\nwere updated or added to match the most common scenarios\r\n- [x] [Flaky Test\r\nRunner](https://ci-stats.kibana.dev/trigger_flaky_test_runner/1) was\r\nused on any tests changed\r\n- [x] [ESS Rule Execution FTR x\r\n200](https://buildkite.com/elastic/kibana-flaky-test-suite-runner/builds/6528)\r\n- [x] [Serverless Rule Execution FTR x\r\n200](https://buildkite.com/elastic/kibana-flaky-test-suite-runner/builds/6529)\r\n\r\n\r\n### For maintainers\r\n\r\n- [x] This was checked for breaking API changes and was [labeled\r\nappropriately](https://www.elastic.co/guide/en/kibana/master/contributing.html#kibana-release-notes-process)","sha":"3df635ef4a8c86c41c91ac5f59198a9b67d1dc8b"}},"sourceBranch":"main","suggestedTargetBranches":[],"targetPullRequestStates":[{"branch":"main","label":"v8.16.0","labelRegex":"^v8.16.0$","isSourceBranch":true,"state":"MERGED","url":"https://github.com/elastic/kibana/pull/188155","number":188155,"mergeCommit":{"message":"[Detection Engine] Addresses Flakiness in ML FTR tests (#188155)\n\n## Summary\r\n\r\nThe full chronicle of this endeavor can be found\r\n[here](#182183), but [this\r\ncomment](https://github.com/elastic/kibana/pull/182183#issuecomment-2221517519)\r\nsummarizes the identified issue:\r\n\r\n> I [finally\r\nfound](https://buildkite.com/elastic/kibana-flaky-test-suite-runner/builds/6516#01909dde-a3e8-4e47-b255-b1ff7cac8f8d/6-2368)\r\nthe cause of these failures in the response to our \"setup modules\"\r\nrequest to ML. Attaching here for posterity:\r\n>\r\n> <details>\r\n> <summary>Setup Modules Failure Response</summary>\r\n> \r\n> ```json\r\n> {\r\n> \"jobs\": [\r\n> { \"id\": \"v3_linux_anomalous_network_port_activity\", \"success\": true },\r\n> {\r\n> \"id\": \"v3_linux_anomalous_network_activity\",\r\n> \"success\": false,\r\n> \"error\": {\r\n> \"error\": {\r\n> \"root_cause\": [\r\n> {\r\n> \"type\": \"no_shard_available_action_exception\",\r\n> \"reason\":\r\n\"[ftr][127.0.0.1:9300][indices:data/read/search[phase/query]]\"\r\n> }\r\n> ],\r\n> \"type\": \"search_phase_execution_exception\",\r\n> \"reason\": \"all shards failed\",\r\n> \"phase\": \"query\",\r\n> \"grouped\": true,\r\n> \"failed_shards\": [\r\n> {\r\n> \"shard\": 0,\r\n> \"index\":\r\n\".ml-anomalies-custom-v3_linux_network_configuration_discovery\",\r\n> \"node\": \"dKzpvp06ScO0OxqHilETEA\",\r\n> \"reason\": {\r\n> \"type\": \"no_shard_available_action_exception\",\r\n> \"reason\":\r\n\"[ftr][127.0.0.1:9300][indices:data/read/search[phase/query]]\"\r\n> }\r\n> }\r\n> ]\r\n> },\r\n> \"status\": 503\r\n> }\r\n> }\r\n> ],\r\n> \"datafeeds\": [\r\n> {\r\n> \"id\": \"datafeed-v3_linux_anomalous_network_port_activity\",\r\n> \"success\": true,\r\n> \"started\": false,\r\n> \"awaitingMlNodeAllocation\": false\r\n> },\r\n> {\r\n> \"id\": \"datafeed-v3_linux_anomalous_network_activity\",\r\n> \"success\": false,\r\n> \"started\": false,\r\n> \"awaitingMlNodeAllocation\": false,\r\n> \"error\": {\r\n> \"error\": {\r\n> \"root_cause\": [\r\n> {\r\n> \"type\": \"resource_not_found_exception\",\r\n> \"reason\": \"No known job with id 'v3_linux_anomalous_network_activity'\"\r\n> }\r\n> ],\r\n> \"type\": \"resource_not_found_exception\",\r\n> \"reason\": \"No known job with id 'v3_linux_anomalous_network_activity'\"\r\n> },\r\n> \"status\": 404\r\n> }\r\n> }\r\n> ],\r\n> \"kibana\": {}\r\n> }\r\n> \r\n> ```\r\n> </details>\r\n\r\nThis branch, then, fixes said issue by (relatively simply) retrying the\r\nfailed API call until it succeeds.\r\n\r\n### Related Issues\r\nAddresses:\r\n- https://github.com/elastic/kibana/issues/171426\r\n- https://github.com/elastic/kibana/issues/187478\r\n- https://github.com/elastic/kibana/issues/187614\r\n- https://github.com/elastic/kibana/issues/182009\r\n- https://github.com/elastic/kibana/issues/171426\r\n\r\n### Checklist\r\n\r\n- [x] [Unit or functional\r\ntests](https://www.elastic.co/guide/en/kibana/master/development-tests.html)\r\nwere updated or added to match the most common scenarios\r\n- [x] [Flaky Test\r\nRunner](https://ci-stats.kibana.dev/trigger_flaky_test_runner/1) was\r\nused on any tests changed\r\n- [x] [ESS Rule Execution FTR x\r\n200](https://buildkite.com/elastic/kibana-flaky-test-suite-runner/builds/6528)\r\n- [x] [Serverless Rule Execution FTR x\r\n200](https://buildkite.com/elastic/kibana-flaky-test-suite-runner/builds/6529)\r\n\r\n\r\n### For maintainers\r\n\r\n- [x] This was checked for breaking API changes and was [labeled\r\nappropriately](https://www.elastic.co/guide/en/kibana/master/contributing.html#kibana-release-notes-process)","sha":"3df635ef4a8c86c41c91ac5f59198a9b67d1dc8b"}}]}] BACKPORT-->
A test failed on a tracked branch
First failure: CI Build - main
The text was updated successfully, but these errors were encountered: