ML Rule Suppression UI Improvements #9

rylnd · 2024-06-06T22:36:05Z

Summary

Re-enables and adds additional ML cypress tests
Adds ML fields to Define Step
Disables suppression UI when no relevant ML jobs are enabled
Adds warning text when some relevant ML jobs are not enabled

* Disables suppression fields if no relevant ML jobs are running (as we cannot retrieve field info)
* Adds a warning message if not all relevant ML jobs are running (as we may be missing some field info)

Next step is testing this; we don't currently have a way to run ML rules in cypress, but I'm going to attempt to copy the logic in our FTR to accomplish this.
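The two behaviors above can be sketched as a small pure function. This is only an illustration, not the code in this PR: `MlJobStatus`, `SuppressionUiState`, and `getSuppressionUiState` are hypothetical names, and the sketch assumes the caller has already filtered the list down to the jobs relevant to the rule.

```typescript
// Hypothetical shape: only the fields this sketch needs.
interface MlJobStatus {
  id: string;
  isRunning: boolean;
}

// Hypothetical result type describing what the suppression UI should do.
interface SuppressionUiState {
  fieldsDisabled: boolean;
  showPartialWarning: boolean;
}

function getSuppressionUiState(relevantJobs: MlJobStatus[]): SuppressionUiState {
  const runningCount = relevantJobs.filter((job) => job.isRunning).length;
  const noneRunning = runningCount === 0;

  return {
    // No running jobs: field info cannot be retrieved, so disable the fields.
    fieldsDisabled: noneRunning,
    // Some, but not all, jobs running: field info may be incomplete, so warn.
    showPartialWarning: !noneRunning && runningCount < relevantJobs.length,
  };
}
```

Keeping the decision in one function like this would make the three cases (none, some, all running) easy to unit test without starting any ML jobs.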

This was previously only available on the About step, via the useRuleIndices hook in combination with the useFetchIndex hook. Add new composite hook that encapsulates the same logic, and provides it to the define step. Unlike on the About step, we are currently only using this for ML fields as other situations derive their field list from a passed prop (which might be a performance optimization, or a bug, or both).

To do this we need to get the ML API to recognize our jobs as installed and running. It is currently _not_ recognizing them (although there are anomalies in the index). Still troubleshooting to see what's missing here. This logic was cribbed from the analogous FTR tests, but those also aren't working so *shrug*.

Specifying the `groups` parameter when using the "setup module" API causes the corresponding jobs to be installed _with only the specified group_. This meant that in our FTR tests, we have been installing jobs with the `auditbeat` group. However, part of the contract between ML and Detection Engine is that we use the `group` parameter to determine relevance: if it doesn't belong to either the `security` or (legacy) `siem` group(s), it effectively does not exist to the Detection Engine. This fixes the (very confusing) issue of jobs being installed but not recognized, by specifying a recognized group id (and using our shared constant for it), both in the FTR and cypress utilities.
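A minimal sketch of the relevance rule described above, under stated assumptions: the constant values are inferred from the group names in the text (`security` and `siem`), the constants are redefined locally rather than imported, and `JobLike`, `isRelevantJob`, and `buildSetupModuleBody` are hypothetical names rather than the shared helpers this PR references.

```typescript
// Assumed values, inferred from the group names mentioned in the text.
const ML_GROUP_ID = 'security';
const LEGACY_ML_GROUP_ID = 'siem';
const ML_GROUP_IDS: readonly string[] = [ML_GROUP_ID, LEGACY_ML_GROUP_ID];

// Hypothetical shape: only the fields this sketch needs.
interface JobLike {
  id: string;
  groups: string[];
}

// A job is relevant only if it belongs to one of the recognized groups.
function isRelevantJob(job: JobLike): boolean {
  return job.groups.some((group) => ML_GROUP_IDS.includes(group));
}

// Builds a request body like the one in the diff, defaulting to a recognized group.
function buildSetupModuleBody(groups: string[] = [ML_GROUP_ID]): { prefix: string; groups: string[] } {
  return { prefix: '', groups };
}
```

Under this rule, a job installed with only the `auditbeat` group fails the check, which matches the "installed but not recognized" symptom.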

I have seen the _ecs prefix in a few places, but I'm not quite sure if it's actually part of official ML naming or not. Regardless, using the incorrect name caused the "start datafeed" request to fail with a "no datafeed for job ID" error.

The existing one also references the shared constants for our group IDs, so 👍.

* Cleans up debugging logs
* Adds helper for ensuring that jobs are not started at beginning of suite
* Fixes form filling utility to support single values for machine_learning_job_id
* Updates suppression fields now that we're actually using real fields from anomaly indices

yctercero · 2024-06-13T01:17:48Z

...integration/test_suites/detections_response/utils/machine_learning/machine_learning_setup.ts

@@ -22,7 +23,7 @@ export const executeSetupModuleRequest = async ({
    .set(getCommonRequestHeader('1'))
    .send({
      prefix: '',
-      groups: ['auditbeat'],
+      groups: [ML_GROUP_ID],


Amazing job sticking with it to figure this out. Woohoo to finally having our ML tests able to run.

...lution/public/detection_engine/rule_creation_ui/components/step_define_rule/translations.tsx

Co-authored-by: Nastasha Solomon <79124755+nastasha-solomon@users.noreply.github.com>

The way the cypress helper that consumes these is written, both of these forms work, but we're never going to encounter a rule with display names as its params, and knowing the display name but not the ID is not useful for investigative purposes. I can see how this might have been done to avoid having to change these jobs as their IDs change, but I think the display names are more likely to change than the IDs.

There's not a lot here, but I feel bad for adding anything to step_define_rule so this is an attempt to minimize that. In the course of refactoring I also caught a bug (perhaps just a test environment one) where the form fields are temporarily `undefined` when the hooks are run. I updated the form type to reflect this; hopefully that doesn't have broader impact (but if it does, those are probably also uncaught bugs).

The combination of shared state and retry logic means that asserting exactly 1 rule exists will never work if rule creation succeeds in a previous step. If we instead assert that there is _at least_ the expected number of rules, we have a chance of the retry working.
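To illustrate why the exact-count assertion cannot recover, here is a small simulation. It is a sketch under assumptions, not the test code: `passesWithinRetries` and both assertion helpers are hypothetical, and the model assumes each attempt successfully creates one more rule in state that persists between attempts.

```typescript
type CountAssertion = (actual: number, expected: number) => boolean;

const exactly: CountAssertion = (actual, expected) => actual === expected;
const atLeast: CountAssertion = (actual, expected) => actual >= expected;

// Simulates a retried test step against shared state: every attempt creates
// one more rule, then checks the rule count against an expectation of 1.
function passesWithinRetries(
  assertion: CountAssertion,
  ruleCountBefore: number,
  maxAttempts: number
): boolean {
  let ruleCount = ruleCountBefore;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    ruleCount += 1; // rule creation succeeds on this attempt
    if (assertion(ruleCount, 1)) {
      return true;
    }
  }
  return false;
}

// One rule was already left behind by an earlier, partially successful attempt.
const exactResult = passesWithinRetries(exactly, 1, 3); // false: count is 2, 3, 4
const atLeastResult = passesWithinRetries(atLeast, 1, 3); // true on the first retry
```

The trade-off is that an "at least" assertion is weaker: it tolerates leftover rules, so it does not replace cleaning up state between tests.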

* Stop datafeeds before creating rule
* Simplify jobId logic

Turns out the reason the "Job IDs" were persisted as human-readable text was so that they could be reused for assertions. I still think these should be separate, so I'm adding them back for this specific assertion.

* Adds necessary setup/teardown for ML integration

1. Use "proper" combobox text, and capture it within a helper method. I swear I saw this working when I was doing the same stuff for the ML Job picker, but I must have only been dealing with one item, or the items I was selecting were somehow different. Downarrow is _required_ on the first option (a simple "enter" will select nothing), but using downarrow on subsequent options will cause the _second_ suggested item to be selected. E.g. if I type "by_field_value", it suggests both "by_field_value" and "client.by_field_value", and {downarrow}{enter} would cause the latter to be selected.
2. I also ensure that our new ML validations have run (which causes suppression fields to be disabled) before attempting to interact with the suppression fields, as this was causing some flakiness now that these checks are done async.
3. I also fixed the broken `clearAlertSuppressionFields` task, which had never worked but also had never been exercised since the relevant test was skipped.
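The keystroke rule from item 1 can be sketched as a pure function that builds one typed string per option. This is an illustration only: `comboBoxKeystrokes` is a hypothetical name, not the helper added in this PR, and the key sequences are written as they appear in the text above.

```typescript
// Builds the text to type for each combobox option:
// the first option needs {downarrow} before {enter}; later ones must not use it,
// because {downarrow} would move past the intended suggestion.
function comboBoxKeystrokes(options: string[]): string[] {
  return options.map((option, index) =>
    index === 0 ? `${option}{downarrow}{enter}` : `${option}{enter}`
  );
}

const keystrokes = comboBoxKeystrokes(['by_field_value', 'client.by_field_value']);
// → ['by_field_value{downarrow}{enter}', 'client.by_field_value{enter}']
```

Separating the string-building from the code that types into the field keeps the first-versus-subsequent rule in one place and makes it testable without a browser.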

There were no less than four assertions in this test that relied on there being no other rules present in the environment, but nothing was being done to ensure that was the case. I can't imagine why these were skipped!

I want to run these in the flaky runner to get a sense of how/where they're still failing, for now.

We were over-eagerly disabling these fields when the ML checks were not relevant.

rylnd changed the title ~~ML Rule Suppression Improvements~~ ML Rule Suppression UI Improvements Jun 10, 2024

vitaliidm mentioned this pull request Jun 11, 2024

[Detection Engine] Adds Alert Suppression to ML Rules elastic/kibana#181926

Merged

7 tasks

rylnd force-pushed the ml_rule_suppression_warnings branch from e59a189 to 5d0f0b3 Compare June 11, 2024 19:14

rylnd added 5 commits June 11, 2024 14:19

Add copy for new ML rule warnings

56de5f6

style: sort StepDefineRule arguments

dfeb857

rylnd force-pushed the ml_rule_suppression_warnings branch from 5d0f0b3 to 4955831 Compare June 11, 2024 21:58

rylnd added 4 commits June 12, 2024 17:04

Use correct job ID in cypress tests

cf5ceae

I have seen the _ecs prefix in a few places, but I'm not quite sure if it's actually part of official ML naming or not. Regardless, using the incorrect name caused the "start datafeed" request to fail with a "no datafeed for job ID" error.

Replace duplicated, magic-stringed function for existing one

87c079d

The existing one also references the shared constants for our group IDs, so 👍.

yctercero reviewed Jun 13, 2024

nastasha-solomon reviewed Jun 17, 2024

rylnd and others added 12 commits June 17, 2024 17:00

More direct copy for user action

65e16e8

Co-authored-by: Nastasha Solomon <79124755+nastasha-solomon@users.noreply.github.com>

Update ML warning copy per docs' suggestions

e6d85e2

Co-authored-by: Nastasha Solomon <79124755+nastasha-solomon@users.noreply.github.com>

Clean up existing ML cypress test

61d24d4

* Stop datafeeds before creating rule
* Simplify jobId logic

Bring back job display names for assertions

99deac3

Turns out the reason the "Job IDs" were persisted as human-readable text was so that they could be reused for assertions. I still think these should be separate, so I'm adding them back for this specific assertion.

Re-enable ML rule edit tests

b0970bf

* Adds necessary setup/teardown for ML integration

Ensure test has clean setup

742503b

There were no less than four assertions in this test that relied on there being no other rules present in the environment, but nothing was being done to ensure that was the case. I can't imagine why these were skipped!

Remove exclusivity from FTR tests, add TODO for re-skipping

f5cbaa5

I want to run these in the flaky runner to get a sense of how/where they're still failing, for now.

Fix suppression fields for non-ML cases

8e9d4c5

We were over-eagerly disabling these fields when the ML checks were not relevant.

rylnd marked this pull request as ready for review June 18, 2024 04:29

rylnd merged commit e6aae21 into ml_rule_alert_suppression Jun 18, 2024
1 check passed

rylnd deleted the ml_rule_suppression_warnings branch June 18, 2024 04:30

rylnd mentioned this pull request Jun 18, 2024

[Security Solution][Detection Engine] ML Rule forms have incorrect autocomplete fields elastic/kibana#183100

Open
