Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[DRAFT] Attempting Cypress test fixes #186562

Closed
wants to merge 89 commits into from

Conversation

rylnd
Copy link
Contributor

@rylnd rylnd commented Jun 20, 2024

🚧 This is based on #181926, and is me attempting to fix cypress tests without mucking up the activity log there.

rylnd added 30 commits April 25, 2024 21:52
This is mostly based on the current test plan. It's not wired up yet,
nor are there any actual implementations.
These now have type errors, since ML rules don't yet accept suppression
fields. We have our next task!
`node scripts/openapi/generate`
We're now asserting that suppression fields are present on the generated
alerts, which they're not, because we haven't implemented them yet.
That's the next step!
* Adds call getIsSuppressionActive in our rule executor, and necessary
  dependencies
* Adds suppression fields to ML rule schema
* Adds feature flag for ML suppression
I noticed that it doesn't look like we're including a lot of timing info
in the ML executor; adding this to validate that, and document what we
_are_ recording.
This will light up the paths that we need to implement. Next!
This adds all the parameters necessary to invoke this method (if
relevant) in the ML rule executor. Given the relative simplicity of the
ML rule type, I'm guessing that many of these values are
irrelevant/unused in this case, but I haven't yet investigated that.

Next step is to exercise this implementation against the FTR tests, and
see if the behavior is what we expect. Once that's done, we can try to
pare down what we need/use.

I also added some TODOs in the course of this work to check some
potential bugs I noticed.
Tests were failing as rules were being created without suppression
params. Fixed!
We've got suppression fields making it into ML alerts for the first
time!

Now, to test the various suppression conditions.
I realized that most of these tests were using es_archiver to insert
anomalies into an index, but our tests were only ever using a single one
of those anomalies. In order to ensure these tests are independent of
the data in that archive, I've created and leveraged a helper to delete
all the persisted anomalies, and then use existing tooling to manually
insert the anomalies needed for our tests.

All of the current tests are green; there are just a few more
permutations that still need to be implemented.
This tests all of the interesting permutations of alert suppression for
ML rules, both with per-execution and interval suppression durations. I
added a few TODOs noting unexpected (to me) behavior; we'll see what
others think.
The behavior demonstrated in this test is in fact expected, as the
suppression duration window applies to the alert creation time, not the
original anomaly time.
Most other rule types have both a "fill" task and a "fillAndContinue"
task; this adds that pattern for ML rules on the Define step.
These are failing because I haven't yet enabled the suppression UI for
ML rules. Once that's done, we can start validating these tests.
I don't know if we need to touch this file, but I'm making a note to
come back to it.
This mainly involved modifying `useAlertSuppression`, as well as some
logic in the rule form's data parsing. At this point, I believe the
frontend should be working.
This just adds some mock values for our new params related to
suppression. All of the test coverage is in our integration tests.
We missed an import of a referenced type.
I think the failure here was because the value `agent.name` has multiple
matching fields, which caused this task to fail. By using the down arrow
to select the first matching field for a given value, this task is now
much more robust.
These tests actually uncovered a deficiency in the ML rule creation
flows, where autocomplete is not correct. This means that it's
currently impossible to add/edit alert suppression for an existing ML
rule (via the UI).

Details in elastic#183100.
rylnd and others added 15 commits June 17, 2024 23:20
There were no less than four assertions in this test that relied on
there being no other rules present in the environment, but nothing was
being done to ensure that was the case. I can't imagine why these were
skipped!
I want to run these in the flaky runner to get a sense of how/where
they're still failing, for now.
We were over-eagerly disabling these fields when the ML checks were not
relevant.
 Conflicts:
	x-pack/test/security_solution_cypress/config.ts
The undefined value was coming from our abstracted hook, and the hook
now correctly handles the undefined value. This union is no longer
needed.
The former just fails with "element not found" while the current version
will actually show the text that it did find.
These changes are taken from my investigation in elastic#182183. Note that this
only changes this configuration for one of the ML jobs, but there are
two contained in this archive.

Since the tests are currently only concerned with the first job, so am
I. If in the course of investigation I find that we're not using the
other job's anomalies, I'll be deleting them since it's just test
overhead (and potentially a source of more bugs).
There was an additional parameter added to the rule parameters. Rather
than keep this test in sync with all of the possible parameters, we just
assert on the ones we care about.
I believe this is the cause of some sporadic failures when running these
tests together, as they all now have the implicit "start job when rule
is created" functionality.

By stopping the datafeeds before each suite, none of the relevant jobs
should be in the full "started" state required by our validation, so we
effectively have a "clean" state.

In figuring the above out I was also made aware of the "force stop
datafeed and close job" endpoint, which would probably be a more robust
solution, but for now this works.
The rule params are typed as a union of all possible allowed values of
`machine_learning_job_id`, while our helpers just expect `string[]`.
Since the value in question is verifiably a `string[]`, I'm telling TS
as much.
I saw this explicitly not work locally, but maybe CI is playing by
different rules.
@rylnd
Copy link
Contributor Author

rylnd commented Jun 20, 2024

/ci

The issue that the previous method (treating the first item different
from the rest) seems to be a consequence of there being "state" in the
dropdown list after the first item is selected. By hitting ESC after
each item is typed/selected, we get rid of this state, and so we can
treat each item separately again.
@rylnd
Copy link
Contributor Author

rylnd commented Jun 20, 2024

/ci

2 similar comments
@rylnd
Copy link
Contributor Author

rylnd commented Jun 21, 2024

/ci

@rylnd
Copy link
Contributor Author

rylnd commented Jun 21, 2024

/ci

I'm not sure this will have the intended effect (of making the job state
more deterministic), but it's worth a shot.
@rylnd rylnd force-pushed the test_ml_cypress_fixes branch from 3414ff1 to 4adac8b Compare June 21, 2024 20:48
This endpoint will return a 404 if the job(s) being requested are not
found. This should not fail the test.
@rylnd
Copy link
Contributor Author

rylnd commented Jun 21, 2024

/ci

rylnd added 2 commits June 21, 2024 16:50
It was previously possible to get into a state where the fields were not
loading, but were also an empty array. Since the test simply waits for
the field to be enabled, this meant there was a chance it would attempt
to fill unavailable options in the combobox. This removes that
possibility by ensuring that fields are present before enabling the
field.
We have the analogous setting in the ess config, and the test-specific
flags that get respected on CI, but I had not run the serverless cypress
locally before, and neglected to add this config for that case.
@kibana-ci
Copy link
Collaborator

kibana-ci commented Jun 21, 2024

💔 Build Failed

Failed CI Steps

Test Failures

  • [job] [logs] Serverless Detection Engine - Security Solution Cypress Tests #2 / Machine Learning Detection Rules - Creation with Alert Suppression when ML jobs have run when all jobs are running allows a rule with interval suppression to be created and displayed allows a rule with interval suppression to be created and displayed
  • [job] [logs] Serverless Detection Engine - Security Solution Cypress Tests #2 / Machine Learning Detection Rules - Creation with Alert Suppression when ML jobs have run when all jobs are running allows a rule with per-execution suppression to be created and displayed allows a rule with per-execution suppression to be created and displayed
  • [job] [logs] Serverless Detection Engine - Security Solution Cypress Tests #4 / Machine Learning Detection Rules - Editing with Alert Suppression allows editing of a rule to remove suppression configuration allows editing of a rule to remove suppression configuration
  • [job] [logs] Serverless Detection Engine - Security Solution Cypress Tests #4 / Machine Learning Detection Rules - Editing without Alert Suppression allows editing of a rule to add suppression configuration allows editing of a rule to add suppression configuration

Metrics [docs]

Module Count

Fewer modules leads to a faster build time

id before after diff
securitySolution 5507 5509 +2

Async chunks

Total size of all lazy-loaded chunks that will be downloaded as the user navigates the app

id before after diff
securitySolution 13.6MB 13.6MB +3.4KB

Page load bundle

Size of the bundles that are downloaded on every page load. Target size is below 100kb

id before after diff
securitySolution 84.2KB 84.3KB +68.0B
Unknown metric groups

References to deprecated APIs

id before after diff
securitySolution 576 577 +1

History

To update your PR or re-run it, just comment with:
@elasticmachine merge upstream

@rylnd rylnd closed this Jul 9, 2024
@rylnd rylnd deleted the test_ml_cypress_fixes branch July 9, 2024 22:14
@rylnd
Copy link
Contributor Author

rylnd commented Jul 9, 2024

Closed in favor of the related FTR investigation here

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants