Set 250 as an upper limit for retries #27797

MuazOthman · 2023-09-12T19:54:43Z

This PR sets an upper limit of 250 (inclusive) for all config values specifying the number of retries in the cloud, including:

retries when set to a number
retries.runMode
retries.openMode
retries.experimentalOptions.maxRetries
experimentalBurnIn.default
experimentalBurnIn.flaky
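
For illustration, a minimal sketch of what an inclusive upper-bound check over these keys could look like. Only the key names and the limit of 250 come from the description above; `MAX_RETRIES`, `checkRetryCount`, `validateRetryLimits`, and the error shape are hypothetical and are not taken from this PR's diff.

```typescript
// Hypothetical sketch; not the PR's actual validation code.
const MAX_RETRIES = 250 // inclusive upper limit from the PR description

type RetryLimitError = { key: string, value: unknown, type: string }

// Returns an error object when a numeric value exceeds the limit, otherwise null.
function checkRetryCount (key: string, value: unknown): RetryLimitError | null {
  if (typeof value === 'number' && value > MAX_RETRIES) {
    return { key, value, type: `a number less than or equal to ${MAX_RETRIES}` }
  }

  return null
}

// Collects limit violations across the config keys listed in the description.
function validateRetryLimits (config: Record<string, any>): RetryLimitError[] {
  const retries = config.retries
  const burnIn = config.experimentalBurnIn

  const candidates: Array<[string, unknown]> = typeof retries === 'number'
    ? [['retries', retries]]
    : [
      ['retries.runMode', retries?.runMode],
      ['retries.openMode', retries?.openMode],
      ['retries.experimentalOptions.maxRetries', retries?.experimentalOptions?.maxRetries],
    ]

  candidates.push(
    ['experimentalBurnIn.default', burnIn?.default],
    ['experimentalBurnIn.flaky', burnIn?.flaky],
  )

  return candidates
  .map(([key, value]) => checkRetryCount(key, value))
  .filter((err): err is RetryLimitError => err !== null)
}
```

Non-numeric values (booleans, null, unset keys) pass through this check untouched; their type validation is assumed to happen elsewhere.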

PR Tasks

Have tests been added/updated?
[na] Has a PR for user-facing changes been opened in cypress-documentation?
[na] Have API changes been updated in the type definitions?

cypress · 2023-09-12T20:53:42Z

24 flaky tests on run #51044 ↗︎

0	28100	1345	0	24

Details:

Fixed unit tests
Project: cypress	Commit: `558447fe6d`
Status: Passed	Duration: 18:05 💡
Started: Sep 15, 2023 1:45 AM	Ended: Sep 15, 2023 2:03 AM

commands/net_stubbing.cy.ts • 1 flaky test • 5x-driver-firefox

View Output Video

Test		Artifacts
network stubbing > waiting and aliasing > can spy on a 304 not modified image response		`Output`

e2e/origin/commands/assertions.cy.ts • 1 flaky test • 5x-driver-firefox

View Output Video

Test		Artifacts
cy.origin assertions > #consoleProps > .should() and .and()		`Output`

cypress/cypress.cy.js • 3 flaky tests • 5x-driver-firefox

View Output Video

Test		Artifacts
... > correctly returns currentRetry		`Output`
... > correctly returns currentRetry		`Output`
... > correctly returns currentRetry		`Output`

runs.cy.ts • 1 flaky test • app-e2e

View Output Video

Test		Artifacts
... > displays each run with correct information		`Test Replay` `Output` `Screenshots`

specs_list_latest_runs.cy.ts • 1 flaky test • app-e2e

View Output Video

Test		Artifacts
App/Cloud Integration - Latest runs and Average duration > when no runs are recorded > shows placeholders for all visible specs		`Test Replay` `Output` `Screenshots`

The first 5 flaky specs are shown, see all 12 specs in Cypress Cloud.

Review all test suite changes for PR #27797 ↗︎

AtofStryker

Any idea how this behaves when retries is set at the describe or test level? I wonder if it's worthwhile to add a test verifying that an error is thrown in a case like the following. I'd think a system-test in retries_spec is sufficient?

describe('foo', { retries: 260 }, () => {
  it('tests', () => {
    expect(true).to.be.false
  })
})

mabela416 · 2023-09-13T16:57:07Z

The error message should mention the limit of retries

MuazOthman · 2023-09-15T00:17:11Z

@AtofStryker I'm not sure we have any example of such test. We don't even test validating retries at the test level.

MuazOthman · 2023-09-15T00:17:56Z

@mabela416 I pushed an update refactoring validation and providing more detailed error messaging

mabela416

The new validation allows me to set runMode and openMode to a boolean without setting anything else when that shouldn't be the case. This is the error we should be getting

MuazOthman · 2023-09-15T16:37:59Z

@mabela416 I believe setting runMode or openMode or both as booleans should be a valid state. If I understand it correctly, these boolean values mean "enable retries in that mode", and we do have defaults for all other values that can be utilized in that case.
Can you verify if this makes sense, @ryanpei?

mabela416 · 2023-09-15T16:39:22Z

> @mabela416 I believe setting runMode or openMode or both as booleans should be a valid state. If I understand it correctly, these boolean values mean "enable retries in that mode", and we do have defaults for all other values that can be utilized in that case. Can you verify if this makes sense, @ryanpei?

If you check develop or even the feature/test-burn-in branch and try just the following, you'll get an error:

{
 retries: {
  runMode: true,
  openMode: true
 }
}

MuazOthman · 2023-09-15T16:42:39Z

@mabela416 correct, using a boolean for these fields is a new option added within this initiative. See: https://github.com/cypress-io/cypress/pull/27412/files#diff-9fde1fb2f7eaba5ea49b585c811461495bfbba88e614e900a96863ecfe6b980aR2842

mabela416 · 2023-09-15T16:45:09Z

> @mabela416 correct, using a boolean for these fields is a new option added within this initiative. See: https://github.com/cypress-io/cypress/pull/27412/files#diff-9fde1fb2f7eaba5ea49b585c811461495bfbba88e614e900a96863ecfe6b980aR2842

I believe that is only the case if experimentalStrategy is set; otherwise it defaults to the current API, which doesn't allow booleans. Tagging @AtofStryker as well since he implemented this logic:
"The config validation for this is a bit wonky in order to support the currently supported config of runMode and openMode. If none of the experimental keys are present, runMode and openMode can be the current API. However, if experimentalStrategy is provided, runMode and openMode MUST be booleans. Additionally, if the experimentalStrategy is valid, and experimentalOptions are provided, both keys MUST be present."
#27412
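
To make the quoted rule concrete, here is a minimal sketch covering only the runMode/openMode part. It assumes that the presence of `experimentalStrategy` alone switches the two keys into boolean mode and that an unset key is acceptable in both modes. The names `RetriesObject`, `isLegacyValue`, `isExperimentalValue`, and `modesAreValid` are hypothetical and are not Cypress's actual validator. The positive-number check and the `experimentalOptions` requirement are left out.

```typescript
// Hypothetical sketch of the runMode/openMode rule quoted above.
type RetriesObject = {
  runMode?: unknown
  openMode?: unknown
  experimentalStrategy?: unknown
  experimentalOptions?: unknown
}

// Current API: each key may be unset, null, or a number.
const isLegacyValue = (v: unknown): boolean => {
  return v === undefined || v === null || typeof v === 'number'
}

// With an experimental strategy: each key, when present, must be a boolean.
const isExperimentalValue = (v: unknown): boolean => {
  return v === undefined || typeof v === 'boolean'
}

function modesAreValid (retries: RetriesObject): boolean {
  const usesExperimental = retries.experimentalStrategy !== undefined
  const check = usesExperimental ? isExperimentalValue : isLegacyValue

  return check(retries.runMode) && check(retries.openMode)
}
```

Under these assumptions, `{ runMode: true, openMode: true }` without a strategy is rejected, while the same object with `experimentalStrategy: 'detect-flake-but-always-fail'` is accepted.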

AtofStryker · 2023-09-19T15:50:48Z

> @mabela416 I believe setting runMode or openMode or both as booleans should be a valid state. If I understand it correctly, these boolean values mean "enable retries in that mode", and we do have defaults for all other values that can be utilized in that case. Can you verify if this makes sense, @ryanpei?
>
> If you check develop or even the feature/test-burn-in branch and try just the following, you'll get an error:
> {
>  retries: {
>   runMode: true,
>   openMode: true
>  }
> }

@mabela416 this is expected since there is no experimentalStrategy defined, which means runMode and openMode each need to be unset, a number, or null.

AtofStryker · 2023-09-19T16:52:07Z

> @AtofStryker I'm not sure we have any example of such test. We don't even test validating retries at the test level.

@MuazOthman We do test this; it just isn't obvious. I am working on a review description to help point you to the correct place, but we might want to change the way we approach this PR. More details soon!

AtofStryker

Sorry it has taken me so long to re-review this PR.

I think there is quite a bit going on in this PR and it is trying to accomplish two things:

set the retries limit to 250, including burn-in and other properties
refactor the messaging/validation logic for experimental retries

What I think we might want to do here, since some might view the 250 limit on retries as a breaking change, is to PR the validation update that caps current GA retries at 250 into develop and create a CHANGELOG entry as a bugfix. This would also be a more refined change, and we can add the config tests and a system test in testConfigOverrides; we would just need a spec in the testConfigOverrides cypress specs that tests the retries threshold. A current example is this system test, which runs this spec and fails since the override is invalid. Either I or someone on the app team can help you with this, since getting this set up isn't exactly obvious.

Then, when that change goes in, merge it into the feature/test-burn-in feature branch and create a PR against feature/test-burn-in that sets the limits mentioned in the PR description and refactors the messaging/validation logic to handle default merging when no value is present. Technically the latter can be done independently if we think that is a better option.

Does that sound like an OK plan moving forward?

AtofStryker · 2023-09-19T15:56:22Z

packages/server/test/unit/config_spec.js

@@ -883,29 +883,29 @@ describe('lib/config', () => {
      })

      context('retries', () => {
-        const retriesError = 'a positive number or null or an object with keys "openMode" and "runMode" with values of numbers, booleans, or nulls, or experimental configuration with key "experimentalStrategy" with value "detect-flake-but-always-fail" or "detect-flake-and-pass-on-threshold" and key "experimentalOptions" to provide a valid configuration for your selected strategy'
+        // const retriesError = 'a positive number or null or an object with keys "openMode" and "runMode" with values of numbers, booleans, or nulls, or experimental configuration with key "experimentalStrategy" with value "detect-flake-but-always-fail" or "detect-flake-and-pass-on-threshold" and key "experimentalOptions" to provide a valid configuration for your selected strategy'


Suggested change

// const retriesError = 'a positive number or null or an object with keys "openMode" and "runMode" with values of numbers, booleans, or nulls, or experimental configuration with key "experimentalStrategy" with value "detect-flake-but-always-fail" or "detect-flake-and-pass-on-threshold" and key "experimentalOptions" to provide a valid configuration for your selected strategy'

AtofStryker · 2023-09-19T16:37:25Z

packages/config/__snapshots__/validation.spec.ts.js

-  'type': 'a positive number or null or an object with keys "openMode" and "runMode" with values of numbers, booleans, or nulls, or experimental configuration with key "experimentalStrategy" with value "detect-flake-but-always-fail" or "detect-flake-and-pass-on-threshold" and key "experimentalOptions" to provide a valid configuration for your selected strategy',
+  'key': 'mockConfigKey.experimentalStrategy',
+  'value': 'foo',
+  'type': 'one of "detect-flake-but-always-fail", "detect-flake-and-pass-on-threshold"',


These errors on config validation are way more readable and thank you for cleaning them up!

The one thing that is a bit critical here though is making sure that defaults get applied into the config so the app can access them through the client/server, which was one of the reasons it was an 'all or nothing' approach originally. For example:

openMode: true, experimentalStrategy: 'detect-flake-but-always-fail'

Is a valid end user config, but we need to add defaults to the object in the app so they can be globally available. This needs to be transformed into:

openMode: true, runMode: false, experimentalStrategy: 'detect-flake-but-always-fail', experimentalOptions: { maxRetries: 2, stopIfAnyPassed: false }

defaultValue can take a function, but it doesn't expose the values present in the config at time of invocation.

The only way I can think of to accomplish this without adding a new mechanism is to add defaults to the object when validation occurs, since the object is passed by reference. That means the validate function would have side effects ☹️
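
A minimal sketch of that by-reference approach, to show the side effect being discussed. The function name `validateAndFillDefaults` and the `ExperimentalRetries` type are hypothetical. The defaults for `runMode`, `maxRetries`, and `stopIfAnyPassed` are taken from the example in this comment, while the `openMode` default of `false` is an assumption.

```typescript
// Hypothetical sketch: filling defaults by mutating the object being validated.
type ExperimentalRetries = {
  runMode?: boolean
  openMode?: boolean
  experimentalStrategy: string
  experimentalOptions?: { maxRetries?: number, stopIfAnyPassed?: boolean }
}

// Default option values from the example in this comment.
const DEFAULT_OPTIONS = { maxRetries: 2, stopIfAnyPassed: false }

function validateAndFillDefaults (retries: ExperimentalRetries): true {
  // (the actual validation checks would run here first)

  // Side effect: the caller's object is mutated because it is passed by reference.
  retries.runMode = retries.runMode ?? false
  retries.openMode = retries.openMode ?? false // assumed default
  retries.experimentalOptions = { ...DEFAULT_OPTIONS, ...retries.experimentalOptions }

  return true
}

const userConfig: ExperimentalRetries = {
  openMode: true,
  experimentalStrategy: 'detect-flake-but-always-fail',
}

validateAndFillDefaults(userConfig)
// userConfig now also carries runMode: false and
// experimentalOptions: { maxRetries: 2, stopIfAnyPassed: false }
```

The trade-off is the one noted above: callers get a fully populated object, but a function named as a validator silently changes its input.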

AtofStryker · 2023-09-19T16:40:44Z

packages/config/src/validation.ts


      if (!isValidStopIfAnyPasses) {
-        return false
+        return errMsg(`${key}.stopIfAnyPassed`, value.stopIfAnyPassed, 'null or boolean')


I believe stopIfAnyPassed can be either a boolean or undefined, but null is not an option based on 1137. If this has changed, we need to update the typings definition in the app in cypress.d.ts (which we might need to do anyway, since the current implementation forces a user to specify it if they provide experimentalOptions on detect-flake-but-always-fail).
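
As a rough sketch of the "boolean or undefined, but not null" reading, assuming the option becomes optional in the typings. `ExperimentalOptionsSketch` and `isValidStopIfAnyPassed` are hypothetical names; the real definition lives in cypress.d.ts and may differ.

```typescript
// Hypothetical shape; not the actual cypress.d.ts definition.
interface ExperimentalOptionsSketch {
  maxRetries: number
  stopIfAnyPassed?: boolean // boolean or omitted; null is not part of the type
}

// Runtime check mirroring the type: accepts boolean or undefined, rejects null.
function isValidStopIfAnyPassed (value: unknown): value is boolean | undefined {
  return value === undefined || typeof value === 'boolean'
}

const exampleOptions: ExperimentalOptionsSketch = { maxRetries: 2 }
```

With this reading, the error message type would be "boolean" rather than "null or boolean".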

mabela416 · 2023-09-19T18:37:22Z

> @mabela416 I believe setting runMode or openMode or both as booleans should be a valid state. If I understand it correctly, these boolean values mean "enable retries in that mode", and we do have defaults for all other values that can be utilized in that case. Can you verify if this makes sense, @ryanpei?
>
> If you check develop or even the feature/test-burn-in branch and try just the following, you'll get an error:
> {
>  retries: {
>   runMode: true,
>   openMode: true
>  }
> }
> @mabela416 this is expected since there is no experimentalStrategy defined, which means runMode and openMode each need to be unset, a number, or null.

So you're expected to get an error, right? Something in this PR is allowing the user to set runMode and openMode as booleans without experimentalStrategy. That shouldn't be the case; it should be an error. @AtofStryker

AtofStryker · 2023-09-19T19:41:29Z

> @mabela416 I believe setting runMode or openMode or both as booleans should be a valid state. If I understand it correctly, these boolean values mean "enable retries in that mode", and we do have defaults for all other values that can be utilized in that case. Can you verify if this makes sense, @ryanpei?
>
> If you check develop or even the feature/test-burn-in branch and try just the following, you'll get an error:
> {
>  retries: {
>   runMode: true,
>   openMode: true
>  }
> }
> @mabela416 this is expected since there is no experimentalStrategy defined, which means runMode and openMode each need to be unset, a number, or null.
> So you're expected to get an error, right? Something in this PR is allowing the user to set runMode and openMode as booleans without experimentalStrategy. That shouldn't be the case; it should be an error. @AtofStryker

I think I read this as the inverse. You are correct that this configuration should error. It might be due to some of the reasons I outlined in #27797 (comment). We might want to treat the configuration changes as a separate PR.

MuazOthman · 2023-11-01T17:59:03Z

As previously discussed, we don't need this new limit of 250 anymore. I extracted the enhanced validation to a new PR #28214.
Closing this one.

MuazOthman added 2 commits September 12, 2023 12:31

Set 250 as an upper limit for retries

c17e8e6

Add and update tests for retries upper limit

832b2fd

MuazOthman requested review from AtofStryker, ryanpei, mabela416 and jennifer-shehane September 12, 2023 20:54

AtofStryker reviewed Sep 12, 2023

View reviewed changes

refactored to provide more detailed error messages

ff515d1

Fixed unit tests

558447f

mabela416 reviewed Sep 15, 2023

View reviewed changes

AtofStryker self-requested a review September 15, 2023 20:21

AtofStryker requested changes Sep 19, 2023

View reviewed changes

jennifer-shehane requested review from cacieprins and removed request for jennifer-shehane and ryanpei September 22, 2023 14:43

jennifer-shehane assigned cacieprins Sep 28, 2023

MuazOthman closed this Nov 1, 2023

MuazOthman deleted the muaz/validate-retries-upper-limit branch January 4, 2024 16:42
