-
Notifications
You must be signed in to change notification settings - Fork 123
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Retry for OpenSearch Container #304
Conversation
Maybe we can either fix the underlying issue or use something like https://www.npmjs.com/package/retry instead of rolling our own? |
@dblock I'm looking into finding/fixing the source myself. I'm just checking if we can use retry to ease the pain of manually retrying failed jobs. Unfortunately, the retry I implemented in the first commit will still fail all 3 attempts if the first attempt fails. Now I'm using this PR to test out different strategies. My current suspicion is that TAP is somehow looking for TS or JSX files as if |
Found the cause and a workaround. Please check the updated description for details. |
I hate it but I think it works ;) |
Berfore we merge this, is there a way to retry the tests at GHA level? I'm worried about hacks on top of hacks. |
Could look into https://github.com/marketplace/actions/retry-action and https://github.com/marketplace/actions/retry-step. Validate container start up docker and execute retry logic. |
I thought of retry-step for the Container Startup step but decided not to use it:
I agree that grepping for a piece of text is not pretty 😞. There are potentially more elegant workarounds if we want to invest more time. |
Maybe we can use this: https://hub.docker.com/r/willfarrell/autoheal/ |
@dblock @harshavamsi https://hub.docker.com/r/willfarrell/autoheal/ is not appropriate for this situation because it's mean for crashed services, not crashed containers. |
Testing a few more times to make sure that the problem is solved. |
I spoke too soon. This bug doesn't always crash the container when the opensearch processed is killed. It only does that sometimes. |
Signed-off-by: Theo Truong <theotr@amazon.com>
@nhtruong This is pretty clean retry-wise. Do you want to merge it or it doesn't work? |
@dblock I've tested this on over 700 runs and non failed. So, it's good to go! |
Signed-off-by: Theo Truong <theotr@amazon.com> Signed-off-by: Theo Truong <theotr@amazon.com>
* Manually sync CHANGELOG.md, release-drafter.yml, RELEASING.md, and USER_GUIDE.md * Version Bump: 2.3.0 (opensearch-project#546) * Upgrade and secure the backport workflow (opensearch-project#547) * Add serialization and deserialization of numerals larger than `Number.MAX_SAFE_INTEGER` (opensearch-project#544) * Add upgrading NPM to all workflows running older Node.js versions (opensearch-project#545) * Bump @types/node from 20.1.4 to 20.2.5 (opensearch-project#528) * Bump @aws-sdk/types from 3.329.0 to 3.341.0 (opensearch-project#527) * Bump ora from 6.3.0 to 6.3.1 (opensearch-project#524) * Bump node-fetch from 3.2.10 to 3.3.1 (opensearch-project#526) * Bump eslint from 8.39.0 to 8.41.0 (opensearch-project#525) * Bump @types/node from 18.16.0 to 20.1.4 (opensearch-project#518) * Bump xmlbuilder2 from 3.1.0 to 3.1.1 (opensearch-project#519) * Bump semver from 7.3.8 to 7.5.1 (opensearch-project#520) * Added timursaurus to MAINTAINERS.md (opensearch-project#517) * Bump @aws-sdk/types from 3.257.0 to 3.329.0 (opensearch-project#516) * Bump @babel/eslint-parser from 7.21.3 to 7.21.8 (opensearch-project#513) * Bump simple-git from 3.17.0 to 3.18.0 (opensearch-project#512) * Bump xmlbuilder2 from 3.0.2 to 3.1.0 (opensearch-project#503) * Bump eslint from 8.38.0 to 8.39.0 (opensearch-project#504) * Bump @types/node from 18.15.11 to 18.16.0 (opensearch-project#499) * Bump ora from 6.1.2 to 6.3.0 (opensearch-project#500) * Bump prettier from 2.8.7 to 2.8.8 (opensearch-project#501) * Implemented Docker Image caching for `integration-unreleased` workflow (opensearch-project#498) * [Bug] Fixed bundler.yml (opensearch-project#497) * Bumped Version 2.2.1 In preparation for release (opensearch-project#495) * Bump @types/node from 18.15.10 to 18.15.11 (opensearch-project#493) * Bump eslint from 8.33.0 to 8.38.0 (opensearch-project#494) * Bump rimraf from 4.4.0 to 5.0.0 (opensearch-project#492) * [CCI] feat: add missing createConnection type (opensearch-project#490) * Add Search guide (opensearch-project#489) * Create search guide (opensearch-project#488) * [CCI] Create advanced index actions guide (opensearch-project#483) * [CCI] Connection `request` method callback (opensearch-project#478) * feat: NotCompatibleError (opensearch-project#484) * Bump @babel/eslint-parser from 7.19.1 to 7.21.3 (opensearch-project#486) * Bump jsdoc from 4.0.0 to 4.0.2 (opensearch-project#485) * Bump prettier from 2.8.4 to 2.8.7 (opensearch-project#487) * Create document lifecycle guide (opensearch-project#481) * Create index lifecycle guide (opensearch-project#482) * Create bulk guide (opensearch-project#480) * [CCI] Create index_template guide (opensearch-project#479) * Bump split2 from 4.1.0 to 4.2.0 (opensearch-project#467) * Bump deepmerge from 4.3.0 to 4.3.1 (opensearch-project#468) * Bump @types/node from 18.15.3 to 18.15.10 (opensearch-project#462) * Bump eslint-config-prettier from 8.6.0 to 8.8.0 (opensearch-project#463) * [CCI] Update developerGuide regarding yarn troubleshoot steps (opensearch-project#456) * [CCI] Bump caniuse-lite from 1.0.30001249 to 1.0.30001469 (opensearch-project#459) * added the solution for the possible error during yarn installation on Windows OS (opensearch-project#453) * Fixed deprecation warnings (opensearch-project#446) * [CCI] Don't run tests on PRs with doc changes only (opensearch-project#441) * [CCI] Add pull request template (opensearch-project#440) * Bump @types/node from 17.0.45 to 18.15.3 (opensearch-project#448) * Bump simple-git from 3.16.0 to 3.17.0 (opensearch-project#447) * [CCI] Update Developer Guide (opensearch-project#436) * [CCI] Remove waitCluster in Integration Tests (opensearch-project#423) * Bump tsd from 0.25.0 to 0.27.0 (opensearch-project#431) * Bump rimraf from 4.1.1 to 4.4.0 (opensearch-project#432) * [CCI] doc: fix grammar (opensearch-project#427) * Fix deprecated folder mapping "./" in the "exports" field (opensearch-project#416) * [CCI] fix: pass required `data` argument to SerializationError class (opensearch-project#419) * Bump tap from 16.3.0 to 16.3.4 (opensearch-project#413) * [CCI] Refactor: Remove unnecessary `data` argument when invoking `OpenSearchClientError` (opensearch-project#421) * Bump deepmerge from 4.2.2 to 4.3.0 (opensearch-project#414) * Updated Maintainers list and changelogs (opensearch-project#409) * Downgraded @types/node to 17 from 18 due to breaking changes. (opensearch-project#405) * Bump aws4 from 1.11.0 to 1.12.0 (opensearch-project#406) * Bump prettier from 2.8.3 to 2.8.4 (opensearch-project#407) * Bump minimist from 1.2.7 to 1.2.8 (opensearch-project#388) * Bump tsd from 0.24.1 to 0.25.0 (opensearch-project#389) * Add missing types for AwsSigv4SignerOptions.service (opensearch-project#377) * Created untriaged issue workflow. (opensearch-project#386) Daniel (dB.) Doubrovkine* 2/14/23, 12:14 PM * Bump dezalgo from 1.0.3 to 1.0.4 (opensearch-project#383) * Bump simple-statistics from 7.7.0 to 7.8.3 (opensearch-project#384) * Bump eslint from 8.32.0 to 8.33.0 (opensearch-project#379) * Allow fields in BulkOperation to be optional (opensearch-project#378) * Bump @types/node from 18.11.18 to 18.11.19 (opensearch-project#380) * Bump ora from 5.4.1 to 6.1.2 (opensearch-project#376) * Bump @aws-sdk/types from 3.226.0 to 3.257.0 (opensearch-project#375) * Bump secure-json-parse from 2.4.0 to 2.7.0 (opensearch-project#369) * Bump simple-git from 3.15.1 to 3.16.0 (opensearch-project#373) * Updated user guide to include Amazon OpenSearch Serverless (opensearch-project#372) * Added Support for AOSS (opensearch-project#366) * Bump rimraf from 3.0.2 to 4.1.1 (opensearch-project#370) * Bump eslint-config-prettier from 8.5.0 to 8.6.0 (opensearch-project#368) * Bump hpagent from 0.1.2 to 1.2.0 (opensearch-project#361) * Bump eslint from 8.30.0 to 8.32.0 (opensearch-project#362) * Fixed missing namespace for API-PIT endpoints (opensearch-project#364) * Bump prettier from 2.7.1 to 2.8.3 (opensearch-project#363) * Ensure Dependabot PR workflow retriggers on label change (opensearch-project#360) * Added point in time APIs (opensearch-project#348) * Bump @types/node from 15.14.7 to 18.11.18 (opensearch-project#353) * Bump split2 from 3.2.2 to 4.1.0 (opensearch-project#354) * Bump json5 from 2.2.0 to 2.2.3 (opensearch-project#359) * Bump @aws-sdk/types from 3.190.0 to 3.226.0 (opensearch-project#355) * Updated MAINTAINERS.md to match recommended opensearch-project format. (opensearch-project#358) Daniel (dB.) Doubrovkine* 1/6/23, 8:36 AM * Bump minimist from 1.2.6 to 1.2.7 (opensearch-project#350) * Bump eslint from 7.32.0 to 8.30.0 (opensearch-project#346) * Bump eslint-plugin-prettier from 4.0.0 to 4.2.1 (opensearch-project#351) * Removed test artifacts from gh_pages workflow (opensearch-project#347) * Bump minimatch from 3.0.4 to 3.1.2 (opensearch-project#345) * Bump xmlbuilder2 from 2.4.1 to 3.0.2 (opensearch-project#321) * Implement JSDOC (opensearch-project#337) * Added skip-changelog label (opensearch-project#339) * Bumps simple-git from 3.4.0 to 3.15.0 (opensearch-project#341) * [FEATURE] Allow overriding the aws service identifier in AwsSigv4Signer (opensearch-project#333) * Add release details and bump the jenkins lib version (opensearch-project#319) * Bump Version from 2.1.0 to 2.1.1 (opensearch-project#318) * Bump semver from 7.3.7 to 7.3.8 (opensearch-project#313) * Add release workflows (opensearch-project#317) * issue opensearch-project#291 mutability of headers (opensearch-project#311) * Bump tsd from 0.22.0 to 0.24.1 (opensearch-project#312) * Add changelog and changelog verifier (opensearch-project#306) * Bump prettier from 2.6.2 to 2.7.1 (opensearch-project#310) * Bump @aws-sdk/types from 3.160.0 to 3.190.0 (opensearch-project#309) * Bump node-fetch from 2.6.7 to 3.2.10 (opensearch-project#281) * Update Maintainers List (opensearch-project#308) * Implemented Retry for OpenSearch Container (opensearch-project#304) * Feature/add default credentials provider (opensearch-project#295) * Fix new line problem between diffs (opensearch-project#302) * Removed Unused Variables (opensearch-project#301) * fix awssigv4signer.test.js tests not running (opensearch-project#294) * fix: support TS resolution for .mjs (opensearch-project#296) * Add AwsSigV4 signing functionality (opensearch-project#279) * Fix opensearch-project#253 Cannot read property 'then' of null in Transport.js issue. (opensearch-project#283) * Bump simple-git from 3.5.0 to 3.13.0 (opensearch-project#286) * Bump semver from 7.3.5 to 7.3.7 (opensearch-project#280) * Adds bulk example to README (opensearch-project#277) * Adding Dependabot configuration (opensearch-project#278) * Removing OpenDistro integration tests (opensearch-project#271) * Upgrade dependencies (opensearch-project#272) * fix: add missing memory circuit breaker options (opensearch-project#266) * feat: allow `doc` overwrite in `onDocument` (opensearch-project#263) * Adding link checker workflow (opensearch-project#262) * Using standardized templates from .github (opensearch-project#249) * Adding new OpenSearch versions and updating compatibility matrix (opensearch-project#257) Signed-off-by: Miki <miki@amazon.com>
Description
Tests that run
docker-compose up
randomly fail with "no tests found error". Rerunning the failed jobs resolves the issue. This is a bug in Opensearch build that causes the container to intermittently fail. Restarting the container when encountering this issue is a workaround this PR is addressing.Issues Resolved
closes #268
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.