Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

test(itest): improve container startup health detection #1534

Merged
merged 9 commits into from
Jun 10, 2023

Conversation

andrewazores
Copy link
Member

Welcome to Cryostat! 👋

Before contributing, make sure you have:

  • Read the contributing guidelines
  • Linked a relevant issue which this PR resolves
  • Linked any other relevant issues, PR's, or documentation, if any
  • Resolved all conflicts, if any
  • Rebased your branch PR on top of the latest upstream main branch
  • Attached at least one of the following labels to the PR: [chore, ci, docs, feat, fix, test]
  • Signed the last commit: git commit --amend --signoff

Fixes: #1533

Description of the change:

This adds Podman health probes to the containers started for integration testing, and changes the wait-for-X executions to query podman's view of container health as a startup check.

Motivation for the change:

Rather than running a timeout command that forks subprocess shells executing curl to try to check the container availability externally, this ends up using a process run within the container to check for the service availability, which is reported to podman. From outside the containers we ask podman what the status is. This is less prone to getting into a bad state with curl trying to open a connection to the services in the containers when they are not yet ready to accept connections and causing curl to time out and therefore fail the startup check and the integration test run. This should make itest startup much more reliable.

How to manually test:

  1. bash repeated-integration-tests.bash N SomeTestsIT, where N is some large number and SomeTestsIT are a subset of tests run simply to ensure basic startup succeeded. bash repeated-integration-tests.bash 10 HealthIT for example.

@andrewazores andrewazores requested a review from ebaron as a code owner June 9, 2023 18:18
@andrewazores andrewazores requested review from aali309 and tthvo June 9, 2023 18:18
@github-actions github-actions bot added the needs-triage Needs thorough attention from code reviewers label Jun 9, 2023
@andrewazores andrewazores removed the needs-triage Needs thorough attention from code reviewers label Jun 9, 2023
@github-actions
Copy link
Contributor

github-actions bot commented Jun 9, 2023

Test image available:

$ CRYOSTAT_IMAGE=ghcr.io/cryostatio/cryostat:pr-1534-a64070ea2b968c391b3699964866cb9c02d71381 sh smoketest.sh

pom.xml Outdated Show resolved Hide resolved
pom.xml Outdated Show resolved Hide resolved
pom.xml Show resolved Hide resolved
tthvo
tthvo previously approved these changes Jun 9, 2023
@andrewazores
Copy link
Member Author

Seems like after one of the recent commits this actually breaks the CI runner entirely, it always seems to time out... unfortunately, not much in the way of logging.

@andrewazores
Copy link
Member Author

I wonder if maybe the CI runner's grep doesn't accept -E or something.

@tthvo
Copy link
Member

tthvo commented Jun 10, 2023

Ohh yeh just realized I just started the 7th attempt....What is version of grep on CI?

@tthvo
Copy link
Member

tthvo commented Jun 10, 2023

Actually, I am experiencing the same issue locally too.

Been stuck at [INFO] --- exec-maven-plugin:3.1.0:exec (wait-for-cryostat) @ cryostat --- for a while.

@tthvo
Copy link
Member

tthvo commented Jun 10, 2023

Maybe the regular expression here is invalid because the input is the while row including the container name, uptime and brackets?

$ podman ps --format '{{.Names}} {{.Status}}' | grep jfr-datasource | grep -E -i 'health'
jfr-datasource-itest Up About a minute (healthy)

$ podman ps --format '{{.Names}} {{.Status}}' | grep jfr-datasource | grep -E -i '^health'
$ echo $?
1

pom.xml Outdated Show resolved Hide resolved
@github-actions
Copy link
Contributor

Test image available:

$ CRYOSTAT_IMAGE=ghcr.io/cryostatio/cryostat:pr-1534-2343b4bd37c3ac739b4fe04788294d9486843dd2 sh smoketest.sh

Copy link
Member

@tthvo tthvo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me!

@andrewazores andrewazores merged commit 9401ad1 into cryostatio:main Jun 10, 2023
@andrewazores andrewazores deleted the itest-startup-health branch June 10, 2023 02:51
mergify bot pushed a commit that referenced this pull request Jun 13, 2023
andrewazores added a commit that referenced this pull request Jun 13, 2023
(cherry picked from commit 9401ad1)

Co-authored-by: Andrew Azores <aazores@redhat.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
No open projects
Status: Done
Status: Done
Development

Successfully merging this pull request may close these issues.

[Bug] integration tests sometimes time out waiting for containers to start
3 participants