Fix TestDockerStart flaky test #21681
Some changes are done to give more resilience to the test:
* Wait till image pull is finished, and retry in case of failure.
* Checked events are filtered by container id instead of image name, so tests are not affected by other containers that may be running in the system.
* Check timeout is for all events now, instead of being reset after an event is received.
* Container is removed after test is finished.
Pinging @elastic/integrations-platforms (Team:Platforms)
jenkins run the tests again please
lgtm. Just left a suggestion/question to evaluate. No blocking though.
		// Image already available, do nothing
		return nil
	}
	for retry := 0; retry < 3; retry++ {
Not sure whether we could make this configurable, for example by passing the number of retries as a function parameter. We could also add a sleep between retries, though I'm not sure this is needed at all, since I'm not familiar with the nature of the errors that may occur here.
Suggestion would look like:
imagePullWithRetry(image string, retries int, interval int)
Good suggestion, but I don't think this is needed for now, and in any case we wouldn't expose these settings in the public methods.
I would prefer to add them if we need them at some point.
👍
Some changes are done to give more resilience to the test:
* Wait till image pull is finished, and retry in case of failure.
* Checked events are filtered by container id instead of image name, so tests are not affected by other containers that may be running in the system.
* Check timeout is for all events now, instead of being reset after an event is received.
* Container is removed after test is finished.

(cherry picked from commit a79dddc)
* upstream/master:
  feat: package aliases for snapshots (elastic#21960)
  [DOC] Add firewall as possible troubleshooting issue (elastic#21743)
  [Filebeat] Add max_number_of_messages config parameter for S3 input (elastic#21993)
  [Elastic Agent] Fix missing elastic_agent event data (elastic#21994)
  Document auditbeat system process module config (elastic#21766)
  Update links (elastic#22012)
  dynamically find librpm (elastic#21936)
  Fix Istio docs (elastic#22019)
  [beats-tester][packaging] store packages in another location (elastic#21903)
  [Kubernetes] Remove redundant dockersock volume mount (elastic#22009)
  [Ingest Manager] Always try snapshot repo for agent upgrade (elastic#21951)
  Azure storage metricset values not inside the metricset name (elastic#21845)
  fix diskio and memory bugs under windows (elastic#21992)
  Fix TestDockerStart flaky test (elastic#21681)
  filebeat: add SSL options to checkpoint module (elastic#19560)
  Stop storing stateless kubernetes keystores (elastic#21880)
  [Elastic Agent] Fix named pipe communication on Windows 7 (elastic#21931)
  [Elastic Agent] Fix index for Agent monitoring to to elastic_agent. (elastic#21932)
Fixes #20360.