Frequent test failures caused by image download timeout #12452

Closed
3 tasks done
Garett-MacGowan opened this issue Jan 4, 2024 · 6 comments · Fixed by #12454
Labels: area/build, github_actions, P2, type/bug, type/dependencies, type/regression

Comments

@Garett-MacGowan
Contributor

Garett-MacGowan commented Jan 4, 2024

Pre-requisites

  • I have double-checked my configuration
  • I can confirm the issue exists when I tested with :latest
  • I'd like to contribute the fix myself (see contributing guide)

What happened/what did you expect to happen?

This is a development-specific issue. When the CI runs, tests fail frequently (though intermittently) because the argoexec image download times out.

Here's an example failed workflow

This issue seems to be common & is being experienced by multiple users of the v4 action: actions/download-artifact#249

Potential Solutions

  1. Wrap actions/download-artifact@v4 in a retry action.
  2. Downgrade to actions/download-artifact@v3
  3. Wait for a bug fix for actions/download-artifact@v4, tracked in actions/download-artifact#249 ("[bug] Unable to download artifact(s): Unable to download and extract artifact: Request timeout").
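
Option 1 could be approximated without a third-party retry action by repeating the download step. The fragment below is only a sketch: the step id, artifact name, and path are hypothetical placeholders, not values taken from the Argo Workflows CI configuration.

```yaml
# Fragment of a job's step list (sketch; names and paths are assumptions).
steps:
  - name: Download argoexec image (first attempt)
    id: download                      # hypothetical step id
    uses: actions/download-artifact@v4
    continue-on-error: true           # do not fail the job if this attempt times out
    with:
      name: argoexec                  # assumed artifact name
      path: /tmp                      # assumed download directory

  - name: Download argoexec image (second attempt)
    # outcome reflects the step result before continue-on-error is applied
    if: steps.download.outcome == 'failure'
    uses: actions/download-artifact@v4
    with:
      name: argoexec
      path: /tmp
```

This relies only on `continue-on-error` and the `steps.<id>.outcome` context, both built into GitHub Actions, so the job fails only if the second attempt also fails.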

Version

latest

Paste a small workflow that reproduces the issue. We must be able to run the workflow; don't enter a workflow that uses private images.

N/A

Logs from the workflow controller

N/A

Logs from your workflow's wait container

N/A
Garett-MacGowan changed the title from "Frequent Test Failures Caused by Image Download Timeout" to "Frequent test failures caused by image download timeout" on Jan 4, 2024
@Garett-MacGowan
Contributor Author

If someone provides some feedback here on the preferred solution, I can quickly implement & push.

@Garett-MacGowan
Contributor Author

@juliev0 FYI I am observing a failure rate much worse than ~1/3 at the moment.

agilgur5 added the area/build and P2 labels on Jan 4, 2024
@agilgur5
Contributor

agilgur5 commented Jan 4, 2024

Nice catch!

I would suggest we revert #12384 as it seems like most folks in that issue are doing. We'll want to add an ignore pattern to dependabot.yml for that too so that it doesn't get auto-updated.

Re: retry, we may want to add that anyway, as timeouts still occasionally happen even on v3 (though ideally the action itself would have code to retry...). They do seem to be happening far more often with actions/download-artifact@v4, though. So the revert would be the top priority to fix this, and a retry would be a nice-to-have but not necessary, IMO.

agilgur5 added the type/regression label on Jan 4, 2024
@Garett-MacGowan
Contributor Author

Looks like the download action already implements a retry, so maybe something else is messed up with the new v4.

I think I'll just revert changes from #12384 and open a new ticket to upgrade to v4 once actions/download-artifact#249 is resolved.
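
A revert along these lines would pin the artifact actions back to v3. The fragment below is a sketch, not the actual diff from #12384: the job names, artifact name, and paths are assumptions. Upload and download are pinned together because, as far as the actions' own documentation describes, artifacts created with v3 are not compatible with v4.

```yaml
# Sketch of a workflow pinned to the v3 artifact actions.
# Job names, artifact name, and paths are hypothetical.
jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - name: Upload argoexec image
        uses: actions/upload-artifact@v3
        with:
          name: argoexec
          path: /tmp/argoexec_image.tar

  test:
    needs: build
    runs-on: ubuntu-latest
    steps:
      - name: Download argoexec image
        uses: actions/download-artifact@v3
        with:
          name: argoexec
          path: /tmp
```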

@agilgur5
Contributor

agilgur5 commented Jan 4, 2024

Sounds good. We will still want to add the dependabot.yml ignore temporarily until v4 is fixed
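
A temporary ignore rule of this kind might look like the following sketch. The `ignore` and `dependency-name` keys are standard Dependabot configuration; the directory and schedule interval are assumptions for illustration, not the repository's actual settings.

```yaml
# .github/dependabot.yml (sketch; directory and schedule are assumptions)
version: 2
updates:
  - package-ecosystem: "github-actions"
    directory: "/"
    schedule:
      interval: "weekly"
    ignore:
      # Temporary: remove once actions/download-artifact#249 is resolved.
      - dependency-name: "actions/download-artifact"
      - dependency-name: "actions/upload-artifact"
```

Listing both artifact actions matches the commit message referenced in this thread, which mentions adding download-artifact and upload-artifact to the ignore list.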

agilgur5 added the github_actions and type/dependencies labels on Jan 4, 2024
@Garett-MacGowan
Contributor Author

> Sounds good. We will still want to add the dependabot.yml ignore temporarily until v4 is fixed

Yep, I've got that in. Will push shortly.

Garett-MacGowan added a commit to GarettSoftware/argo-workflows that referenced this issue Jan 4, 2024
…v3 to fix artifact download timeout issues. Add both download-artifact and upload-artifact to dependabot ignore dependency. Fixes argoproj#12452.

Signed-off-by: Garett MacGowan <garettsoftware@gmail.com>