Resolve Flaky CI builds #980
Comments
Discussed with @joelhulen and @SimonCropp, and (probably tomorrow) I will:
100% in agreement to move away from AppVeyor in favor of GitHub Actions. We implemented AppVeyor before GitHub Actions was a thing. Plus, AppVeyor gave us free build minutes each month since this is an Open Source project. Since then, GitHub Actions has become a viable (and, in many ways, better) alternative. Moving to GH Actions will also open the door to implementing policies like code coverage, improving deployment options, etc. Thanks, Martin!
Remove the AppVeyor CI. Contributes to App-vNext#980.
@joelhulen I've deleted the AppVeyor config, but it's still trying to build things. I guess there's something you need to turn off somewhere in the AppVeyor website.
Add GitHubActionsTestLogger for builds in GitHub Actions. Contributes to App-vNext#980.
@martincostello I'll look into it now.
@martincostello I've deleted the Polly project from AppVeyor. As for GitHub, we want to create two deployments: GitHub releases and nuget.org. I can help set up the NuGet piece when you're ready.
As for the releases, those need to be gated so that an admin needs to approve the deployments.
Cheers Joel - I'm working on the CI now in #995, then I'll sort out xplat, then once that's done look at the publish/release flow.
The CI is now working correctly with respect to:
Next up I'll look at sorting out the Linux and macOS builds.
Great progress, Martin. Thanks!
@martintmk This still happens a little bit, but is mostly resolved. Are there any specific tests you're aware of that we need to de-flake?
@martincostello, the problem I see quite often is that the code coverage check fails for unknown reasons. We should investigate that.
I've seen that one in other projects; I think it's a coverlet issue.
We still get this a bit, but I think whatever is breaking it is external. Should we close or leave this open @martintmk?
Dunno, the situation is not ideal as there are a lot of build failures, especially in the https://github.com/App-vNext/Polly/actions/workflows/codeql-analysis.yml workflow. If we can stabilize it, it would improve things.
I get that in two of my projects too - the C# compiler seems to crash for some reason with no useful information (I've tried to get more detail from another repo and get nothing). Again, it's external so I don't think there's anything we can do to fix it ourselves.
We can leave it open, but I've removed the v8 milestones.
Can we do retries? Maybe try up to 3 times. It's especially annoying for the bot-approver; I think 2 or 3 automatic PRs failed because of this.
I don't know how to do that without having something external like a bot automatically retriggering them.
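For illustration only, here is a minimal sketch of the "try up to 3 times" idea, assuming the flaky step can be wrapped in a script rather than retried at the workflow level. `run_with_retries` is a hypothetical helper and not part of Polly's build, and the command in the usage comment is a placeholder. It would not help when the failure happens outside the wrapped step.

```python
import subprocess
import time


def run_with_retries(command, max_attempts=3, delay_seconds=0.0):
    """Run `command` until it exits with 0 or `max_attempts` is reached.

    Hypothetical helper for illustration. Returns a tuple of
    (exit_code_of_last_attempt, number_of_attempts_made).
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")

    exit_code = 1
    for attempt in range(1, max_attempts + 1):
        # subprocess.run waits for the command and reports its exit code.
        exit_code = subprocess.run(command).returncode
        if exit_code == 0:
            return 0, attempt
        if attempt < max_attempts:
            # Optional pause before the next attempt.
            time.sleep(delay_seconds)
    return exit_code, max_attempts


# Example usage (placeholder command, not taken from the actual workflow):
#   code, attempts = run_with_retries(["dotnet", "test"], max_attempts=3)
```

The trade-off is that blanket retries can hide genuinely flaky tests, so a wrapper like this is probably best limited to steps that fail for external reasons.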
This failure looks like flakiness we need to fix though: https://github.com/App-vNext/Polly/actions/runs/6325048079/job/17175769906?pr=1641#step:5:400
FYI I've logged an issue with Roslyn about the CodeQL failures we keep getting: dotnet/roslyn#70368
Since updating to .NET 8, we seem to be getting flaky tests related to telemetry for .NET Framework due to some concurrent state somewhere being updated: example
Might have to skip that test on .NET Framework - it doesn't seem to want to pass at all now 😓
This issue is stale because it has been open for 60 days with no activity. It will be automatically closed in 14 days if no further updates are made. |
Things seem to have been stable for a while now, so I'm going to close this. We can open a new one if it comes back. |
Our AppVeyor CI build has been flaky for a while now, which unnecessarily complicates making changes. Because AppVeyor's permissions are separate from those of the GitHub repository, things like retrying a build are difficult to do even with write access to the repo.
As a stop-gap measure, I added a Windows CI using GitHub Actions in #979 to give an alternative view of whether the project is building successfully.
The question is, what should we do to push forward and get CI stable again so that contributors can be confident in changes they submit?
Two possible approaches include:
Cards on the table, I'm in favour of item 2 because I use Actions exclusively for my own open source projects and at work, and I'm a big fan of having one CI/CD system, rather than multiples. It also has the benefit of allowing us to build and test on Linux, macOS and Windows in a single CI/CD system.
I am however happy to go with the majority position if that is to keep the status quo of AppVeyor.
If we decided to move to GitHub Actions, the following things are missing from the basic one I've set up already: