Too many failing checks / implement smarter CI #3662
Comments
Thanks for the suggestion. I have had a variety of ideas like this in mind for a while, based on recommendations from other repositories. Adding labels to check for, and skipping the benchmark runs on documentation-only fixes, would be fine. The benchmarks job is the longest one and we don't need it to run everywhere, especially when we're not pushing changes to an external repository as we do with the cron job. I'm aware that SciPy's CI configuration has a lot of such things we can seek inspiration from.

There is indeed a lot of redundancy with things like repeated workflows and workflow files (the scheduled tests and the regular CI tests could be integrated into the same file, #3518 is still open, the benchmarks-related jobs could be spread across fewer files, etc.).
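To illustrate the documentation-only skip, here is a minimal sketch of a GitHub Actions trigger block, assuming a hypothetical benchmarks workflow; the `docs/**` and markup-file patterns are placeholders rather than PyBaMM's actual layout:

```yaml
# Hypothetical trigger for a benchmarks workflow: skip the run entirely
# when a pull request only touches documentation or markup files.
name: Benchmarks

on:
  pull_request:
    paths-ignore:
      - "docs/**"
      - "**/*.md"
      - "**/*.rst"

jobs:
  benchmarks:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Run benchmarks
        # placeholder for the actual benchmark command
        run: echo "run the benchmark suite here"
```

One caveat with this approach: if a workflow that can be skipped is also marked as required in branch protection, PRs that skip it can end up waiting on a check that never reports, so the required-checks list needs to be kept in sync.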
I understand the perils of skipping certain checks for certain PRs. But then, is it possible to ensure that checks don't fail without a valid reason? This would mean investigating each workflow and the ways in which it may be failing erroneously.
I suppose the usual practice is to run tests in such a way that they are as strict as they can be (fail on any deprecation warnings detected, and so on) and that they are not flaky (do not fail without reason). We regularly see benchmark failures due to the absence of fixed CI runners. One way to address this could be to do a statistical analysis of the past benchmark data available on https://pybamm.org/benchmarks and adjust the failure thresholds based on the distribution obtained, so that a spurious failure is seen only once or twice in a hundred pull request runs. The other way is to improve the benchmark times themselves by improving the solver. Both of these are hard to do, but one thing that can be done is to not fail the workflow and instead let it emit a "status required" message, i.e., not let it get marked with a ❌.

As for other workflows, most of them don't fail at random, except for the link checker, which sometimes does so on certain links (we have a lot of them throughout the source code).
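For the "don't mark the PR with a ❌" idea, one option in GitHub Actions is `continue-on-error`; a minimal sketch, assuming the benchmarks run as a single job (the step contents are placeholders):

```yaml
jobs:
  benchmarks:
    runs-on: ubuntu-latest
    # A failure inside this job does not fail the workflow run as a whole,
    # so the PR is not blocked, but the job's own log still records the failure.
    continue-on-error: true
    steps:
      - uses: actions/checkout@v4
      - name: Run benchmarks
        run: echo "run the benchmark suite here"  # placeholder
```

The trade-off is that genuine regressions become easier to overlook, so pairing this with a summary or PR comment that surfaces the benchmark result would help.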
Thanks @adityaraute for opening this; it would really save resources and potentially make development faster. I had also been thinking that we should [skip docs] unless there is a documentation-related change, and that the benchmarks can be skipped in most PRs. +1 for @agriyakhetarpal's suggestion. Can we maybe also add complexity and priority labels for this?
I agree, though let's keep this open for a while first for more discussion. For now, we have these scenarios:
We can create file patterns based on these points, though for some of them it might make sense to do so after #3641 is complete. Please let me know if there are more scenarios that should be covered.
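A sketch of how file patterns like these could gate jobs, using the third-party `dorny/paths-filter` action; the filter names and globs below are assumptions for illustration, not the repository's actual structure:

```yaml
jobs:
  changes:
    runs-on: ubuntu-latest
    outputs:
      src: ${{ steps.filter.outputs.src }}
      docs: ${{ steps.filter.outputs.docs }}
    steps:
      - uses: actions/checkout@v4
      - uses: dorny/paths-filter@v3
        id: filter
        with:
          filters: |
            src:
              - "pybamm/**"
              - "tests/**"
            docs:
              - "docs/**"

  unit-tests:
    needs: changes
    # run the test matrix only when source or test files have changed
    if: needs.changes.outputs.src == 'true'
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: echo "run pytest here"  # placeholder
```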
Might be helpful for detecting changes across a range of commits on a pull request:
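As a rough illustration of detecting changes across the whole range of commits on a PR (rather than only the most recent push), here is a sketch assuming GitHub Actions and a full-history checkout; the base and head SHAs come from the `pull_request` event payload:

```yaml
jobs:
  detect-changes:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with:
          # fetch the full history so the base and head commits are both available
          fetch-depth: 0
      - name: List files changed across the entire pull request
        run: |
          git diff --name-only \
            "${{ github.event.pull_request.base.sha }}" \
            "${{ github.event.pull_request.head.sha }}"
```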
@agriyakhetarpal Can we consider this resolved now? There have been a bunch of improvements.
Not for now, I would say – the recent improvements have been great (#4125, #4099), but we want to do much more on this in the coming months. @prady0t will be starting out with restructuring some parts of the workflows aside from the |
@agriyakhetarpal, should we put together a checklist of the tasks one would need to do, and a starting point?
I think so – the easiest thing to start would be to implement some patterns in the |
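One such pattern could be a label-based opt-out, so that maintainers can skip expensive jobs explicitly; a sketch in which the "skip benchmarks" label name is hypothetical:

```yaml
jobs:
  benchmarks:
    # skip this job whenever the PR carries the (hypothetical) "skip benchmarks" label
    if: ${{ !contains(github.event.pull_request.labels.*.name, 'skip benchmarks') }}
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: echo "run the benchmark suite here"  # placeholder
```

Since labels are applied by maintainers, this keeps the decision to skip a job visible and auditable in the PR timeline.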
A good place for anyone who wants to start with this is this resource: https://learn.scientific-python.org/development/guides/gha-basic/#advanced-usage |
Description
I was scrolling through the PRs and noticed that most of them fail some check or other, often the same ones.
I've attached a screenshot where a missing package led to a check failure, but there are several similar instances.
Motivation
If checks fail for reasons unrelated to the changes at hand, I wonder whether that makes the check or action itself irrelevant, at least to the majority of PRs.
Possible Implementation
Reduce the number of actions/checks attached to certain workflows, such that only the ones that are essential for a PR are run.
Additionally, investigate why certain actions fail repeatedly and rectify any issues in the core repo build.
Additional context
An example of an action running Ubuntu tests on PRs