Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Run all tests in the nix-shell; eliminate docker infrastructure #379

Merged
merged 7 commits into from
Sep 7, 2023

Conversation

spencerkclark
Copy link
Member

@spencerkclark spencerkclark commented Aug 28, 2023

This PR refactors the build infrastructure in this repo to eliminate the need for the Docker component. All development and testing is now done in the nix shell. This should be a quality of life improvement for anyone developing the fortran model, as it no longer requires maintaining checksums in two separate build environments.

In so doing it introduces the following changes:

  • New make rules are provided for compiling the model in different modes:
    • build -- build executables in repro (our production mode) and debug mode.
    • build_repro -- build only the repro mode executable.
    • build_debug -- build only the debug mode executable.
  • Tests are run with each of the executables available in the local bin directory, and are tagged with the associated compile mode.
  • An option, check_layout_invariance, is provided to trigger regression tests be run with a 1x2 domain decomposition instead of a 1x1 domain decomposition to check invariance to the domain decomposition layout; this is used for the all the coarse-graining regression tests and replaces the previous test_run_reproduces_across_layouts test that would run in the docker image.
  • debug-mode and repro-mode simulations produce different answers, which is something we noticed in Bump base image to one with Ubuntu version 20.04 #364 when upgrading compiler versions as well, and so require different reference checksums.

In working on this PR, we ran the fortran model in debug mode in more contexts than we had previously, some of which turned up errors, which we currently work around by using pytest.skip (something we had implicitly already been doing before):

Working on this PR also brought my attention to the fact that pytest's tmpdir fixture does not automatically get cleaned up after each test; pytest versions older than 7.3.0 keep around directories from the last three runs of pytest, which fill up disk space quickly since running these tests requires creating 10's of run directories, each with their own initial conditions and input files (#380). For the time being I manually clean up these run directories after successful tests.

Resolves #340.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These changes are needed to fix debug-mode failures when running with the TKE-EDMF scheme active; they do not change answers. See discussion in #364 (comment) for more context.

@spencerkclark spencerkclark force-pushed the nix-debug-mode branch 4 times, most recently from 3f048ce to cb4185c Compare August 30, 2023 16:52
@spencerkclark spencerkclark marked this pull request as ready for review August 30, 2023 16:57
@spencerkclark
Copy link
Member Author

After this PR is reviewed I will update the required checks to:

  • Minimal fortran and wrapper tests in repro mode
  • Minimal fortran tests in debug mode

Makefile Show resolved Hide resolved
Makefile Show resolved Hide resolved
Copy link
Contributor

@brianhenn brianhenn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @spencerkclark things worked well for me. Happy to approve if you like or can wait for another set of eyes.

README.md Show resolved Hide resolved
tests/pytest/test_regression.py Show resolved Hide resolved
Copy link
Contributor

@brianhenn brianhenn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@spencerkclark spencerkclark merged commit adafc50 into master Sep 7, 2023
@spencerkclark spencerkclark deleted the nix-debug-mode branch September 7, 2023 18:56
spencerkclark added a commit that referenced this pull request Sep 7, 2023
This PR fixes errors in test skipping logic introduced in #379 (see [this CI run](https://app.circleci.com/pipelines/github/ai2cm/fv3gfs-fortran/2682/workflows/e71d0119-0c60-44c7-bf4f-d97f10ccbb0b/jobs/5541)).  In that PR these particular tests were configured only to be attempted to be run upon merges to master, because they would be skipped anyway.  

This PR also now ensures that we at least attempt to run the emulation tests in debug mode in CI when the developer requests to, since it will exercise the skipping logic before merging to master (upon which all pure fortran tests are attempted to be run in both repro and debug mode).

[This CI job](https://app.circleci.com/pipelines/github/ai2cm/fv3gfs-fortran/2688/workflows/0c3ca3a2-3b4d-46d6-8cca-a65e1b9a5a0f/jobs/5557) confirms that this PR worked as intended.
spencerkclark added a commit that referenced this pull request Oct 2, 2023
I noticed this when looking over this file earlier, and it would be good to correct for maximum test coverage. This was introduced in #379, so it has not been present for long.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Move coarse-graining tests out of docker build
2 participants