Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Run CI on Windows with GIX_TEST_IGNORE_ARCHIVES=1 #1657

Merged
merged 7 commits into from
Nov 7, 2024

Conversation

EliahKagan
Copy link
Member

@EliahKagan EliahKagan commented Nov 7, 2024

This adds a CI test job, test-fixtures-windows, that runs tests on Windows with GIX_TEST_IGNORE_ARCHIVES=1, allows the job to succeed even when the test running step fails, but then, if it failed (as currently it always does), runs another step that checks that no tests reported errors and that the number of tests reporting failures does not exceed 14.

Although there are currently 15 tests that regularly fail when run this way, as listed in #1358, the performance test does currently seem ever to fail on CI. Setting it to 14 carries a risk of failure even when nothing new is broken, but should shed light on whether it does ever fail on CI. (If not, then it might be that my test machine is inadequate, rather than an actual performance bug.) This also makes it less likely to miss a new failure, per #1358 (comment) and #1358 (comment).

The approach taken is to add a with-xml nextest profile and use it for the cargo nextest step of the new job, then have the subsequent step use PowerShell to parse the generated XML as discussed in #1358 (comment) and #1358 (comment).

This feature branch includes the intermediate approaches rather than squashing them away. I think the details about their effects and disadvantages may be of value in the future. There are also failures that only occur with the other approach I attempted first, presumably relating either to the differing environments of PowerShell and Git Bash or to pecularities arising from |& redirection. I'll look into this in more detail sometime fairly soon and I'm not requesting that you examine it, but full details are in the commit messages if you wish to.

The main commit message of interest is 145ae49 (before the rebase, a5d3698), which describes the specific advantages of the XML approach over the ad-hoc parsing approach.

This may not be immediately ready, because:

  • The new job is deliberately not a required job. But we should make sure it passes in this PR before the PR is merged (even though it has already been run in my fork), since the purpose of this PR is to introduce the job and demonstrate that it passes under existing conditions. (In particular, I just rebased this branch.)
  • Relating to #1358 (comment), it may make sense to make the new job depend on an existing job, but as discussed there I don't know what the best way to do this is. Right now, no such dependency is expressed.
  • This job seems to take the same amount of time as the test-fast job for windows-latest, when tested on my fork. Here, it is slower, but I think that is related to caching and would go away once merged. The new job omits some of what is done in the test-fast job--for example, it does not run doctests--and that seems to roughly cancel out the added time to run the fixture scripts. In view of this, maybe it should be made a required check? On the other hand, that could slow some things down if we want to allow PRs to be quickly merged even if they introduce new failures on Windows when GIX_TEST_IGNORE_ARCHIVES=1.

Some of these fail. We report the step as failing if any fail,
which currently at least 14 and usually 15 are expected to (one of
them is a performance test). If more than 15 fail, we'll fail this
job, which will fail the workflow overall, but the job is still
deliberately not treated as a required check for PR auto-merge.

Right now, the expected number of failures is delibreately set too
low, to 13, which is unlikely to be satisfied. This is just to test
the new job. After verifying that it can trigger job failure when
an excessive number of test case failures occurs, and fixing any
readily apparent bugs in the job definition, this value will be
raised, probably to 15.

See GitoxideLabs#1358 for information on the known-failing tests on Windows
with `GIX_TEST_IGNORE_ARCHIVES=1`.
- Check the specific step outcome (and specifically the outcome,
  not the conclusion which is turned into `success` by the value of
  `continue-on-error`). Without this, the last step never runs, so
  the test does not fail when it is supposed to.

- Give the last step a fixed (literal) name, since the `env` key
  does not interpolate into its name like a `matrix` value would.
  Accordingly, move the `env` key into the step, since it is now
  only used inside its logic.

- Line-buffer standard output for the `cargo nextest` command for
  more immedate (less staggered) output. This may not happen
  automatically since its stdout is a pipe. (I think its stderr is
  already either unbuffered or line-buffered even when not a
  terminal.)
This removes `stdbuf -oL`.

I had added the `stdbuf -oL` along with some other changes, but it
seems the most likely of any of them to be producing the numerous
new failures reporting "error while loading shared libraries: C:
cannot open shared object file" (treating `C:` as though it is
itself the name of a DLL to load) such as:

	    FAIL [   0.028s] gix-filter::filter pipeline::convert_to_git::only_driver_means_streaming_is_possible
    --- STDOUT:              gix-filter::filter pipeline::convert_to_git::only_driver_means_streaming_is_possible ---
    running 1 test
    test pipeline::convert_to_git::only_driver_means_streaming_is_possible ... FAILED
    failures:
    failures:
	pipeline::convert_to_git::only_driver_means_streaming_is_possible
    test result: FAILED. 0 passed; 1 failed; 0 ignored; 0 measured; 47 filtered out; finished in 0.02s

    --- STDERR:              gix-filter::filter pipeline::convert_to_git::only_driver_means_streaming_is_possible ---
    failed to extract 'tests\fixtures\generated-archives\pipeline_repos.tar': Ignoring archive at 'tests\fixtures\generated-archives\pipeline_repos.tar' as GIX_TEST_IGNORE_ARCHIVES is set.
    thread 'pipeline::convert_to_git::only_driver_means_streaming_is_possible' panicked at tests\tools\src\lib.rs:567:17:
    fixture script of "bash" "D:\\a\\gitoxide\\gitoxide\\gix-filter\\tests\\fixtures\\pipeline_repos.sh" failed: stdout:
    stderr:       0 [main] bash 547 C:\Program Files\Git\usr\bin\bash.exe: *** fatal error - error while loading shared libraries: C: cannot open shared object file: No such file or directory
    note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace

There were about 700 such failures, with a total of 719 failing
tests. Let's see if no longer running the `cargo nextest` command
through `stdbuf` makes them go away.
This eliminates the piping, tricky parsing, and special-casing to
preserve colorization, and runs the tests and the XML parsing in
the default `pwsh` shell (since this is a Windows job), using
PowerShell facilities to parse the XML. This also checks that there
are no *errors*, in addition to (still) checking that there are no
more *failures* than expected.

In the preceding commit, five additional tests, not currently noted
in GitoxideLabs#1358, failed:

    FAIL [   0.010s] gix-credentials::credentials program::from_custom_definition::empty
    FAIL [   0.008s] gix-credentials::credentials program::from_custom_definition::name
    FAIL [   0.010s] gix-credentials::credentials program::from_custom_definition::name_with_args
    FAIL [   0.009s] gix-credentials::credentials program::from_custom_definition::name_with_special_args
    FAIL [   0.014s] gix-discover::discover upwards::from_dir_with_dot_dot

In addition, one test noted in GitoxideLabs#1358 does not always fail on CI,
because it is a performance test and the CI runner is fast enough
so that it usually passes:

    FAIL [ 181.270s] gix-ref-tests::refs packed::iter::performance

Let's see if running the tests more similarly to the way they are
run on Windows without `GIX_TEST_IGNORE_ARCHIVES`, i.e. without
piping and with the Windows default of `pwsh` as the shell, affects
any of those new/CI-specific failures.
This should let the `test-fixtures-windows` job succeed.

As suspected (but not, as of now, explained), the five additional
failures when running the tests in `bash` and piping the output
went away with `pwsh` and no pipe.

This brings the number of failures down to 14, with the possibility
of 15 failing tests if the performance test fails (see discussion
in GitoxideLabs#1358). However, while it fails when I run it locally, that
performance test seems rarely if ever to fail on CI (anymore?).
So let's try setting the maximum number of failing tests, above
which the job will report failure, to 14 rather than 15.

So that they can be investigated later, the tests that failed with
`bash` and `|&` piping of stdout and stderr, when run on Windows on
CI with `GIX_TEST_IGNORE_ARCHIVES=1` -- which are listed in the
previous commit where the failures were corrected -- have as the
most significant reported details as follows:

    --- STDERR:              gix-credentials::credentials program::from_custom_definition::empty ---
    thread 'program::from_custom_definition::empty' panicked at gix-credentials\tests\program\from_custom_definition.rs:16:5:
    assertion `left == right` failed: not useful, but allowed, would have to be caught elsewhere
      left: "\"sh\" \"-c\" \"C:\\\\Program Files\\\\Git\\\\mingw64\\\\bin\\\\git.exe credential- \\\"$@\\\"\" \"--\" \"store\""
     right: "\"C:\\Program Files\\Git\\mingw64\\bin\\git.exe\" \"credential-\" \"store\""
    note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace

    --- STDERR:              gix-credentials::credentials program::from_custom_definition::name ---
    thread 'program::from_custom_definition::name' panicked at gix-credentials\tests\program\from_custom_definition.rs:64:5:
    assertion `left == right` failed: we detect that this can run without shell, which is also more portable on windows
      left: "\"sh\" \"-c\" \"C:\\\\Program Files\\\\Git\\\\mingw64\\\\bin\\\\git.exe credential-name \\\"$@\\\"\" \"--\" \"store\""
     right: "\"C:\\Program Files\\Git\\mingw64\\bin\\git.exe\" \"credential-name\" \"store\""
    note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace

    --- STDERR:              gix-credentials::credentials program::from_custom_definition::name_with_args ---
    thread 'program::from_custom_definition::name_with_args' panicked at gix-credentials\tests\program\from_custom_definition.rs:40:5:
    assertion `left == right` failed
      left: "\"sh\" \"-c\" \"C:\\\\Program Files\\\\Git\\\\mingw64\\\\bin\\\\git.exe credential-name --arg --bar=\\\"a b\\\" \\\"$@\\\"\" \"--\" \"store\""
     right: "\"C:\\Program Files\\Git\\mingw64\\bin\\git.exe\" \"credential-name\" \"--arg\" \"--bar=a b\" \"store\""
    note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace

    --- STDERR:              gix-credentials::credentials program::from_custom_definition::name_with_special_args ---
    thread 'program::from_custom_definition::name_with_special_args' panicked at gix-credentials\tests\program\from_custom_definition.rs:52:5:
    assertion `left == right` failed
      left: "\"sh\" \"-c\" \"C:\\\\Program Files\\\\Git\\\\mingw64\\\\bin\\\\git.exe credential-name --arg --bar=~/folder/in/home \\\"$@\\\"\" \"--\" \"store\""
     right: "\"sh\" \"-c\" \"C:\\Program Files\\Git\\mingw64\\bin\\git.exe credential-name --arg --bar=~/folder/in/home \\\"$@\\\"\" \"--\" \"store\""
    note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace

    --- STDERR:              gix-discover::discover upwards::from_dir_with_dot_dot ---
    thread 'upwards::from_dir_with_dot_dot' panicked at gix-discover\tests\discover\upwards\mod.rs:155:5:
    assertion `left == right` failed
      left: Reduced
     right: Full
    note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace

To be clear, those failures do *not* occur in the preivous commit
and are not expected to occur in this one. Those prior failures
consist of four `gix-credentials` tests and one `gix-discover`
test. (The `gix-discover` failure resembles a problem observed
locally on a Windows Server 2022 system as an administrator with
UAC disabled, reported in GitoxideLabs#1429, and suggests that the conditions
of that failure may differ from, or be more complex than, what I
had described there.)
@EliahKagan EliahKagan force-pushed the run-ci/test-fixtures-windows branch from 39c3fe4 to 319e9d8 Compare November 7, 2024 08:37
@EliahKagan EliahKagan marked this pull request as ready for review November 7, 2024 08:41
@EliahKagan EliahKagan changed the title Run ci/test fixtures windows Run tests on Windows on CI with GIX_TEST_IGNORE_ARCHIVES=1 Nov 7, 2024
@EliahKagan EliahKagan changed the title Run tests on Windows on CI with GIX_TEST_IGNORE_ARCHIVES=1 Run CI on Windows with GIX_TEST_IGNORE_ARCHIVES=1 Nov 7, 2024
@EliahKagan EliahKagan changed the title Run CI on Windows with GIX_TEST_IGNORE_ARCHIVES=1 Run CI on Windows with GIX_TEST_IGNORE_ARCHIVES=1 Nov 7, 2024
@Byron
Copy link
Member

Byron commented Nov 7, 2024

Thanks so much for setting this up!

[ ] This job seems to take the same amount of time as the test-fast job for windows-latest, when tested on my fork. Here, it is slower, but I think that is related to caching and would go away once merged. The new job omits some of what is done in the test-fast job--for example, it does not run doctests--and that seems to roughly cancel out the added time to run the fixture scripts. In view of this, maybe it should be made a required check? On the other hand, that could slow some things down if we want to allow PRs to be quickly merged even if they introduce new failures on Windows when GIX_TEST_IGNORE_ARCHIVES=1.

The timing is indeed very promising as it could fit into the envelope we already have once caching kicks in. Then again, caching, or the lack thereof, should be the same for each job in a new PR, so I'd still think that the new job is 3-4 minutes slower than the longest-running job thus far. Here is the summary that I base this on.

Maybe I am wrong about this, so I think it's OK to merge this PR as is and possibly, in a follow up, make it a required job like any other.

I'd also think that it sends an email if the job fails, even though the workflow didn't fail, so it won't go unnoticed once I inevitably break something 😅.

@Byron Byron merged commit f1fcc19 into GitoxideLabs:main Nov 7, 2024
17 checks passed
@EliahKagan
Copy link
Member Author

I'd also think that it sends an email if the job fails, even though the workflow didn't fail

I think that workflow is considered to fail when this job fails, as with the other jobs in the workflow. The workflow status showed as failing in this situation when I tested it on my fork. (Some of this can be seen before the rebase; I did the rebase just after opening this PR instead of just before, in case it would be valuable to ever look at that.)

The only way this job is distinguished from most others in the workflow is that it is not listed as a dependency of a required check that attests that all conceptually required checks have completed. But if I understand correctly this is separate from the status of the workflow run: if all the required jobs complete successfully, then the PR is eligible to be merged when auto-merge is enabled, regardless of the status of the workflow run some of whose jobs were required checks.

@EliahKagan EliahKagan deleted the run-ci/test-fixtures-windows branch November 7, 2024 09:22
@Byron
Copy link
Member

Byron commented Nov 7, 2024

Thanks for the clarification! It's true that it is failing the workflow if the dependent check fails, and this will mean an email will be sent. It will still auto-merge earlier as it's not part of the check, which is exactly what it should do.
I think I got it now 😁.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants