-
-
Notifications
You must be signed in to change notification settings - Fork 318
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Run CI on Windows with GIX_TEST_IGNORE_ARCHIVES=1
#1657
Run CI on Windows with GIX_TEST_IGNORE_ARCHIVES=1
#1657
Conversation
Some of these fail. We report the step as failing if any fail, which currently at least 14 and usually 15 are expected to (one of them is a performance test). If more than 15 fail, we'll fail this job, which will fail the workflow overall, but the job is still deliberately not treated as a required check for PR auto-merge. Right now, the expected number of failures is delibreately set too low, to 13, which is unlikely to be satisfied. This is just to test the new job. After verifying that it can trigger job failure when an excessive number of test case failures occurs, and fixing any readily apparent bugs in the job definition, this value will be raised, probably to 15. See GitoxideLabs#1358 for information on the known-failing tests on Windows with `GIX_TEST_IGNORE_ARCHIVES=1`.
- Check the specific step outcome (and specifically the outcome, not the conclusion which is turned into `success` by the value of `continue-on-error`). Without this, the last step never runs, so the test does not fail when it is supposed to. - Give the last step a fixed (literal) name, since the `env` key does not interpolate into its name like a `matrix` value would. Accordingly, move the `env` key into the step, since it is now only used inside its logic. - Line-buffer standard output for the `cargo nextest` command for more immedate (less staggered) output. This may not happen automatically since its stdout is a pipe. (I think its stderr is already either unbuffered or line-buffered even when not a terminal.)
This removes `stdbuf -oL`. I had added the `stdbuf -oL` along with some other changes, but it seems the most likely of any of them to be producing the numerous new failures reporting "error while loading shared libraries: C: cannot open shared object file" (treating `C:` as though it is itself the name of a DLL to load) such as: FAIL [ 0.028s] gix-filter::filter pipeline::convert_to_git::only_driver_means_streaming_is_possible --- STDOUT: gix-filter::filter pipeline::convert_to_git::only_driver_means_streaming_is_possible --- running 1 test test pipeline::convert_to_git::only_driver_means_streaming_is_possible ... FAILED failures: failures: pipeline::convert_to_git::only_driver_means_streaming_is_possible test result: FAILED. 0 passed; 1 failed; 0 ignored; 0 measured; 47 filtered out; finished in 0.02s --- STDERR: gix-filter::filter pipeline::convert_to_git::only_driver_means_streaming_is_possible --- failed to extract 'tests\fixtures\generated-archives\pipeline_repos.tar': Ignoring archive at 'tests\fixtures\generated-archives\pipeline_repos.tar' as GIX_TEST_IGNORE_ARCHIVES is set. thread 'pipeline::convert_to_git::only_driver_means_streaming_is_possible' panicked at tests\tools\src\lib.rs:567:17: fixture script of "bash" "D:\\a\\gitoxide\\gitoxide\\gix-filter\\tests\\fixtures\\pipeline_repos.sh" failed: stdout: stderr: 0 [main] bash 547 C:\Program Files\Git\usr\bin\bash.exe: *** fatal error - error while loading shared libraries: C: cannot open shared object file: No such file or directory note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace There were about 700 such failures, with a total of 719 failing tests. Let's see if no longer running the `cargo nextest` command through `stdbuf` makes them go away.
This eliminates the piping, tricky parsing, and special-casing to preserve colorization, and runs the tests and the XML parsing in the default `pwsh` shell (since this is a Windows job), using PowerShell facilities to parse the XML. This also checks that there are no *errors*, in addition to (still) checking that there are no more *failures* than expected. In the preceding commit, five additional tests, not currently noted in GitoxideLabs#1358, failed: FAIL [ 0.010s] gix-credentials::credentials program::from_custom_definition::empty FAIL [ 0.008s] gix-credentials::credentials program::from_custom_definition::name FAIL [ 0.010s] gix-credentials::credentials program::from_custom_definition::name_with_args FAIL [ 0.009s] gix-credentials::credentials program::from_custom_definition::name_with_special_args FAIL [ 0.014s] gix-discover::discover upwards::from_dir_with_dot_dot In addition, one test noted in GitoxideLabs#1358 does not always fail on CI, because it is a performance test and the CI runner is fast enough so that it usually passes: FAIL [ 181.270s] gix-ref-tests::refs packed::iter::performance Let's see if running the tests more similarly to the way they are run on Windows without `GIX_TEST_IGNORE_ARCHIVES`, i.e. without piping and with the Windows default of `pwsh` as the shell, affects any of those new/CI-specific failures.
This should let the `test-fixtures-windows` job succeed. As suspected (but not, as of now, explained), the five additional failures when running the tests in `bash` and piping the output went away with `pwsh` and no pipe. This brings the number of failures down to 14, with the possibility of 15 failing tests if the performance test fails (see discussion in GitoxideLabs#1358). However, while it fails when I run it locally, that performance test seems rarely if ever to fail on CI (anymore?). So let's try setting the maximum number of failing tests, above which the job will report failure, to 14 rather than 15. So that they can be investigated later, the tests that failed with `bash` and `|&` piping of stdout and stderr, when run on Windows on CI with `GIX_TEST_IGNORE_ARCHIVES=1` -- which are listed in the previous commit where the failures were corrected -- have as the most significant reported details as follows: --- STDERR: gix-credentials::credentials program::from_custom_definition::empty --- thread 'program::from_custom_definition::empty' panicked at gix-credentials\tests\program\from_custom_definition.rs:16:5: assertion `left == right` failed: not useful, but allowed, would have to be caught elsewhere left: "\"sh\" \"-c\" \"C:\\\\Program Files\\\\Git\\\\mingw64\\\\bin\\\\git.exe credential- \\\"$@\\\"\" \"--\" \"store\"" right: "\"C:\\Program Files\\Git\\mingw64\\bin\\git.exe\" \"credential-\" \"store\"" note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace --- STDERR: gix-credentials::credentials program::from_custom_definition::name --- thread 'program::from_custom_definition::name' panicked at gix-credentials\tests\program\from_custom_definition.rs:64:5: assertion `left == right` failed: we detect that this can run without shell, which is also more portable on windows left: "\"sh\" \"-c\" \"C:\\\\Program Files\\\\Git\\\\mingw64\\\\bin\\\\git.exe credential-name \\\"$@\\\"\" \"--\" \"store\"" right: "\"C:\\Program Files\\Git\\mingw64\\bin\\git.exe\" \"credential-name\" \"store\"" note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace --- STDERR: gix-credentials::credentials program::from_custom_definition::name_with_args --- thread 'program::from_custom_definition::name_with_args' panicked at gix-credentials\tests\program\from_custom_definition.rs:40:5: assertion `left == right` failed left: "\"sh\" \"-c\" \"C:\\\\Program Files\\\\Git\\\\mingw64\\\\bin\\\\git.exe credential-name --arg --bar=\\\"a b\\\" \\\"$@\\\"\" \"--\" \"store\"" right: "\"C:\\Program Files\\Git\\mingw64\\bin\\git.exe\" \"credential-name\" \"--arg\" \"--bar=a b\" \"store\"" note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace --- STDERR: gix-credentials::credentials program::from_custom_definition::name_with_special_args --- thread 'program::from_custom_definition::name_with_special_args' panicked at gix-credentials\tests\program\from_custom_definition.rs:52:5: assertion `left == right` failed left: "\"sh\" \"-c\" \"C:\\\\Program Files\\\\Git\\\\mingw64\\\\bin\\\\git.exe credential-name --arg --bar=~/folder/in/home \\\"$@\\\"\" \"--\" \"store\"" right: "\"sh\" \"-c\" \"C:\\Program Files\\Git\\mingw64\\bin\\git.exe credential-name --arg --bar=~/folder/in/home \\\"$@\\\"\" \"--\" \"store\"" note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace --- STDERR: gix-discover::discover upwards::from_dir_with_dot_dot --- thread 'upwards::from_dir_with_dot_dot' panicked at gix-discover\tests\discover\upwards\mod.rs:155:5: assertion `left == right` failed left: Reduced right: Full note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace To be clear, those failures do *not* occur in the preivous commit and are not expected to occur in this one. Those prior failures consist of four `gix-credentials` tests and one `gix-discover` test. (The `gix-discover` failure resembles a problem observed locally on a Windows Server 2022 system as an administrator with UAC disabled, reported in GitoxideLabs#1429, and suggests that the conditions of that failure may differ from, or be more complex than, what I had described there.)
39c3fe4
to
319e9d8
Compare
GIX_TEST_IGNORE_ARCHIVES=1
Thanks so much for setting this up!
The timing is indeed very promising as it could fit into the envelope we already have once caching kicks in. Then again, caching, or the lack thereof, should be the same for each job in a new PR, so I'd still think that the new job is 3-4 minutes slower than the longest-running job thus far. Here is the summary that I base this on. Maybe I am wrong about this, so I think it's OK to merge this PR as is and possibly, in a follow up, make it a required job like any other. I'd also think that it sends an email if the job fails, even though the workflow didn't fail, so it won't go unnoticed once I inevitably break something 😅. |
I think that workflow is considered to fail when this job fails, as with the other jobs in the workflow. The workflow status showed as failing in this situation when I tested it on my fork. (Some of this can be seen before the rebase; I did the rebase just after opening this PR instead of just before, in case it would be valuable to ever look at that.) The only way this job is distinguished from most others in the workflow is that it is not listed as a dependency of a required check that attests that all conceptually required checks have completed. But if I understand correctly this is separate from the status of the workflow run: if all the required jobs complete successfully, then the PR is eligible to be merged when auto-merge is enabled, regardless of the status of the workflow run some of whose jobs were required checks. |
Thanks for the clarification! It's true that it is failing the workflow if the dependent check fails, and this will mean an email will be sent. It will still auto-merge earlier as it's not part of the check, which is exactly what it should do. |
This adds a CI test job,
test-fixtures-windows
, that runs tests on Windows withGIX_TEST_IGNORE_ARCHIVES=1
, allows the job to succeed even when the test running step fails, but then, if it failed (as currently it always does), runs another step that checks that no tests reported errors and that the number of tests reporting failures does not exceed 14.Although there are currently 15 tests that regularly fail when run this way, as listed in #1358, the performance test does currently seem ever to fail on CI. Setting it to 14 carries a risk of failure even when nothing new is broken, but should shed light on whether it does ever fail on CI. (If not, then it might be that my test machine is inadequate, rather than an actual performance bug.) This also makes it less likely to miss a new failure, per #1358 (comment) and #1358 (comment).
The approach taken is to add a
with-xml
nextest profile and use it for thecargo nextest
step of the new job, then have the subsequent step use PowerShell to parse the generated XML as discussed in #1358 (comment) and #1358 (comment).This feature branch includes the intermediate approaches rather than squashing them away. I think the details about their effects and disadvantages may be of value in the future. There are also failures that only occur with the other approach I attempted first, presumably relating either to the differing environments of PowerShell and Git Bash or to pecularities arising from
|&
redirection. I'll look into this in more detail sometime fairly soon and I'm not requesting that you examine it, but full details are in the commit messages if you wish to.The main commit message of interest is 145ae49 (before the rebase, a5d3698), which describes the specific advantages of the XML approach over the ad-hoc parsing approach.
This may not be immediately ready, because:
test-fast
job forwindows-latest
, when tested on my fork. Here, it is slower, but I think that is related to caching and would go away once merged. The new job omits some of what is done in thetest-fast
job--for example, it does not run doctests--and that seems to roughly cancel out the added time to run the fixture scripts. In view of this, maybe it should be made a required check? On the other hand, that could slow some things down if we want to allow PRs to be quickly merged even if they introduce new failures on Windows whenGIX_TEST_IGNORE_ARCHIVES=1
.