-
-
Notifications
You must be signed in to change notification settings - Fork 318
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Make bash script shebangs more portable #1361
Conversation
Rather than making things more portable, it turns out that, at least in the absence of other changes, this breaks Windows, and possibly macOS! On Windows, compilation still succeeds as expected, but then (or when rerun):
On macOS, at least on CI, in
The macOS failure is more surprising to me, and I think it should be investigated and an issue created for it, unless it is a one-off or easily fixed. I may update this PR description and/or close this PR sometime soon. I don't know of a simple way to fix the problems here. |
Thanks a lot for bringing this up! I will be sure to test it on MacOS as proposed once it's changed to non-draft. |
I rebased this onto main and made the same shebang changes in the three new fixture scripts:
The first of those was also missing an end-of-file newline, which I allowed my editor to add. I amended with that so it's all in one commit and force-pushed. The tests still fail, but the way they fail now reveals the problem. Tests on macOS no longer fail, or perhaps that was a one-off. The only failures are on Windows, and there are 15 of them. Viewing the summaries side-by-side shows that they are the same tests that I found in #1358 to happen on Windows with
That makes sense, if it runs slower on my machine than on CI. This is also what one would expect from the other tests' timings. I therefore believe that, at this point, changing the shebangs from Of course, this PR should not be merged without fixing CI. Because we are not currently using LFS, churn in generated archives might be considered undesirable. I believe doing so here will entail regenerating all the archives, since all Inconsistency between a primary strategy and its backup strategy is not inherently bad, since the whole point is that they must be different. However, in this case, I believe the backup strategy is intended only to cover the case where the script is not executable, either due to permissions or because the operating system does not support executing shell scripts directly. Because the error conditions that cause it to be tried are narrow, I believe it is also only, or almost only, attempted in those cases. So it seems to me that there is little or no benefit in attempting the hard-coded absolute interpreter path I admit, however, that broadening the error condition could be an alternative approach. Or both could be done, even separately. If the archives are to be regenerated and the results committed, then I am unsure who should do this, what kind of system it should be done on and what version of |
I really appreciate your research, it's making tech-debt obvious that thus far I was conveniently able to ignore. With the abandonment of As for the system to use for regeneration, that's a great question. My only thought on this is that it should be one that supports symlinks properly, a any Unix-y system should do. I also think you should try it - maybe there are issues around that which could be fixed along with the work done here. I am very much looking forward to merging this, and thank you again for your fantastic work 🙏. (specifically, I love how you don't jump right into the code, but strive for a full understanding of the supporting infrastructure and the functioning of the CLIs first. The bugs and security issues you find on the way super-valuable outcomes to me, and I think it's a rare talent.) |
Changing the scripts' shebangs caused their CRC32 checksums, which the test suite uses in deciding whether to use a pre-generated archive instead of running them, to differ. This was done by running `cargo nextest run --all --no-fail-fast` on an Ubuntu 22.04 LTS system with Git 2.45.2 installed using the package from https://launchpad.net/~git-core/+archive/ubuntu/ppa. This resembles the ubuntu-latest CI image but is not identical to it. All tests passed on that system, as expected. This commits the modified (i.e. regenerated) archives.
I regenerated archives by running the tests without setting As for the system I did it on, it was an Ubuntu 22.04 LTS system with the currently latest stable version of Git, 2.45.2, installed on it via the git-core PPA. This is similar, though not the same as, what the ubuntu-latest images on GitHub Actions currently provide. After doing this, I tested with and without that on that same Ubuntu system, and on Windows, both in the usual way where generated archives are used, and with See the notes and further details in https://gist.github.com/EliahKagan/e83322aba8687589df874943ad203e9f, or whatever parts may be of interest, for further information. I would also be pleased to expand the information here in this PR on request. I think this is ready now for any manual testing that ought to be done on macOS. I could eventually do that, but I don't have regular interactive access to a macOS system and I can't do it right now. It would make a lot of sense to test this on FreeBSD, which was the proximal motivation for this, even though not really the most important reason for this change (which should improve portability and compatibility broadly, in a way that is not specific to FreeBSD). However, I don't have the ability to run my FreeBSD system at the moment. I should be able to do that in the next couple of days. I am unsure if that should be regarded as a blocker. |
I have been able to run the tests manually on macOS 14. I have updated the big gist with details of that, and also expanded it in some other ways and made it clearer and much easier to navigate (including, I hope, making it easy to find whatever, and however much or little, one wishes to examine). A couple of archives are generated and shown as modified when running the tests even without setting The gist, including the expanded readme, has detailed information about this. Of the three archives that had this issue before, only one does. But on macOS (only), one new archive has this issue. It strongly appears to be an archive that never used outside of macOS, thus it was not regenerated when I ran the tests on Ubuntu to regenerate the archives. I will verify that this effect on macOS is reproducible and, assuming it is, I'll try adding another commit that updates that archive. Please note that the linked gist does not yet have information about such an update, which as of now I have not yet done, and I may not have time to reflect it fully in the gist immediately after doing it. Edit: I've done that, and added the logs for it to the gist, though not yet documented them there. In the gist, those logs are those whose names begin |
As a further update, I have now:
In hindsight, the big gist, due to growing to so many files, ought to have just been a regular repository, for easier navigation. However, I think it is usable, since the readme links to each of the files with a description. I may manage to upload a non-gist version as well (the hyperlinks have to be written differently for that), and if I do I'll add a link here, but I wanted to give this updated information now. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks a lot!
And thanks so much for sharing the Gist, I feel I should use them more for keeping track of things if I ever get to doing work that diligently! Admittedly, I just rely on CI and usually don't bother to test on other systems (with Windows being the exception, and Debian ever so rarely as well). This is also why your work is so valuable to me - it's finding issues I'd never have found while slowly chipping away on this huge codebase to understand it in depth.
A couple of archives are generated and shown as modified when running the tests even without setting
GIX_TEST_IGNORE_ARCHIVES
. This was of course the case after the shebangs were edited in the scripts, but I mean that it is the case for a small number of them otherwise, both on the main branch from before any of these changes, and on the feature branch after the archives are regenerated.
If archives that have been considered fresh by one system are regenerated on another, that would mean there was a hash-mismatch which shouldn't be possible. I think the algorithm is roughly as follows:
- hash the script
- see if the corresponding folder is available and not in an error state.
- if it is not in an error state and exists, it is used
- if it is in an error state, it will be cleared and regenerated
- if it does not exist it will be regenerated
- regeneration will be from an archive if that archive matches the hash of the script
- if it does not match the hash of the script, or there is no archive, the script will be run. If the run fails, the generated folder is in an error state so it won't be used.
- if the archive is name is not excluded in
.gitignore
, it will be created from the newly generated folder, possibly overwriting an existing archive.
Maybe somewhere in there is the possibility for this to happen, it's probably easier to look at the code if you think it might indeed have regenerated archives that it shouldn't have.
As shown, the first clean rerun of the tests after committing the changes to the archive
gix-discover/tests/fixtures/generated-archives/make_exfat_repo_darwin.tar.xz
had a failure in thegix-worktree-state-tests::worktree
teststate::checkout::delayed_driver_process
, which I believe to be unrelated. It looked to me like this was due to random events, and I did three more clean reruns after that, all of which passed. Since all CI checks also pass, we might be in good shape.
These tests are special as they build a binary using cargo, which takes time and is somewhat prone to interruptions or issues related to the spawned process. The reason it worked fine after is probably due to the cargo
process not actually doing anything once it realized the binary is fresh. However, as failures like these could contribute to CI-flakiness, it's something to keep an eye on.
From my experience, CI now is rock-solid and is fully deterministic even when run with extreme parallelism. But apparently, this isn't the case under all conditions or on all platforms (yet).
I often don't make them, and sometimes I just use them as a nicer version of a pastebin. In this case I should've made it a full-fledged repository, which I think is better when there are big files and more than a few files. I've gone ahead and made that version (basically just internal hyperlinks are different), in case anything still makes sense to look at or in case something has to be investigated in the future.
Are such binaries cached and reused even if, between test runs, |
Yes, they are re-used and stored in |
Shouldn't the |
Oh, yes, sorry, I didn't read carefully enough. Indeed, then |
These are the changed archives generated by fixture scripts by running `cargo nextest run --all --no-fail-fast` on an Ubuntu 22.04 system. (All tests passed, as expected.) The changed archives divide into two cases. First, there are those that are changed due to the stylistic changes to heredocs in the preceding commit 2641f8b (which change the CRC32 hashes of the scripts and thus cause archives to be regenerated): * gix-index/tests/fixtures/generated-archives/v2_icase_name_clashes.tar * gix-worktree-state/tests/fixtures/generated-archives/make_dangling_symlink.tar * gix/tests/fixtures/generated-archives/make_rev_spec_parse_repos.tar Second, there are those that were already being generated when the tests were run, rather than using the committed archives: * gix-diff/tests/fixtures/generated-archives/make_diff_repo.tar * gix-odb/tests/fixtures/generated-archives/make_repo_multi_index.tar These were being generated every time that `nextest` command was run on Unix-like systems (or at least GNU/Linux and macOS). I don't know why this was happening, but it suggests a bug somewhere. - That oddity precedes 55c635a (GitoxideLabs#1425), I just didn't commit the two archives not related to the changes being made there at that time, since unlike here, that was not a cleanup PR. - It precedes, or mostly precedes, the change in dcab79a (GitoxideLabs#1415). At least the first of those archives, make_diff_repo.tar, already behaved this way in its .tar.xz form before that, as noted in: GitoxideLabs#1361 (comment)
These are the changed archives generated by fixture scripts by running `cargo nextest run --all --no-fail-fast` on an Ubuntu 22.04 system. (All tests passed, as expected.) The changed archives divide into two cases. First, there are those that are changed due to the stylistic changes to heredocs in the preceding commit 2641f8b (which change the CRC32 hashes of the scripts and thus cause archives to be regenerated): * gix-index/tests/fixtures/generated-archives/v2_icase_name_clashes.tar * gix-worktree-state/tests/fixtures/generated-archives/make_dangling_symlink.tar * gix/tests/fixtures/generated-archives/make_rev_spec_parse_repos.tar Second, there are those that were already being generated when the tests were run, rather than using the committed archives: * gix-diff/tests/fixtures/generated-archives/make_diff_repo.tar * gix-odb/tests/fixtures/generated-archives/make_repo_multi_index.tar These were being generated every time that `nextest` command was run on Unix-like systems (or at least GNU/Linux and macOS). I don't know why this was happening, but it suggests a bug somewhere. - That oddity precedes 55c635a (GitoxideLabs#1425), I just didn't commit the two archives not related to the changes being made there at that time, since unlike here, that was not a cleanup PR. - It precedes, or mostly precedes, the change in dcab79a (GitoxideLabs#1415). At least the first of those archives, make_diff_repo.tar, already behaved this way in its .tar.xz form before that, as noted in: GitoxideLabs#1361 (comment)
Summary
This pull request changes
#!/bin/bash
shebangs to#!/usr/bin/env bash
, which is in practice more portable, as systems wherebash
is in$PATH
but not in/bin
are more common than systems whereenv
is not in/usr/bin
(the latter being mostly theoretical, though not forbidden by POSIX). This solves a practical problem on FreeBSD, and I suspect it will solve problems on some other systems in practical use that I have not yet tested on.Details
Fixture scripts and other shell scripts in the project require a Bash shell. Not all operating systems come with Bash, but it can be installed without too much trouble on most. However, once installed, it will not necessarily be accessible using the path
/bin/bash
. This actually works on Windows, at least in the Git Bash environment (see #1359). It also works in GNU/Linux systems, wherebash
is found in/bin
either directly or, on some newer systems, by/bin
being a symlink to/usr/bin
. It works on macOS, because the version of Bash that ships with the system, while old (< 4), is available in/bin
and the scripts in this project are compatible with it.In contrast, currently and prior to the change in this PR, running the tests on FreeBSD 14.0-RELEASE-p6 (amd64) produces 259 failures.
With the changes from this PR, most of those failures go away, with only 14 failures remaining.
Caveats
I recommend that this only be merged if CI is running all tests without using generated archives, on at least one operating system. This is intended to be the case on Ubuntu, and my guess is that it is the case, but I am not certain. See #1360.
If requested, I can rerun tests on Ubuntu and Windows to test this PR, including with
GIX_TEST_IGNORE_ARCHIVES=1
. However, the operating system on which the tests should probably really be run withoutGIX_TEST_IGNORE_ARCHIVES=1
to check the effect of these changes is macOS. The reason is that many developers using macOS have a later version of Bash installed elsewhere than/bin
and preceding it in$PATH
, such as viabrew
. With this change, the scripts would then run with that different interpreter on such systems. (The GitHub Actions runners for macOS don't seem to have a later version installed, but that should be feasible to achieve if needed for testing.)That should not cause a problem, since after all the scripts run with various versions on developers' GNU/Linux systems and in the
ubuntu-latest
jobs on CI, and also because only rather strange bugs would cause scripts that work on an older version of Bash to malfunction on a newer version. Therefore, I plan to mark this as ready to review following a bit more testing on Ubuntu and Windows. But I wanted to mention it just in case I am underestimating its significance.There are also stranger possible problems that can occur due to switching from
#!/bin/bin
to#!/usr/bin/env bash
. See gitpython-developers/GitPython@729372f. To the best of my knowledge, no such problems apply here (makefiles are not being used, for example), but I'll check for them before marking this as non-draft.