Windows CI is broken #6033

Open
BenWiederhake opened this issue Feb 29, 2024 · 10 comments

Comments

@BenWiederhake
Collaborator

All Windows CI runs are red.

Looks like we accidentally found a bug in Windows BFD, because it says BFD (GNU Binutils) 2.39 assertion fail:

2024-02-27T23:00:03.8935508Z    Compiling hex-literal v0.4.1
2024-02-27T23:00:03.9829797Z    Compiling unindent v0.2.1
2024-02-27T23:00:20.1646720Z error: linking with `x86_64-w64-mingw32-gcc` failed: exit code: 1
2024-02-27T23:00:20.1648606Z   |
2024-02-27T23:00:20.1921863Z   = note: "x86_64-w64-mingw32-gcc" "-fno-use-linker-plugin" "-Wl,--dynamicbase" "-Wl,--disable-auto-image-base" "-m64" …INCREDIBLY MANY OPTIONS… "-o" "D:\\a\\coreutils\\coreutils\\target\\debug\\deps\\coreutils-49fd98cd30ebb250.exe" "-no-pie" "-nodefaultlibs" "C:\\Users\\runneradmin\\.rustup\\toolchains\\nightly-x86_64-pc-windows-gnu\\lib\\rustlib\\x86_64-pc-windows-gnu\\lib\\rsend.o"
2024-02-27T23:00:20.2725618Z   = note: Warning: .drectve `-exclude-symbols:"_ZN100_$LT$alloc..string..String$u20$as$u20$core..ops..index..Index$LT$core..ops..range..RangeFull$GT$$GT$5index17h4cf8051789717498E" ' unrecognized
2024-02-27T23:00:20.2730336Z           Warning: .drectve `-exclude-symbols:"_ZN100_$LT$clap_builder..builder..value_parser..EnumValueParser$LT$E$GT$$u20$as$u20$core..clone..Clone$GT$5clone17h46c7d57fd48394ffE" ' unrecognized
2024-02-27T23:00:20.2733227Z           Warning: .drectve `-exclude-symbols:"_ZN100_$LT$core..iter..adapters..fuse..Fuse$LT$I$GT$$u20$as$u20$core..iter..traits..iterator..Iterator$GT$4next17h4a14f274301286fbE" ' unrecognized

… TWO HUNDRED THOUSAND LINES OF "WARNING CORRPT DRECTVE" …

2024-02-27T23:00:38.8349081Z           C:/mingw64/bin/../lib/gcc/x86_64-w64-mingw32/12.2.0/../../../../x86_64-w64-mingw32/bin/ld.exe: BFD (GNU Binutils) 2.39 assertion fail ../../../src/binutils-2.39/bfd/coffgen.c:460
2024-02-27T23:00:38.8350405Z           C:/mingw64/bin/../lib/gcc/x86_64-w64-mingw32/12.2.0/../../../../x86_64-w64-mingw32/bin/ld.exe: warning: C:\Users\runneradmin\.cargo\registry\src\index.crates.io-6f17d22bba15001f\windows_x86_64_gnu-0.42.2\lib/libwindows.a(ntdll_dll_s00082.o): local symbol `' has no section
2024-02-27T23:00:38.8351203Z           C:/mingw64/bin/../lib/gcc/x86_64-w64-mingw32/12.2.0/../../../../x86_64-w64-mingw32/bin/ld.exe: BFD (GNU Binutils) 2.39 assertion fail ../../../src/binutils-2.39/bfd/coffgen.c:460
2024-02-27T23:00:38.8352046Z           C:/mingw64/bin/../lib/gcc/x86_64-w64-mingw32/12.2.0/../../../../x86_64-w64-mingw32/bin/ld.exe: BFD (GNU Binutils) 2.39 assertion fail ../../../src/binutils-2.39/bfd/cofflink.c:2279
2024-02-27T23:00:38.8353343Z           C:/mingw64/bin/../lib/gcc/x86_64-w64-mingw32/12.2.0/../../../../x86_64-w64-mingw32/bin/ld.exe: C:\Users\runneradmin\.cargo\registry\src\index.crates.io-6f17d22bba15001f\windows_x86_64_gnu-0.42.2\lib/libwindows.a(ntdll_dll_s00082.o): illegal symbol index -1869611008 in relocs
2024-02-27T23:00:38.8353651Z           collect2.exe: error: ld returned 1 exit status
2024-02-27T23:00:38.8354142Z error: could not compile `coreutils` (test "tests") due to 1 previous error

Taken from #6014 (comment); this also affects at least #6012 and #6020.

@cakebaker
Contributor

It's probably the following issue: rust-lang/rust#112368

@cre4ture
Contributor

cre4ture commented Mar 5, 2024

Is it possible to switch back to the version of the toolchain that was working before?
After fixing Android, it's quite annoying to see the CI constantly red again...

@tertsdiepraam
Member

I'm not sure we can. We haven't bumped our MSRV, right? So what would we switch back to? If you know of a way to fix this, I'd happily accept it though.

@cre4ture
Contributor

cre4ture commented Mar 6, 2024

My investigation has shown this so far:

  • The problem first occurred here:
    [screenshot]

    There was no relevant change in the PR that was merged.

  • The info-log diff shows this:
    [screenshot]

    Apparently, the gcc version was downgraded from 13.2.0 to 12.2.0 (and changed from MSYS to MinGW); it is unclear why.

I'm currently working on a change in the CI scripts to switch back to the previously working version.

@cre4ture
Contributor

cre4ture commented Mar 7, 2024

I found part of the reason for the issue.
There is actually a script that installs the new gcc version with this command:
C:/msys64/usr/bin/pacman.exe -Syu --needed mingw-w64-x86_64-gcc --noconfirm
But this command also updates other packages.
As a result, it also tried to upgrade mintty:

upgrading mintty...
:: To complete this update all MSYS2 processes including this terminal will be closed. Confirm to proceed [Y/n] 

But this update got stuck or crashed, so the new gcc wasn't actually installed.
Sadly, the CI didn't recognize this error. It just continued as if everything was fine.
Unluckily, gcc 12.2.0 is already pre-installed on the system, so gcc wasn't missing either. It was just an old version.

Apparently this issue disappeared on its own. My tests today showed that it works again without any modification of this script. Reason: it seems that mintty is no longer being updated, so the process doesn't fail as before.
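
Since the failed upgrade went unnoticed only because an older gcc was already on the PATH, one possible guard is to check the compiler version right after the install command and fail the step if it is too old. The following is a minimal sketch, not something taken from the workflow: the function names `gcc_major_from_banner` and `check_gcc_major` are hypothetical, and the minimum major version of 13 used in the usage note is an assumption based on the 13.2.0 seen in the working runs.

```shell
#!/usr/bin/env bash
# Sketch only: the function names and the minimum version are hypothetical.

# Print the major version found in a gcc version string or banner line,
# i.e. the leading number of the last whitespace-separated field
# ("... 12.2.0" -> "12"). A trailing CR from Windows tools is dropped.
gcc_major_from_banner() {
  printf '%s\n' "$1" | tr -d '\r' |
    awk '{ split($NF, parts, "."); print parts[1] }'
}

# Return 0 if the major version in $1 is at least $2, 1 if it is older,
# and 2 if no version number could be parsed.
check_gcc_major() {
  local banner="$1" min_major="$2" major
  major="$(gcc_major_from_banner "$banner")"
  case "$major" in
    '' | *[!0-9]*)
      echo "cannot parse a gcc version from: $banner" >&2
      return 2
      ;;
  esac
  if [ "$major" -lt "$min_major" ]; then
    echo "gcc major version $major is older than required $min_major" >&2
    return 1
  fi
  return 0
}
```

A call such as `check_gcc_major "$(gcc -dumpversion)" 13` placed after the pacman line would then return non-zero for the pre-installed 12.2.0, and because the step's shell runs with `-e`, the job would stop there instead of failing later in the linker.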

Full log sequence:

Run ## Install/setup prerequisites
  ## Install/setup prerequisites
  case 'x86_64-pc-windows-gnu' in
    arm-unknown-linux-gnueabihf)
      sudo apt-get -y update
      sudo apt-get -y install gcc-arm-linux-gnueabihf
    ;;
    aarch64-unknown-linux-*)
      sudo apt-get -y update
      sudo apt-get -y install gcc-aarch64-linux-gnu
    ;;
    *-redox*)
      sudo apt-get -y update
      sudo apt-get -y install fuse3 libfuse-dev
    ;;
    # Update binutils if MinGW due to https://github.com/rust-lang/rust/issues/112368
    x86_64-pc-windows-gnu)
      C:/msys64/usr/bin/pacman.exe -Syu --needed mingw-w64-x86_64-gcc --noconfirm
      echo "C:\msys64\mingw64\bin" >> $GITHUB_PATH
    ;;
  esac
  case 'windows-latest' in
    macos-latest) brew install coreutils ;; # needed for testing
  esac
  case 'windows-latest' in
    ubuntu-*)
      # pinky is a tool to show logged-in users from utmp, and gecos fields from /etc/passwd.
      # In GitHub Action *nix VMs, no accounts log in, even the "runner" account that runs the commands. The account also has empty gecos fields.
      # To work around this for pinky tests, we create a fake login entry for the GH runner account...
      FAKE_UTMP='[7] [999999] [tty2] [runner] [tty2] [] [0.0.0.0] [2022-02-22T22:22:22,222222+00:00]'
      # ... by dumping the login records, adding our fake line, then reverse dumping ...
      (utmpdump /var/run/utmp ; echo $FAKE_UTMP) | sudo utmpdump -r -o /var/run/utmp
      # ... and add a full name to each account with a gecos field but no full name.
      sudo sed -i 's/:,/:runner name,/' /etc/passwd
      # We also create a couple optional files pinky looks for
      touch /home/runner/.project
      echo "foo" > /home/runner/.plan
      ;;
  esac
  shell: C:\Program Files\Git\bin\bash.EXE --noprofile --norc -e -o pipefail {0}
  env:
    PROJECT_NAME: coreutils
    PROJECT_DESC: Core universal (cross-platform) utilities
    PROJECT_AUTH: uutils
    RUST_MIN_SRV: 1.70.0
    STYLE_FAIL_ON_FAULT: true
    DOCKER_OPTS: --volume /etc/passwd:/etc/passwd --volume /etc/group:/etc/group
    SCCACHE_GHA_ENABLED: true
    RUSTC_WRAPPER: sccache
    CARGO_INCREMENTAL: 0
    CARGO_TERM_COLOR: always
    CARGO_HTTP_MULTIPLEXING: false
    CACHE_ON_FAILURE: false
    SCCACHE_PATH: C:\hostedtoolcache\windows\sccache\0.7.7\x64/sccache
    ACTIONS_CACHE_URL: https://acghubeus1.actions.githubusercontent.com/Tv5Yv5nIzIhWaMo3xdKzjIorSSIlfCwnxr6L52Y7lKBOTZPJ76/
    ACTIONS_RUNTIME_TOKEN: ***
:: Synchronizing package databases...
 clangarm64 downloading...
 mingw32 downloading...
 mingw64 downloading...
 ucrt64 downloading...
 clang32 downloading...
 clang64 downloading...
 msys downloading...
:: Starting core system upgrade...
warning: terminate other MSYS2 programs before proceeding
resolving dependencies...
looking for conflicting packages...

Packages (1) mintty-1~3.7.1-1

Total Download Size:   0.81 MiB
Total Installed Size:  2.37 MiB
Net Upgrade Size:      0.00 MiB

:: Proceed with installation? [Y/n] 
:: Retrieving packages...
 mintty-1~3.7.1-1-x86_64 downloading...
checking keyring...
checking package integrity...
loading package files...
checking for file conflicts...
checking available disk space...
:: Processing package changes...
upgrading mintty...
:: To complete this update all MSYS2 processes including this terminal will be closed. Confirm to proceed [Y/n] 

@piotrkwiecinski
Contributor

I read that --noconfirm just skips the question. On Linux, people suggest using yes | pacman.

Something like this should be tested:

yes | C:/msys64/usr/bin/pacman.exe -Syu --needed mingw-w64-x86_64-gcc --noconfirm

I may be able to try it at the end of the week if no one picks it up.
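
One caveat when trying this: the step's shell in the log above runs with `-e -o pipefail`. Under `pipefail`, a plain `yes | command` pipeline typically reports a non-zero status even when the command succeeds, because `yes` is terminated once the command exits and the pipe closes. The sketch below shows one way to feed the answers while keeping only the command's own exit status. The helper name `answer_yes` is hypothetical, and whether pacman reads this particular prompt from stdin at all has not been confirmed in this thread.

```shell
#!/usr/bin/env bash
# Sketch only: `answer_yes` is a hypothetical helper name.

# Run the given command with an endless stream of "y" lines on stdin and
# return the command's own exit status. The `|| true` discards the status
# that `yes` ends with when the consumer exits and the pipe closes, so
# `pipefail` does not turn a successful command into a failed pipeline.
answer_yes() {
  { yes || true; } | "$@"
}
```

It would be used as `answer_yes C:/msys64/usr/bin/pacman.exe -Syu --needed mingw-w64-x86_64-gcc --noconfirm`, so that a real pacman failure still fails the step while the way `yes` ends does not.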

@BenWiederhake
Collaborator Author

Someone seems to have fixed it.

@cre4ture
Contributor

cre4ture commented Mar 30, 2024

It occurs again:
https://github.com/uutils/coreutils/actions/runs/8492173478/job/23264907775?pr=6110

:: Processing package changes...
upgrading msys2-runtime...
:: To complete this update all MSYS2 processes including this terminal will be closed. Confirm to proceed [Y/n] 
gcc.exe (x86_64-posix-seh-rev2, Built by MinGW-W64 project) 12.2.0

We should re-open it.
@BenWiederhake @tertsdiepraam

@cre4ture
Contributor

cre4ture commented Mar 30, 2024

I found a citation that explains why this issue only appears temporarily and disappears automatically:

"GitHub Actions (GHA) now has MSYS2 installed on their Windows images. Both 64 & 32 bit clang cmake llvm toolchain ragel packages/groups are installed. Custom Actions have also been developed to install/update MSYS2. Of concern is that the script GHA is using will reliably update MSYS2. Currently, the images are updated frequently, typically every 10 days or better." msys2/msys2-installer#5

So I guess that the issue will disappear within 10 days, as GitHub will then have the new msys2-runtime pre-installed on the image.

I think there are two clean ways to deal with this issue in the future:

  1. Not upgrading packages, and relying purely on GitHub to provide updated images regularly.
  2. Using a GitHub Actions template, https://github.com/marketplace/actions/setup-msys2, instead of running custom install commands.

I'm currently testing option 1 here: cre4ture#22
If this works and is accepted, it would be a low-effort solution.

@sylvestre
Contributor

sounds good, thanks
