Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix msys2 windows build so the windows apps support utf-8 file paths. #668

Merged
merged 18 commits into from
Sep 22, 2021
Merged

fix msys2 windows build so the windows apps support utf-8 file paths. #668

merged 18 commits into from
Sep 22, 2021

Conversation

lovettchris
Copy link
Contributor

@lovettchris lovettchris commented Sep 15, 2021

fix: utf-8 file paths on Windows.

You can now do this on windows:

lean.exe --run D:\temp\foo\英語\bar\Hello.lean the quick brown 英語
[the, quick, brown, 英語]

Notice that the file path can contain unicode chars and the arguments can be passed as utf-8.

Without this fix you get this:

lean.exe --run D:\temp\foo\英語\bar\Hello.lean the quick brown 英語
file 'D:\temp\foo\??\bar\Hello.lean' not found

where Hello.lean contains

#lang lean4
def main (n : List String) : IO UInt32 := do
    IO.println (toString n)
    pure 0

@Kha
Copy link
Member

Kha commented Sep 15, 2021

Ahh, much simpler :) . Modifying the MSYS2 installation during build doesn't sound very "nice" though. Should we perhaps put it in the setup instructions/ci.yml instead? And we still need a solution for clang/Zig (I'd be fine with having an (automatic) solution for clang only).

@lovettchris
Copy link
Contributor Author

lovettchris commented Sep 15, 2021

Yeah, what I'd like to be able to do is modify src/shell/CMakeLists.txt like this:

add_library(shell OBJECT lean.cpp lean.rc)

this works, it does invoke windres and compile the app manifest resource, and it tries to link it but then I get this compile error:

bash: line 1: C:/msys64/home/clovett/git/lean4/build/release/stage0/shell/CMakeFiles/shell.dir/lean.rc.obj: 
cannot execute binary file: Exec format error

Even though the resulting lean.rc.obj is byte for byte identical to the default-manifest.o.

So it seems I'm hitting msys issue #454 which claims to have been fixed in a gcc-patch, but I'm not seeing it.

That's why I'm replacing the system installed default-manifest.o. And I want to do it during build and not in the github workflow because I want it to also work for devs setting up their own msys environment without adding any addition faffing about.

The cannot execute binary file: Exec format error seems to be a bug in our build system, this is coming from stdlib.make. I do not get this error in a little cmake test app. In a little cmake test app I get .rsrc merge failure: multiple non-default manifests, which I reported to msys guys in a new issue msys2/MINGW-packages#9603 (comment).

In my little cmake test app clang builds work fine and it links my custom lean.rc.obj no problem. So if we can fix the bug in stdlib.make (or is it in leanc) then switching to clang solves the default-manifest.o problem...

@Kha
Copy link
Member

Kha commented Sep 16, 2021

The cannot execute binary file: Exec format error seems to be a bug in our build system, this is coming from stdlib.make.

You can look at the executed cmdline using make ... VERBOSE=1. CMake is likely concatenating the objects from shell using ;, we would have to replace those with spaces like here before passing it to leanc here.

I do not get this error in a little cmake test app. In a little cmake test app I get .rsrc merge failure: multiple non-default manifests, which I reported to msys guys in a new issue msys2/MINGW-packages#9603 (comment).

Is this with clang? The cmdline you posted in that issue does not specify a compiler.

@lovettchris
Copy link
Contributor Author

Cool, also making progress here: msys2/MINGW-packages#9603 (comment) so it seems we may be able to get updated version of msys that will remove the requirement on patching the default-manifest.o file... So I won't merge this until I get that resolved.

@lovettchris
Copy link
Contributor Author

Ok, VERBOSE=1 helps, this is the log output, we see it is invoking "leanmake"... which I think is in turn invoking "lean.mk"...?

make[6]: Entering directory '/home/clovett/git/lean4/src'
"/C/msys64/home/clovett/git/lean4/build/release/stage1/bin/leanmake" bin PKG=Leanpkg BIN_NAME=leanpkg.exe LEAN=" C:/msys64/home/clovett/git/lean4/build/release/stage0/bin/lean.exe" OUT="../build/release/stage1/lib" LIB_OUT="../build/release/stage1/lib/lean" OLEAN_OUT="../build/release/stage1/lib/lean" BIN_OUT="C:/msys64/home/clovett/git/lean4/build/release/stage1/bin" LEAN_OPTS+="" LEANC_OPTS+=" -O3 -DNDEBUG" LEAN_AR="C:/msys64/mingw64/bin/ar.exe" MORE_DEPS+="C:/msys64/home/clovett/git/lean4/build/release/stage0/bin/lean.exe" CMAKE_LIKE_OUTPUT=1 LINK_OPTS='-lleanshared  -pthread'
bash: line 1: C:/msys64/home/clovett/git/lean4/build/release/stage1/shell/CMakeFiles/shell.dir/manifest.rc.obj: cannot execute binary file: Exec format error

@lovettchris
Copy link
Contributor Author

Ok, I pushed new changes to stop modifying default-manifest.o and this part of the build fails eroneously:

cd C:/msys64/home/clovett/git/lean4/stage0/src && make -f C:/msys64/home/clovett/git/lean4/build/release/stage0/stdlib.make lean "LEAN_SHELL=C:/msys64/home/clovett/git/lean4/build/release/stage0/shell/CMakeFiles/shell.dir/lean.cpp.obj;C:/msys64/home/clovett/git/lean4/build/release/stage0/shell/CMakeFiles/shell.dir/manifest.rc.obj"

saying:

bash: line 1: C:/msys64/home/clovett/git/lean4/build/release/stage0/shell/CMakeFiles/shell.dir/manifest.rc.obj: cannot execute binary file: Exec format error

something about stdlib.make is getting confused by the contents of manifest.rc.obj.

@Kha
Copy link
Member

Kha commented Sep 17, 2021

Yes, you need to replace the ; as described above. I thought VERBOSE=1 would also show the commands of the inner Makefile, but apparently it doesn't.

@Kha
Copy link
Member

Kha commented Sep 17, 2021

Btw, you shouldn't need to change stage0/ here. It has its own cmake files, so as long as a change isn't needed in the Lean executable that builds stage 1, it is sufficient to change src/ only.

@lovettchris
Copy link
Contributor Author

Btw, you shouldn't need to change stage0/ here. It has its own cmake files, so as long as a change isn't needed in the Lean executable that builds stage 1, it is sufficient to change src/ only.

well theoretically someone could clone "lean" into a unicode path, then all hell would break loose in msys when using the stage0 version of lean.exe.

@leodemoura
Copy link
Member

Btw, you shouldn't need to change stage0/ here. It has its own cmake files, so as long as a change isn't needed in the Lean executable that builds stage 1, it is sufficient to change src/ only.

well theoretically someone could clone "lean" into a unicode path, then all hell would break loose in msys when using the stage0 version of lean.exe.

@lovettchris We try to avoid updates to stage0 in PR's since they require extra work when merging. For example, we often have to manually rebase, build, and replay all stage0 updates in a PR because they are out of sync with other changes made in the master branch.

I am not concerned about the scenario you have described. It will be corrected later when we perform the regular update stage0 operations.

@lovettchris
Copy link
Contributor Author

Ah, I missed your semicolon fix before, I found another way to fix that using COMMAND_EXPAND_LISTS.

@lovettchris
Copy link
Contributor Author

We try to avoid updates to stage0 in PR's since they require extra work when merging.

so that's what I was wondering, what is the process for updating stage0? Do we have to keep it in sync always, or is it done in separate PR's. I've definitely seen PR's that do both in one PR. Not sure how I see me keeping it mirrored causes extra work though?

@Kha
Copy link
Member

Kha commented Sep 17, 2021

Ah, I missed your semicolon fix before, I found another way to fix that using COMMAND_EXPAND_LISTS.

Ooh, nice!

@Kha
Copy link
Member

Kha commented Sep 17, 2021

You should only update stage 0 in a PR if it is necessary for the PR to build. All other changes will find their way automatically into stage 0 eventually after merging (usually in a matter of days) when it is updated for some other change.

@lovettchris
Copy link
Contributor Author

lovettchris commented Sep 17, 2021

You should only update stage 0 in a PR if it is necessary for the PR to build. All other changes will find their way automatically into stage 0 eventually after merging (usually in a matter of days) when it is updated for some other change.

ok, I undid changes to stage0 although I still don't see how that creates merging problems unless there are pending pull requests that are changing those exact same lines in the CMakeFIles.txt file. git merging is typically pretty good. I understand you can't have the makefiles diverge because that makes updating stage0 harder, but I was not doing that I was mirroring the files just like you would in a bigger stage0 update... am I missing something about git here?

@Kha
Copy link
Member

Kha commented Sep 17, 2021

Manually copying stuff into stage 0 should already be considered an advanced operation, the intended way is to always go through update-stage0 (which can update arbitrary files and thus increase the chance of merge conflicts dramatically). The last time I copied src/CMakeLists.txt into stage0/ manually, it broke because I needed to copy other build files changed in a previous commit as well. Manually copying single lines should usually work if one is careful, but that just seems like unnecessary hassle if there is no need to do that in a PR, and something that can easily be forgotten on further changes.

@Kha
Copy link
Member

Kha commented Sep 17, 2021

Basically stage0/ should be considered a single binary blob, which in fact is the case in other bootstrapping schemes, and which we tell Git and GitHub to do so in the .gitattributes file.

@lovettchris
Copy link
Contributor Author

Ok got it, it is just "being careful" process then, sounds good. We should write this down someplace... do we have docs for team members somewhere?

@leodemoura
Copy link
Member

do we have docs for team members somewhere?

We have some documentation here https://github.com/leanprover/lean4/blob/master/doc/make/index.md, but we definitely need more.

@lovettchris
Copy link
Contributor Author

do we have docs for team members somewhere?

We have some documentation here https://github.com/leanprover/lean4/blob/master/doc/make/index.md, but we definitely need more.

Maybe I should start a 'dev' folder that includes coding guidelines and tips & tricks, etc... make is only one part of the overall content we need there... but having a "developer guide" is common for projects like this...

@leodemoura
Copy link
Member

do we have docs for team members somewhere?

We have some documentation here https://github.com/leanprover/lean4/blob/master/doc/make/index.md, but we definitely need more.

Maybe I should start a 'dev' folder that includes coding guidelines and tips & tricks, etc... make is only one part of the overall content we need there... but having a "developer guide" is common for projects like this...

Yes, a doc/dev folder is a great idea.

@lovettchris
Copy link
Contributor Author

I'll do the 'dev' doc updates in a separate PR. This PR is ready for approval.

@Kha
Copy link
Member

Kha commented Sep 20, 2021

Hmm, did the COMMAND_EXPAND_LISTS change break leanrt?

@Kha
Copy link
Member

Kha commented Sep 20, 2021

Could you also add a test case, e.g. adding output to a Unicode file path to tests/lean/run/IO_test.lean?

@Kha
Copy link
Member

Kha commented Sep 20, 2021

We should also make sure leanc builds Lean programs in the same way, but that doesn't have to be in this PR

@lovettchris
Copy link
Contributor Author

Could you also add a test case, e.g. adding output to a Unicode file path to tests/lean/run/IO_test.lean?

good idea, will do.

@lovettchris
Copy link
Contributor Author

Hmm, did the COMMAND_EXPAND_LISTS change break leanrt?

I guess so, I was afraid of that as there was a merge conflict on this line of the cmake.

@Kha Kha merged commit ad7c5b2 into leanprover:master Sep 22, 2021
@lovettchris lovettchris deleted the clovett/windows-utf8 branch November 3, 2021 10:01
ChrisHughes24 pushed a commit to ChrisHughes24/lean4 that referenced this pull request Dec 2, 2022
Tracking mathlib commit: [39af7d3bf61a98e928812dbc3e16f4ea8b795ca3](leanprover-community/mathlib3@39af7d3)

Co-authored-by: Scott Morrison <scott.morrison@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants