windows unicode support #666

Closed
wants to merge 8 commits into from

lovettchris
Contributor

Reading files whose paths contain Unicode characters doesn't work on Windows:

PS C:\Users\clovett> C:\Users\clovett\.elan\bin\lean.exe "D:\temp\foo\英語\bar\Help.lean"
file 'D:\temp\foo\??\bar\Help.lean' not found
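
For illustration, the `??` in the error message is what a lossy conversion to a legacy code page looks like. The following is a minimal Python sketch of that mechanism, assuming the active ANSI code page is Windows-1252 (an assumption: the reporter's actual code page is not stated, and Windows' own conversion routines may differ in detail):

```python
# Illustration only: assumes the legacy ANSI code page is Windows-1252.
path = "D:\\temp\\foo\\英語\\bar\\Help.lean"

# Characters that have no mapping in the code page are replaced by '?'.
lossy = path.encode("cp1252", errors="replace").decode("cp1252")
print(lossy)  # D:\temp\foo\??\bar\Help.lean

# UTF-8 can represent every character, so the round trip is lossless.
roundtrip = path.encode("utf-8").decode("utf-8")
print(roundtrip == path)  # True
```

Once the path has been narrowed this way, the original directory name cannot be recovered, which is consistent with the "not found" error above.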

@Kha
Member

Kha commented Sep 14, 2021

I believe we could set the process code page to UTF-8 to simplify cross-platform code: https://docs.microsoft.com/en-us/windows/apps/design/globalizing/use-utf8-code-page

@lovettchris
Contributor Author

Ah, interesting, so we must have been doing something wrong because this wasn't working. I'll investigate this approach - it could be a heck of a lot easier :-) But they say it only works on Windows 10 version 1903 and above. Presumably you are ok with that restriction.

@Kha
Member

Kha commented Sep 14, 2021

> Presumably you are ok with that restriction.

We should ask @leodemoura as well, but if it significantly simplifies the code, I definitely would be :) .

One caveat is that we would have to add the manifest to user-defined Lean executables as well, though perhaps that is a good thing to do for modern Windows binaries in general? For example, we might also want to activate longPathAware.
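
As a rough sketch of what such a manifest could contain, the Python snippet below holds a candidate `app.manifest` as a string and parses it to confirm that it is well-formed and carries both settings. The element names and namespaces follow Microsoft's application-manifest documentation as best I recall and should be checked against it; the assembly name `Example.Lean.App` and the version are hypothetical placeholders. It does not address how to embed the manifest into the executable.

```python
import xml.etree.ElementTree as ET

# Candidate manifest text. The name "Example.Lean.App" is a placeholder,
# and the namespaces should be verified against Microsoft's documentation.
MANIFEST = """<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<assembly manifestVersion="1.0" xmlns="urn:schemas-microsoft-com:asm.v1">
  <assemblyIdentity type="win32" name="Example.Lean.App" version="1.0.0.0"/>
  <application xmlns="urn:schemas-microsoft-com:asm.v3">
    <windowsSettings>
      <activeCodePage xmlns="http://schemas.microsoft.com/SMI/2019/WindowsSettings">UTF-8</activeCodePage>
      <longPathAware xmlns="http://schemas.microsoft.com/SMI/2016/WindowsSettings">true</longPathAware>
    </windowsSettings>
  </application>
</assembly>
"""

WS2019 = "{http://schemas.microsoft.com/SMI/2019/WindowsSettings}"
WS2016 = "{http://schemas.microsoft.com/SMI/2016/WindowsSettings}"


def read_settings(manifest_text: str) -> dict:
    """Parse the manifest and return the two settings discussed here."""
    root = ET.fromstring(manifest_text.encode("utf-8"))
    return {
        "activeCodePage": root.find(f".//{WS2019}activeCodePage").text,
        "longPathAware": root.find(f".//{WS2016}longPathAware").text,
    }


settings = read_settings(MANIFEST)
print(settings)
```

The same text would have to be attached to every user-built Lean executable as well, which is the caveat raised above.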

@lovettchris
Contributor Author

Note: my "legacy API" version of Unicode support has given me an interesting tour through the build system :-)

@leodemoura
Member

> We should ask @leodemoura as well, but if it significantly simplifies the code, I definitely would be :) .

I am happy too :)

@lovettchris
Contributor Author

I looked into it and found the following:

  1. MSYS2 links C:\msys64\mingw64\x86_64-w64-mingw32\lib\default-manifest.o into all apps that it builds.
  2. They provide no way to change that (which lots of people have complained about).
  3. See the unresolved issue: windows-default-manifest package issues (msys2/MSYS2-packages#454).
  4. Other languages seem to be working around this by creating their own COFF editing libraries, like this Go solution (see the embedded manifest in that file).

But I did a quick proof of concept, hacking lean.exe with the Visual Studio tool "mt.exe" as follows:

mt -manifest app.manifest -inputresource:lean.exe;#1 -out:test.xml
mt -manifest test.xml -canonicalize -out:out.xml
mt -manifest out.xml -outputresource:lean.exe;#1

The resulting executable can load a file whose path contains Unicode characters, like D:\temp\foo\英語\bar\Hello.lean

But the question is how to integrate this into the build process...?

I'm also a bit concerned because after editing the file like this, dumpbin lean.exe then complains:

File Type: EXECUTABLE IMAGE
lean.exe : fatal error LNK1106: invalid file or disk full: cannot seek to 0x3479A

@Kha
Member

Kha commented Sep 14, 2021

Phew! That doesn't sound pleasant. Is this GCC-only though? We used to use clang on all platforms until we recently had to switch to GCC on Windows because of a miscompilation. But I just found out that -fuse-ld=lld seems to fix this, because that combination (clang+lld) is what Zig uses.

@lovettchris
Contributor Author

See new simpler solution: #668

@lovettchris
Contributor Author

PS: mt.exe breaking the .exe is a real problem, so the new solution does not use mt.exe.
