ship even more of the compiler in source form #19063

Open
7 of 14 tasks
andrewrk opened this issue Feb 24, 2024 · 10 comments
Labels
accepted This proposal is planned. proposal This issue suggests modifications. If it also has the "accepted" label then it is planned.
Comments

@andrewrk
Member

andrewrk commented Feb 24, 2024

You know what's really pleasant to work on? The build system. The zig build system is shipped in source form, and when you run zig build, zig compiles the build runner from source, and then runs it. In order to do that, the compiler needs only 2 capabilities: compiling source into a native executable, and executing child processes.
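The two capabilities named above can be sketched roughly as follows. This is a minimal illustration, not how zig implements its cache: `ensure_built`, `run_tool`, and the `compile_fn` callback are hypothetical names, and keying the cache on the source bytes alone is a simplification (a real cache would presumably also account for compiler version, flags, and imported files).

```python
import hashlib
import subprocess
from pathlib import Path
from typing import Callable, Sequence


def ensure_built(
    source: Path,
    cache_dir: Path,
    compile_fn: Callable[[Path, Path], None],
) -> Path:
    """Return a cached artifact for `source`, compiling only on a cache miss."""
    # Assumption: the cache key is just the hash of the source contents.
    digest = hashlib.sha256(source.read_bytes()).hexdigest()
    artifact = cache_dir / digest / source.stem
    if not artifact.exists():
        artifact.parent.mkdir(parents=True, exist_ok=True)
        # Capability 1: turn source into a native executable.
        compile_fn(source, artifact)
    return artifact


def run_tool(artifact: Path, args: Sequence[str]) -> int:
    """Capability 2: execute the built tool as a child process."""
    return subprocess.run([str(artifact), *args]).returncode
```

Passing the compile step in as a callback keeps the sketch independent of any particular compiler invocation; editing the source changes the hash, so the next call rebuilds without touching the main binary.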

Here are some components that could be shipped in source form rather than compiled into the zig binary:

  • zig fmt
  • zig reduce
  • translate-c
  • aro
  • package fetching
  • Autodoc
  • building glibc stubs
  • building libc++, libcxxabi, libtsan, libunwind
  • building musl, mingw, and wasi-libc
  • libc installation detection
  • msvc installation detection
  • objcopy
  • resinator
  • all the machine code backends other than the one used to produce native executables (separate proposal)

All of these components have simple interfaces that could be communicated over IPC or the file system. I admit, however, that the last one is a bit spicy.
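As a rough illustration of what "a simple interface over IPC" could look like, here is a sketch in which a parent process sends one JSON request to a child on stdin and reads one JSON response from stdout. The message shape (`text`, `ok`) and the `call_tool` helper are assumptions made for this example, and the child is a Python stand-in rather than a compiled tool.

```python
import json
import subprocess
import sys

# Stand-in for a lazily built tool: reads one JSON request from stdin and
# writes one JSON response to stdout. A real component would be a native
# executable compiled from source; this child is Python only for the demo.
CHILD_SOURCE = r"""
import json, sys
request = json.load(sys.stdin)
lines = [line.rstrip() for line in request["text"].splitlines()]
json.dump({"ok": True, "text": "\n".join(lines) + "\n"}, sys.stdout)
"""


def call_tool(request: dict) -> dict:
    """Run the child once, passing the request on stdin and parsing stdout."""
    completed = subprocess.run(
        [sys.executable, "-c", CHILD_SOURCE],
        input=json.dumps(request),
        capture_output=True,
        text=True,
        check=True,
    )
    return json.loads(completed.stdout)


if __name__ == "__main__":
    print(call_tool({"text": "const x = 1;   \nconst y = 2;\t\n"}))
```

The parent only needs to know the request and response shapes, so the tool behind the pipe can be rebuilt or replaced without recompiling the parent.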

Implementing this issue would accomplish the following things:

  • Significantly decrease compiler build times
  • Make it easier to contribute to zig because no compiler rebuild would be needed in order to test edits to any of these components.
  • Smaller installation size. Source code is surprisingly compact and compresses extraordinarily well. For example, the xz-compressed zig compiler executable built with -Denable-llvm=false -Doptimize=ReleaseSmall on x86_64 is 2.9 MiB (9.9 MiB uncompressed), while the compiler source code catted together into an xz stream is 1.6 MiB.
  • The lazily built executables can take full advantage of native CPU features. We can have AOT cake and eat JIT too!
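The size comparison in the third bullet works out to the compressed source being roughly 45% smaller than the compressed binary (1.6 MiB against 2.9 MiB). A measurement of the same kind can be sketched like this; `xz_size_of_tree` is a hypothetical helper, and the `.zig` suffix filter and preset 9 are assumptions, not the exact procedure used for the numbers above.

```python
import lzma
from pathlib import Path


def xz_size_of_tree(root: Path, suffix: str = ".zig") -> tuple[int, int]:
    """Return (raw_bytes, xz_bytes) for all files under root ending in suffix."""
    # Sort the paths so the concatenation order is deterministic.
    paths = sorted(p for p in root.rglob(f"*{suffix}") if p.is_file())
    blob = b"".join(p.read_bytes() for p in paths)
    # lzma.compress produces an .xz container by default.
    return len(blob), len(lzma.compress(blob, preset=9))


if __name__ == "__main__":
    raw, packed = xz_size_of_tree(Path("."))
    print(f"raw: {raw} bytes, xz: {packed} bytes")
```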
@andrewrk andrewrk added the proposal This issue suggests modifications. If it also has the "accepted" label then it is planned. label Feb 24, 2024
@andrewrk andrewrk added this to the 0.13.0 milestone Feb 24, 2024
@VisenDev

Possibly related proposal

#18834.

This proposal would allow other zig projects to ship zig source code rather than compiled binaries as well.

@der-teufel-programming
Contributor

Shipping the current version of Autodoc in source form would also require shipping ZIR and parts of InternPool.

@mikdusan
Member

mikdusan commented Feb 25, 2024

I like this. It's always irked me that we have things like src/musl.zig that contain a sort of poor man's build logic when the build system already exists.

@nektro
Contributor

nektro commented Feb 25, 2024

That's because the logic uses the cache but needs to be callable by build-exe etc. Making them use the actual build system is a good idea imo, but it will require a medium amount of work.

@nektro
Contributor

nektro commented Feb 25, 2024

does this proposal being implemented mean for example that zig reduce would start shipping as zig-reduce and/or the zig binary should check PATH for binaries to ensure the space version still works?

@andrewrk
Member Author

> does this proposal being implemented mean for example that zig reduce would start shipping as zig-reduce and/or the zig binary should check PATH for binaries to ensure the space version still works?

No. It would be the same CLI.

@jayrod246
Contributor

So same CLI, and it would re-build the relevant source files if necessary - like how zig build works. Nice!
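One way to picture "same CLI" is a routing step inside the main binary: the subcommand is looked up in a table, and anything found there is built from source on demand instead of being handled by compiled-in code. The sketch below is only an assumption about the shape of such a table; the `LAZY_TOOLS` entries and file paths are hypothetical and not taken from the zig source tree.

```python
from typing import NamedTuple

# Hypothetical mapping from subcommand to the source file that implements it.
LAZY_TOOLS = {
    "fmt": "lib/compiler/fmt.zig",
    "reduce": "lib/compiler/reduce.zig",
    "objcopy": "lib/compiler/objcopy.zig",
}


class Plan(NamedTuple):
    kind: str        # "lazy" (build from source, then run) or "builtin"
    target: str      # source path for lazy tools, subcommand name otherwise
    args: list[str]  # remaining arguments, forwarded unchanged


def plan(argv: list[str]) -> Plan:
    """Decide how to handle `zig <subcommand> [args...]` without changing the CLI."""
    if not argv:
        raise ValueError("expected a subcommand")
    subcommand, rest = argv[0], argv[1:]
    if subcommand in LAZY_TOOLS:
        return Plan("lazy", LAZY_TOOLS[subcommand], rest)
    return Plan("builtin", subcommand, rest)
```

From the user's side `zig reduce foo.zig` looks the same either way; only the routing behind it differs, so no separate `zig-reduce` binary or PATH lookup is involved.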

andrewrk added a commit that referenced this issue Feb 27, 2024
@andrewrk andrewrk added the accepted This proposal is planned. label Feb 27, 2024
andrewrk added a commit that referenced this issue Feb 27, 2024
andrewrk added a commit that referenced this issue Feb 28, 2024
part of #19063

This is a prerequisite for doing the same for Resinator.
andrewrk added a commit that referenced this issue Feb 28, 2024
Part of #19063.

Primarily, this moves Aro from deps/ to lib/compiler/ so that it can be
lazily compiled from source. src/aro_translate_c.zig is moved to
lib/compiler/aro_translate_c.zig and some of Zig CLI logic moved to a
main() function there.

aro_translate_c.zig becomes the "common" import for clang-based
translate-c.

Not all of the compiler was able to be detangled from Aro, however, so
it still, for now, remains being compiled with the main compiler
sources due to the clang-based translate-c depending on it. Once
aro-based translate-c achieves feature parity with the clang-based
translate-c implementation, the clang-based one can be removed from Zig.

Aro made it unnecessarily difficult to depend on with these .def files
and all these Zig module requirements. I looked at the .def files and
made these observations:

- The canonical source is llvm .def files.
- Therefore there is an update process to sync with llvm that involves
  regenerating the .def files in Aro.
- Therefore you might as well just regenerate the .zig files directly
  and check those into Aro.
- Also with a small amount of tinkering, the file size on disk of these
  generated .zig files can be made many times smaller, without
  compromising type safety in the usage of the data.

This would make things much easier on Zig as a downstream project;
in particular, we could remove those pesky stubs when bootstrapping.

I have gone ahead with these changes since they unblock me and I will
have a chat with Vexu to see what he thinks.
RossComputerGuy pushed a commit to ExpidusOS-archive/zig that referenced this issue Feb 29, 2024
@paperdev-code

> all the machine code backends other than the one used to produce native executables

This could be very interesting, I imagine specifying a 3rd party package containing a machine backend I need through my build.zig.zon, which could be very good for embedded projects.

andrewrk added a commit that referenced this issue Mar 2, 2024
andrewrk added a commit that referenced this issue Mar 3, 2024
squeek502 added a commit to squeek502/zig that referenced this issue Mar 4, 2024
This moves .rc/.manifest compilation out of the main Zig binary, contributing towards ziglang#19063

Also:
- Make resinator use Aro as its preprocessor instead of clang
- Sync resinator with upstream
squeek502 added a commit to squeek502/zig that referenced this issue Mar 7, 2024
squeek502 added a commit to squeek502/zig that referenced this issue Mar 11, 2024
RossComputerGuy pushed a commit to ExpidusOS-archive/zig that referenced this issue Mar 20, 2024
RossComputerGuy pushed a commit to ExpidusOS-archive/zig that referenced this issue Mar 20, 2024
RossComputerGuy pushed a commit to ExpidusOS-archive/zig that referenced this issue Mar 20, 2024
RossComputerGuy pushed a commit to ExpidusOS-archive/zig that referenced this issue Mar 20, 2024
RossComputerGuy pushed a commit to ExpidusOS-archive/zig that referenced this issue Mar 20, 2024
RossComputerGuy pushed a commit to ExpidusOS-archive/zig that referenced this issue Mar 20, 2024
@tecanec
Contributor

tecanec commented Apr 3, 2024

> You know what's really pleasant to work on? The build system.

Well, that explains it...

> The lazily built executables can take full advantage of native CPU features. We can have AOT cake and eat JIT too!

This is just speculation, but I feel like JIT would synergize really well with comptime. Especially for things like std.simd.suggestVectorLength, whose goal is specifically to automatically optimize for the given hardware.

> > all the machine code backends other than the one used to produce native executables
>
> This could be very interesting, I imagine specifying a 3rd party package containing a machine backend I need through my build.zig.zon, which could be very good for embedded projects.

I was thinking something similar. Custom backends sound like a big deal for non-mainstream platforms, including proprietary, niche, or even in-development architectures, including experiments and pet projects.

Overall, I am a bit concerned about compile times, though. I know caching is a thing, but it's not perfect.

@ghost

ghost commented Apr 29, 2024

I'm half-Indian so I think that last bullet could use a bit more heat.

An observation: whatever gets compiled on-demand will have native opt, but whatever we ship as binary won't -- and if one of those things is the compiler itself, then compiling everything on-demand may incur a noticeable delay.

A solution: ship the entire project as source, and provide only a single, tiny binary: the compiler, itself compiled for the absolute bare-bones no-extensions version of the host, with only the host platform backend, but capable of outputting code using all possible features. Upon install, this binary detects the host's features, compiles a copy of itself using native opt, then deletes itself and hands control to the copy.

Now obviously this is pretty wacky: infeasible for the moment, annoying for the first time installer. When it becomes possible though, it will in fact make up the lost time the very first time a serious project is compiled. Also it's just cool to not have a single bit of externally-compiled code in the installed package.

||don't worry, i'm not sticking around. just not quite excised from my own head yet.||


8 participants