cmd/go: schedule cgo compilation early #15681
Comments
I agree that cgo (and SWIG) can be run immediately, without waiting for any Go files to be compiled. Dmitry's CL for #8893, and your version of it, prove nothing about this one way or the other, as they do not break out the cgo portion of building a package from the rest of building a package. All the cgo support is wrapped up in the same function that compiles the package files:
We can also parallelize the C compiler invocations within a package, which can give a very big compilation speedup for a cgo-heavy project. See https://go-review.googlesource.com/#/c/4931/.
Thanks, @petermattis, I was just trying to find that CL. If I do any serious surgery here, I will see about making individual C compilation fine-grained.
As mentioned in #16623, I propose the opposite approach: instead of trying to push cmd/cgo earlier, let's push the C compilations later. Currently the only reason we have to wait for the C sources to finish compiling is so we can run cmd/cgo -dynimport and generate a bunch of directives. If cmd/go were responsible for saving the directives instead, we could run cmd/compile immediately after the first cmd/cgo run. Then Go package compilation would never be blocked waiting on C compilations, and C compilations could all run in parallel, blocking only the link operations that depend on them.
C compilation is slow. Don't we want it to be as early as possible, so that it isn't the lone straggler at the end? I agree that it'd be very good to not have to wait for cgo/C to be done to start compiling Go, I just want to make sure we don't push C to the end as a consequence of that.
I don't mean C compilations need to be delayed per se, but they're trivially parallelizable and nothing fundamentally depends on them except for the linker. On the other hand, Go compilations do necessarily depend on other compilations, so it seems beneficial to prioritize scheduling them to unblock more work. E.g., in your graph for #15734, we have a long bottleneck for package runtime at the beginning, but it looks like we have more than enough idle CPUs later in the build to handle the C compilations. My hypothesis is that if we're able to 1) remove the unnecessary dependency from Go compilations on C compilations, and 2) implement something like #15734, then the Go dependency graph's scheduling delays should essentially flatten, allowing us to naturally schedule C compilations earlier. I suppose what would be really beneficial here is to collect fine-grained trace timing data, and then analyze how much more optimally it could have been scheduled if we relax various dependencies.
Ack. Seems plausible. Definitely worth a run. The cmd/go tracing stuff is near the bottom of my list of pending CLs to get mailed/fixed/submitted, but I will get to them eventually. :) But I think the data about C and cgo is probably clear enough already that we can just move forward with your approach.
Pinging the thread while I wait here for 31 idle cores to complete a several minute CGo compilation... 🍅
@pwaller If you're willing to rebuild your Go toolchain (which is really quite easy), see https://github.com/cockroachdb/cockroach/blob/master/build/parallelbuilds-go1.8.patch. The patch applies cleanly to go1.8 and we use it for development of CockroachDB.
I played with this a little last night. In particular, I changed The main limiting factor then is that cmd/go's build graph currently uses a single monolithic Action to represent package compilation, but ideally we'd separate cgo compilation into (at least) two Actions:
The actions could probably be broken down even further (e.g., to parallelize the C compilations), or the dependencies could be made more fine-grained (e.g., using gcc's -M flag to recognize when individual .o files need to be rebuilt), but that first split is probably the biggest win for most cgo users.
@petermattis your link is no longer available. Do you have it in the history anywhere? I also have to wait a long time while a single core compiles bindings for a big C++ library.
@elgatito https://github.com/cockroachdb/cockroach/blob/aa22f2140f0078ea9c6a43f29a87cf24471ea0e8/build/parallelbuilds-go1.8.patch. We no longer use this patch as we've moved away from using the
Change https://golang.org/cl/328712 mentions this issue:
From a quick scan of the code, it appears that cgo does not depend on a package's dependencies having been compiled. Given that, and given that cgo (and the resulting C compiler invocations) are generally very slow, it is probably worth scheduling all invocations of cmd/cgo at the very beginning of a build. (As a corollary, this might also mean scheduling the compilation of cmd/cgo itself early.)
cc @ianlancetaylor for input about whether cgo invocations can be safely pushed to the head of the queue.
One downside: It is unclear whether this is just a poor man's version of #8893, which didn't appear to yield much fruit. That could use more investigation.