Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Move BuildInfo from sources into resources, defer use of resources during compilation #2425

Merged
merged 41 commits into from
Apr 8, 2023

Conversation

lihaoyi
Copy link
Member

@lihaoyi lihaoyi commented Apr 7, 2023

Instead of compiling the buildinfo values into the source code, we only compile the buildinfo keys into the source code and store the buildinfo values in the resources, to be loaded at runtime. This greatly reduces the amount of time spent compiling things when buildinfo changes. In this repo, some rough tests indicate that it makes the second -i installLocal after a one-line change in the LICENSE file drop from 50s to 14s of runtime

mill-profile.json Before:

mill-profile.txt

mill-profile.json After:

mill-profile.txt

Notes:

  1. To really benefit from this, I had to make compileClasspath depend only on upstream module's compileClasspath()/compile().classes, so it doesn't end up depending on their runtime resources() and forcing unnecessary recompiles when the buildinfo resources change. This is probably a change we want to do anyway, given the direction set by Don't include resources into compileClasspath. #1811

  2. I consolidated the BuildInfo implementation with contrib/buildinfo/.

  3. Extended contrib/buildinfo/ to support Java modules, which we dogfood in main.client.

  4. The old behavior of statically compiling the values into the binary is still available under def buildInfoStaticCompiled = true, for anyone who needs it. That includes anyone using BuildInfo in Scala.js, which doesn't support JVM resources by default

  5. For now, I copy-pasted the implementation into the root build.sc file, and updated mill-bootstrap.patch to remove it and make use of the mill-contrib version. When we re-bootstrap Mill on latest main, we can remove the copy-pasta

  6. contrib/buildinfo/ should be mostly source compatible with the previous implementation, except that I made buildInfoPackageName a required field to encourage the best practice of not putting code in the empty package

Copy link
Member

@lefou lefou left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting approach. It swaps compiletime work with runtime work. Not really my taste.

I think this approach complicates things. We now do some work at runtime which can fail. To be performant, we also need to cache things a bit, to avoid loading the resources again and again. Also, in the current draft, we loose Scaladoc, but that could be changed, and in general the flexibility to provide more than just key-value pairs.

What I like about this approach. It is still better than using system properties for things, which are not meant to be changed.

@lefou
Copy link
Member

lefou commented Apr 7, 2023

What I wonder. How great is the impact of using generated code that changes with every git change? As we use zinc, which tracks binary changes and our generated classes does not change its API (only the values, which are not part of the ABI), downstream compilation targets should not really be affected. Only other targets like jar or assembly.

@lihaoyi lihaoyi changed the title [WIP] Move BuildInfo dependencies into forkArgs to avoid unnecessary re-compilations [WIP] Move BuildInfo from sources into resources, defer use of resources during compilation Apr 7, 2023
@lihaoyi
Copy link
Member Author

lihaoyi commented Apr 7, 2023

@lefou I updated the PR description with some rough benchmarks running installLocal on this repo. It's a pretty dramatic improvement, from 50s to 14s. You can look at the profiles, all those no-op Zinc compiles take around 100-1000ms each to run and decide to do nothing, all the docJars take a while to re-generate since those are not incremental, and when you have a bunch of modules it adds up. After this PR, the only things it needs to re-build is the final assembly taking ~10s.

My own specific motivation comes when running the .fork/.server example/integration tests on this repo, which needs to publishLocal all the modules before running. Waiting ~1min every time I iterate on an example/integration test is a big annoyance. 14s isn't great either, but it's better, and maybe I can drive that number down further in future

The new mechanism is definitely more complicated, but I'd argue it better reflects the true dependencies of the system:

  • When the buildInfo keys change, then we need to re-compile the world, since we're changing the set of fields we make statically-available on the BuildInfo object
  • When the buildInfo values change, we do not need to re-compile anything, but we do need to re-package the new values for use at runtime

@lihaoyi
Copy link
Member Author

lihaoyi commented Apr 7, 2023

Yeah being limited to primitive key-values is a bit annoying, but I don't think it's the end of the world. We can always add back support for more complex data structures with the appropriate code-gen. But there's only one place in com-lihaoyi/mill which uses a Seq[String] rather than a plain String, and contrib/buildinfo/ doesn't support data structures at all, so I think we're ok with dropping that functionality for now

@lihaoyi lihaoyi marked this pull request as ready for review April 7, 2023 12:45
Copy link
Member

@lefou lefou left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Currently, you put each property in a single resource file. Have you considered using java.util.Properties for loading from a single file? This would also reduce complexity.

.map {
case (k, v) =>
if (isScala) s"""val $k = this.readMillBuildInfo("$k")"""
else s"""public static java.lang.String $k = readMillBuildInfo("$k");"""
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This needs to be a method, not a field, otherwise the compiler is allowed to inline the value at the call side, which we need to avoid.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This needs to be a method, not a field, otherwise the compiler is allowed to inline the value at the call side, which we need to avoid.

I don't think that happens unless the RHS of the final field is a literal constant or (constant-folded). So in this case, a method call loading a JVM resource, it won't get inlined

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the buildInfoStaticCompiled = true case, we need to make it a method, as the constants are candidates for inlining (https://docs.oracle.com/javase/specs/jls/se7/html/jls-15.html#jls-15.28), but that means we also need to make it a method in the other case, otherwise the generated code is not interchangeable.

@lihaoyi
Copy link
Member Author

lihaoyi commented Apr 7, 2023

Currently, you put each property in a single resource file. Have you considered using java.util.Properties for loading from a single file? This would also reduce complexity.

Currently, you put each property in a single resource file. Have you considered using java.util.Properties for loading from a single file? This would also reduce complexity.

I considered it, but some things I liked about separate files:

  1. You don't need to worry about data formats, escaping newlines, separators, etc. With one-file-per-entry, the file contains the literal string, that's it

  2. If multiple people define buildinfo keys in the same package having them in separate files means that they automatically get unioned together when classpaths are combined. This applies to whether they're in separate folders, separate jars, combined into a single jar, etc. all without needing any special configuration (e.g. Assembly.Rule.Append) to make it happen. (This should be mostly avoided by forcing people to provide a packagename, but nonetheless...),

  3. General buildinfo keys are pretty few, like O(10), certainly smaller than the number of classfiles that are in the jar anyway. So having them in separate files won't be too bad w.r.t. perf

@lihaoyi
Copy link
Member Author

lihaoyi commented Apr 8, 2023

One thing of note is that I had to add ./mill contrib.buildinfo.publishLocal to ci/patch-mill-bootstrap.sh, now renamed to ci/prepare-mill-bootstrap.sh, since Mill's own build.sc now for the first time is using a contrib plugin. millProjectModule isn't wired up to support contrib plugins, which I think might take a bit more thought to do nicely.

I think that's probably a good thing that we've started dogfooding our own plugins, even if it takes a bit more setup to bootstrap.

@lihaoyi lihaoyi changed the title [WIP] Move BuildInfo from sources into resources, defer use of resources during compilation Move BuildInfo from sources into resources, defer use of resources during compilation Apr 8, 2023
@lihaoyi
Copy link
Member Author

lihaoyi commented Apr 8, 2023

all the feedback so far has been addressed and CI is green(ish)

@lefou
Copy link
Member

lefou commented Apr 8, 2023

One thing of note is that I had to add ./mill contrib.buildinfo.publishLocal to ci/patch-mill-bootstrap.sh, now renamed to ci/prepare-mill-bootstrap.sh, since Mill's own build.sc now for the first time is using a contrib plugin. millProjectModule isn't wired up to support contrib plugins, which I think might take a bit more thought to do nicely.

I think that's probably a good thing that we've started dogfooding our own plugins, even if it takes a bit more setup to bootstrap.

One thought, not a request: Isn't it now possible to directly include the contrib/buildinfo/src/mill/contrib/buildinfo/BuildInfo.scala into build.sc via import $file?

@lihaoyi
Copy link
Member Author

lihaoyi commented Apr 8, 2023

It isn't, because import $file assumes a script, which means .sc extension and disallows package declarations. That's unchangeable while we're still using Ammonite

Once we tag a milestone and rebootstrap, we'll be able to configure the mill-build/build.sc to add the .scala to the sources to be compiled without the script code wrapper. So in future we should be able to avoid this copy-paste-patch messiness and just directly share the .scala file between application code and the root build.sc using the meta-build (similar to what a lot of SBT projects do)

import $file still wouldnt work, but we could conceivably add a shortcut import $scalafile or something to add non-script Scala files to the sources without needing to define a meta-build

@lefou
Copy link
Member

lefou commented Apr 8, 2023

I was referring to the bootstrap tests, where we need to publishLocal the plugin, but instead could load it directly as source file.

@lefou
Copy link
Member

lefou commented Apr 8, 2023

In another project we kept .scala symlinks to .sc files, to use plugin code in Mill directly and to publish it as plugin.

@lihaoyi
Copy link
Member Author

lihaoyi commented Apr 8, 2023

Oh, for the bootstrap tests, we might be able to. I don't see how import $file would help there, for the same reasons. My understanding is that we'd need to wire them up the same way we do millProjectModules

@lihaoyi
Copy link
Member Author

lihaoyi commented Apr 8, 2023

How does the symlinks work with the package declaration st the top of the file? My understanding is that it's a blocker, since it's required to have one for .scala files (assuming we avoid the empty package) and required not to have one for .sc files (since they get wrapped, and package declarations in Scala can only be at the top of files)

Copy link
Member

@lefou lefou left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some files needs formatting. Otherwise, this looks good to go. And I'm happy about the outcome. The buildinfo plugin is now in a usable state (Java support and Java/ScalaDoc). Nice work!

@lihaoyi
Copy link
Member Author

lihaoyi commented Apr 8, 2023

Is this how you're meant to auto-format the files? It's the command you gave inhttps://github.com/com-lihaoyi/os-lib/pull/160, but seems to give me in an error

lihaoyi mill$ ./mill __.reformat
Cannot resolve __.reformat. Try `mill resolve _` to see what's available.

@lihaoyi lihaoyi merged commit d9cf409 into com-lihaoyi:main Apr 8, 2023
@lefou
Copy link
Member

lefou commented Apr 10, 2023

@lefou lefou added this to the 0.11.0-M8 milestone Apr 10, 2023
lihaoyi added a commit that referenced this pull request Apr 15, 2023
…2435)

This should allow us to do debugging or iteration on example/integration
in `.local` mode, which is a lot faster than `.fork` or `.server`

Even after #2425 landed, the
`.local` integration tests are still much faster to run than `.fork` and
`.server`, with publishing taking ~20s down from ~60s before which is
still a very annoying wait. Better if we can avoid that, using `.local`
in the common development workflow and leaving `.fork` and `.server` for
CI to catch rare edge-case bugs

We basically needed to be a bit more robust in spawning subprocesses,
allowing the `InputPumper` machinery to work for any redirected streams
rather than being hardcoded to work with the `PipedInputStream` that
`MillServerMain` uses.

I had to fix an issue in `ScalaJSModule#run` to make
`example.web[3-scalajs-module].local.test` pass with this additional
enforcement. It appears we were not properly pumping the JSEnv
subprocess stdout/stderr. This would cause the outputs to disappear when
run repeatedly in client-server mode, since the output would be
inherited by first MillClient process that spawned the server, and not
subsequent clients that simply connect to it

---------

Co-authored-by: Tobias Roeser <le.petit.fou@web.de>
@lolgab
Copy link
Member

lolgab commented Jun 1, 2023

This broke my Scala.js build.
Can we set the default so ScalaJSModules and ScalaNativeModules always gets def buildInfoStaticCompiled = true?

lefou added a commit to lefou/mill-osgi that referenced this pull request Jun 24, 2023
This adds support for Mill API 0.11.

In Mill 0.11.0-M8, some internals have changed
(com-lihaoyi/mill#2425), so the consumption of
transitive module dependencies had to be changed to continue to generate
proper OSGi Manifests. This feels a bit fragile in the sense, that if
the customized targets and their interactions change upstream, we might
need to follow these changes in this plugin.

Pull request: #118
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants