-
Notifications
You must be signed in to change notification settings - Fork 458
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add support for Gradle Configuration Cache #644
Comments
I took a quick look at this. The first stumbling block is that we use spotless/plugin-gradle/src/main/java/com/diffplug/gradle/spotless/RegisterDependenciesTask.java Lines 94 to 100 in 41ed32d
Googling |
We're pretty close to being able to ship full support for this. But it's a bit nasty as-implemented, and it could be clean if there were some clarity about how
Questionable hackOne of Spotless' features is that you can specify a function in your buildscript, and use that function as a formatter, e.g.: spotless { format 'blah', {
custom 'lowercase', { str -> str.toLowerCase() } However, that lambda is not serializable, so IIUC we'll never be able to support that fully with the traditional configuration cache model of "serialize every task input, execute based on the deserialization of those inputs". However, if there is some kind of
This nasty hack works, but it only work if the following holds true:
IMO, the ability to specify arbitrary functions in a buildscript is a defining value-add of Gradle vs other systems. The configuration cache feature is fantastic, but if I have to give up those arbitrary functions (in this and other plugins), then the caching will always be a second-class feature. To be a first-class, always-works feature, it seems like there needs to be some way to take live objects from the caching, and store them for future invocations. Interplay between Configuration Cache and BuildServiceIf you use I think it would be a lot clearer, for both public <T> T ObjectFactory.onePerBuild(Class<T> clazz)
public <T> T ObjectFactory.onePerConfigCache(Class<T> clazz) IMO it's a lot easier to test things when I'm not forced to make them abstract (not a fan of needing to leave methods abstract), and the contract of what Gradle is providing me with is clearer than Overloaded "configuration"I get confused by "configuration" the set of dependencies, versus "configuration" the process of executing build scripts and configuring tasks. Also configure is long to type. If I could ask for a breaking change in Gradle, it would be to rename executing build scripts to "setup" - "setup avoidance", "setup-cache". Maybe "wiring" or "init"? Just something unique, and shorter. |
@bigdaz are you guys looking for feedback on configuration cache? Super awesome feature, I got it mostly working in Spotless with some hacks, my experience is in the comment above. |
Oh wow, the code in #721 is going to be... hard to maintain, to put it diplomatically. If there's anything we could do to simplify it, that would be a big win for me. So I'm tempted to wait until we either get some acknowledgement or feedback from the Gradle team or until the configuration cache feature has been developed some more. In the meantime, I'm happy for #720 to be merged in. On a quick scan, it looks relatively simple. |
I agree we should wait for feedback from Gradle folks, and/or more clarity in the docs around it. Merging #720 would make us depend on |
I don't have much insight into the Configuration Cache, so I'm not going to be much help. I'll mention @eskatos here since he can likely help. The best way to have a conversation about this would be on the Gradle Community Slack in the #configuration-cache channel. |
I left a comment about the usage of build services for your use case on #720. About the formater as a function use case. The user function can be given as an explicit type, an anonymous class, a Java lambda, a Groovy closure, a Kotlin lambda etc... The configuration cache supports serializing some of those already but not all of them in all cases. It shouldn't be a plugin concern though. You can keep your existing API and let users that want to use the configuration cache pass the formater function in a way that is currently supported. Eventually the configuration cache will support more cases. The nasty hack described above using static state won't fly. The configuration cache can be reused by fresh daemons that won't have the static state initialized. |
Roger. That's the nail in the coffin for our existing design, thanks for the clarification. It seems like there's a design space of increasing complexity which looks like this:
Did you explore option 3 and reject it? It seems like option 3 ought to be faster than 4 or 5 can be, and it's also much easier to implement in some cases because it doesn't require serialization support. The improvements you guys are making are bringing Gradle's end-to-end runtime out of "build tool" territory and into "fully interactive", which I think is really exciting. The faster Gradle gets, the less important multiple daemons on a single project becomes. Even at its current speed, my usage model almost never requires a second daemon. If it's possible to make "configuration cache" faster for 90% of gradle invocations, at the cost that it only persists across invocations on a single daemon, that seems like a better tradeoff than making it a little slower, with the benefit that it can be serialized to disk and sent to other machines. Especially given that local & remote buildcache already exist with broad support, it seems like JVM-local caching is the missing hole to allow next-level interactive performance from Gradle. If that hole is ever going to be filled, then I think Spotless will wait for that. And if it's not ever going to be filled, why not? |
Those are very good points. We indeed are considering keeping the cached state warm in the daemon for the next invocation. But we also want to be able to share it between machines. 3/ might not happen soon though. That being said I don't see why you would be blocked until this happen. |
You are absolutely right that we can implement something which is compatible with option 4 and 5, and doing so would be nice. But long ago, back in Gradle 2.x, we made the choice to abuse serialization to implement equality. If we want to ignore a field for the purpose of equality (such as a local path), we mark it To support option 3 is trivial for us. To support option 4 or 5 is pretty close to a rewrite, and at the end we lose functionality because we're constrained by which things are serializable, and we're a little bit slower than we are with option 3. If Spotless users had option 3 (plus the build cache they already have), I don't think they would mind very much if we didn't support 4 or 5.
The configuration cache is a brilliant idea, but it seems like it's being approached backwards, fighting the last battle (buildcache), and long-term restricts what plugins can do. My complaint isn't just that random legacy decisions make it hard in Spotless, it's that my favorite gradle feature is that plugins can accept a function as a build input. I use it all over the place, not just Spotless, it makes my plugins shorter to write, and more powerful and flexible to use. I love it when other plugins provide function-based APIs too. Configuration cache 3 keeps these as first class citizens, and is faster, and puts a lower cognitive overhead on plugin developers. "Your task inputs stay in memory". If the per-build performance tradeoff of "your task inputs stay in memory" hasn't been compared against the current approach, then I would start there. It's also so simple to implement and document, relative to serializing random lambdas which might capture who knows what. I don't see how that can be done without huge provisos, so I don't see how Gradle doesn't lose "functions as parameters". I don't want "Bazel but with Groovy", I want Gradle!! |
Plugins can still accept functions. It's just that those functions must be serializable. We don't want to lose the convenience. It already works with Groovy closures and Kotlin lambdas. That covers most of the cases. Java lambdas can be problematic though, and this is a problem since forever, not only for the configuration cache, see e.g. gradle/gradle#10751 |
Thanks for taking the time to dig up the unit test, I don't know how you did it but you did it! ;-) I think there's a userland way we can hack option 3 ourselves (cleaner than the hack I proposed before, and we can make it opt-in). It allows us to do our rewrite piece-by-piece, rather than all at once. You've got more data on how people use Gradle than I do, and I'm on a small team so central caching is less important to me than to your customers. I'm always impressed by the Gradle team, thanks for helping us understand the APIs. |
Happy to help, and thank you for writing up your problems and concerns so clearly. It's so easy to help when things are crystal clear! I'm looking forward to a release of the spotless plugin that enables using the configuration cache. |
@nedtwigg Are there any updates on configuration cache support? |
Hoping to take a shot at it in a couple months. Happy to take a PR jf anybody beats me to it :) |
@nedtwigg we realised that spotless failed the build when configuration cache is enabled with the following message:
is that something you have on your agenda? |
I have been using configuration-cache for a few months now, and it is a nice feature. I use configuration cache for good performance on all testing and packaging tasks, and then I use So I'm happy to merge a PR, but for the workflows I have, it is not a limiting factor. In particular, imo the Gradle design has picked the wrong design point (more details here). If your design goal is a machine-independent cache, that comes with serialization constraints and serialization overhead. If your design goal is single-user low-latency interactivity, then a JVM-local cache removes the design and performance limitations imposed by serialization. Personally, if I were going to put time into configuration cache, I would first build out JVM-local caching, which would be much faster, and also much easier to implement. Regardless, happy to merge a PR :) |
I've got a line of PRs setup which lets configuration-cache run.
You quickly get cryptic errors, so I won't merge and release until we can safely flag which parts of Spotless do and don't support configuration cache. But at least the heavy lifting is now done. I expect |
Spotless is working well with configuration-cache in my testing, but limited to a single daemon. Fair warning @eskatos, our users are going to see this error message anytime they use configuration-cache from a different daemon than the one that wrote the cache:
And in #987, I make the case I made above that configuration-cache ought to design for minimum time-to-iteration, and that serialization round-trip is an unnecessary cost, and especially that relocatability is going to be difficult. But for common workflows, Spotless is soon to be configuration-cache happy (leaving the PR's to simmer for another day or two before releasing |
Support for configuration-cache is now available in |
Hi, even after updating to 6.0.3, Android studio reports spotless as not compatible with configuration cache Any hint? We use Can i help by providing any logs when built from console? This is reported when building Thanks a lot! |
I don't know anything about Android Studio, I just know that I'm using Spotless |
No worries, probably others will have the same issue 👍 |
@nedtwigg, we get lots of reports of users thrown off by the way this is currently working. The fact that it works differently than any other Gradle plugin is surprising. One needs to make a manual action on some "internal" files on disk to continue its workflow when a Gradle daemon expire or when starting a new daemon because the other one is busy or incompatible (e.g. cli vs IDE). I understand keeping your static cache is important performance wise. I also understand that JVM-local only configuration cache would make this easier for you, but it isn't planned. Could there be a way for the Spotless plugin to automatically rebuild that cache on fresh daemons loading from the configuration cache? |
The real blocker for us is that we cannot "rehydrate" ourselves - we do not support round-trip serialization. The reason we don't is that we abused the Java serialization mechanism to quickly implement equality checks, using I do think the Gradle configuration cache is designed to the wrong usecase (too slow for users, too many constraints for build developers), but I would happily submit to the ecosystem if I had time to rewrite our code. The workaround we are using right now is a good performance choice, but we aren't doing it for performance, it's just the easiest way we could implement support for round-trip serialization. It works well enough that no one has allocated enough of their time to submit a PR redoing all of the equality / serialization code for every formatter we support.
Yes! We want to work easily with the ecosystem, and we have an issue to automatically do the Even better for us would be if there was TL;DR
IMO, a JVM-local cache "devcache" is the best long-term solution, but I understand that it is not planned. There are things we can do ourselves at Spotless, and we're happy to take a PR from anyone that accomplishes this, but it has apparently not been worth the cost/benefit to anyone so far. FWIW, we just did some work to fix a warning about |
Hi Ned, Just to clarify, I was not suggesting to rehydrate when using a different daemon but to recalculate instead. In principle, make your static cache transient and recalculate when it is not available. I don't think #1209 is a good idea. It would just automate something wrong from the Gradle pov: fiddling with internal state files and nuking all cache entries that are for task graphs unrelated to Spotless. |
If we can throw Here is the state that defines how to run the "Black" python formatter (note that it does not implement hashCode or equals).
We have transient fields which model how to call the black formatter, and the only non-transient field is When Spotless gets loaded in a new daemon, we know what version to use, but we don't have values for any of the transient fields. To fix this, we have to rewrite the "State" class, so that two states with different serialized forms will still At the end of all this work, we'll fully support the Gradle cache model, but we'll be a little slower, and adding a new formatter will be a little bit more complicated than it used to be, because now each step has to support round-trip serialization, and we have to really audit the Round-trip serialization is a big constraint to put on a system. "Normalized form" is a nice trick for defining equality and serialization in one shot, if you're okay with giving-up roundtrip. It's a trick we lose when we implement round-trip serialization. |
I meant to make your whole cache transient wrt. the configuration cache. I'm not sure I recall correctly how you use Java serialization internally. As a heads up, gradle/gradle#14624 will be implemented relatively soon. It means that the serialization roundtrip will be done on the first invocation. This might make the current Spotless workaround not to work at all anymore. [EDIT] well, it might still work the same way, I'm not sure about your implementation. But you might want to consider the fact that this is the way Gradle will end up working. |
Our hack lets us do serialization roundtrips within a single JVM (by not actually doing the serialization). When Gradle requires multiple JVMs to do a single build, our approach will break, but until then we will keep working. |
Hi Ned, Gradle does require multiple JVMs: the Gradle daemon can expire. There are many use cases for which it happens: not running a build for 3h, going back home to get some sleep and working again in the morning etc.. :) The configuration cache should get promoted to a stable feature by the end of this year. It will eventually be made the default, tentatively a year after the promotion to stable. Many folks are already using the configuration cache and the current workflow with Spotless is creating friction (e.g. folks wondering if it should be recommended). The promotion to stable will bring a lot more users. Static state in plugins has been discouraged for ages. IIUC, currently the static state of the Spotless plugin cannot be recreated by a new Gradle daemon with the configuration cache enabled. If the state is kept static, couldn't the cache population be moved at execution time to a task (that can be up-to-date)? Otherwise, Gradle offers means to declare logic that needs to happen even when the configuration phase is skipped: shared build services. This might help. I guess what I'm after is an acknowledgment that Spotless currently doesn't fully support Gradle's configuration cache and a path forward to make it work for everyone. We can think of less bold changes in Gradle that could help but changing the configuration cache base contract (in-jvm cache) or how the Gradle phases work ( |
I wish we were in person so I could be sure to convey the goodwill that I mean to convey. Fwiw I feel goodwill from you :)
I didn't suggest a bold change from you until you required one from me :). And I'm not requiring anything from you, I'm telling you the workaround I am using to get the fastest build that I can get. You can say "A plugin in good standing must serialize its tasks between JVMs", but as a user I can say "Ease of plugin development and interactive serial performance are more important to us than supporting Gradle's full feature set". You can sell a toaster that says "You must have multiple toasters in case you want to move bread from one toaster to another", but I can still just have one toaster and not move bread. That will work for me, and so I will never put in the work to support moving bread because I don't actually have to ever move bread. I do not want to be a thorn in your side, and if you want Spotless to abandon our workaround I am willing to merge a PR that does that. I have described what it would take - it's a huge job, and at the end Spotless is slower, and it's harder to add new formatters to Spotless, but I'm willing to merge it because I am not seeking to be a problem. I have outlined the work required informally in a few places, but I have added a more detailed description here
|
Hi Ned, |
Gradle 6.6 introduces the configuration cache. It looks useful, and it would be great to refactor Spotless to support it. I am okay with bumping our minimum required Gradle in order to support it. gradle/gradle#13490
The text was updated successfully, but these errors were encountered: