Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Optimization] Cache contract generation and pre-compile some match expression #2013

Merged
merged 3 commits into from
Aug 2, 2024

Conversation

yannham
Copy link
Member

@yannham yannham commented Jul 26, 2024

This PR is guided by the metrics gathered for some run of Nickel on large codebases, which showed that some contracts were generated again and again, while coming from one initial annotation. While it's expected that an annotation might give raise to many runtime evaluations (a contract in the body of a function, typically), it's wasteful that we generate the code at runtime many times (contract generation is a form of compilation from a static type to a Nickel expression, and it shouldn't happen all the time during execution).

The issue is that we now store types directly in the AST and consider them weak head normal form. So, each time we apply such a contract, contract/apply get a typ::Type, and need to make a contract out of it: generation happens again and again. This PR adds a new field to Term::Type, contract: RichTerm, which is populated at parsing time with the contract generated from the type (which is close to the behavior before the introduction of Term::Type). Doing so, we should only generate one contract for each user annotation in the code.

Metrics indeed show that in some big examples, the worst case was generating ~1700 times the same contract, which is brought down to one. We also tried to closurize the generated contract, so that it doesn't have to be evaluated several time, but this showed no improvement whatsoever, while it increases the number of allocated thunks, so it's been reverted.

Additionally, in the same spirit of "generating more statically, and don't do it again at runtime", we precompile the match expression generated when converting an enum contract, so that it's not recompiled again and again each time the contract is applied. This showed a reduction of around 25 to 30% of the total number of compilation of match expressions on the private benchmark.

The concrete time again aren't spectacular, as it's saving mostly Rust function calls, and not Nickel, the latter being of course the dominant cost, but there is still a small improvement.

After the introduction of a proper node for types in the AST, we
switched to a lazy way of generating contract: we keep the type as it
is, and convert it to a contract only once it's actually applied. While
simpler, this approach has the drawback of potentially wasting
computations by running the contract generation code many times for the
same contract, as some metrics showed (up to 1.6 thousand times on large
codebases).

This commit add a `contract` field to `Term::Type`, and statically
generates - in the parser - the contract corresponding to a type. This
is what we used to do before the introduction of a type node. Doing so,
we only generate at most one contract per user annotation, regardless of
how many times the contract is applied at runtime.
@github-actions github-actions bot temporarily deployed to pull request July 26, 2024 16:28 Inactive
This commit saves some re-compilation of the match expression generated
by contracts converted from enum types by generating the compiled
version directly instead of a match.
@yannham yannham requested a review from jneem July 26, 2024 16:28
@github-actions github-actions bot temporarily deployed to pull request July 26, 2024 16:31 Inactive
@aspiwack
Copy link
Member

What are the estimate performance gains from this PR?

@yannham
Copy link
Member Author

yannham commented Jul 29, 2024

What are the estimate performance gains from this PR?

As a disclaimer, the improvement is hard to measure properly with the micro benchmarks (aka it doesn't show), in particular because we don't really benchmark using the same contract again and again (which does happen in real life as measurements show, probably because of merging, array contracts, etc.), and I feel a micro benchmark on that would be rather artificial. The only example where it can show measurable effects is a 120kLoC real code base, which takes around 20sec to evaluate, which starts to be long running enough to see some differences.

I see a reduction of around 2% on several runs in user time (but system time, which is the predominant cost, is varying quite a bit, making the final result potentially less, at least on my very volatile laptop). Could be interesting to ask the original users, which are moreover using OsX, if they see something different in their setup.

What motivated this PR originally is:

  • a few commits earlier, master was atrociously slower than latest stable release on said example. The contract system rework was of course an important factor, but adding a type node to the AST is another thing we implemented after 1.7. In some sense, this PR is trying to do something closer to how it worked in 1.7, with the idea of closing the gap.
  • This PR only decreases strictly all the stuff that we measure at the level of the interpreter (thunks allocation, environment creation/size/cloning, all primop application, contract generation, etc.), so at the macro level, it should be a strict improvement. We're not changing the AST layout (that is, the size and alignement of the Term structure), which is already a big enum with larger variants. Of course it doesn't preclude memory layout effects that could indeed negate any potential benefit of this change, but I suppose it still limits the possibility.

I intend to do a larger rework of the AST after the next (imminent) release of 1.8 anyway, so this change might not be very impactful on the long term. It's more of a "back to 1.7 perf" band-aid. However it's not hugely impactful even now, so I'm fine ditching it if others think it's not worth it.

@github-actions github-actions bot temporarily deployed to pull request July 29, 2024 10:32 Inactive
@yannham
Copy link
Member Author

yannham commented Aug 2, 2024

I'm going to merge this for 1.8, if that's fine by everyone. It's not fundamentally very difficult to revert, and I plan to rework quite a bit the AST and introduce multiple IRs for 1.9 anyway, so this change might become obsolete.

@yannham yannham added this pull request to the merge queue Aug 2, 2024
Merged via the queue into master with commit b3c8e30 Aug 2, 2024
5 checks passed
@yannham yannham deleted the perf/cache-contract-gen branch August 2, 2024 06:55
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants