Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Defer CallStack computations by representing them as data constructors. #1202

Merged
merged 2 commits into from
May 27, 2021

Conversation

brianhuffman
Copy link
Contributor

This avoids expensive calls to combineCallStacks in the common case; the actual call stack values are only computed in the case where they need to be printed out.

This helps with #1201.

…ors.

This avoids expensive calls to `combineCallStacks` in the common case;
the actual call stack values are only computed in the case where they
need to be printed out.
@brianhuffman brianhuffman requested a review from robdockins May 26, 2021 19:45
@brianhuffman
Copy link
Contributor Author

On a particular large example computation I have, which divides some degree-14,000 integer polynomials, here are the runtime numbers (when running without profiling):

  • Cryptol master, with call stacks enabled: 50.7 s
  • Cryptol master, with call stacks disabled: 17.4
  • lazy-callstack branch, call stacks enabled: 28.0 s

So we're still paying a significant penalty for call stacks, but a significantly smaller one.

@@ -92,7 +113,13 @@ combineCallStacks ::
CallStack {- ^ call stack of the application context -} ->
CallStack {- ^ call stack of the function being applied -} ->
CallStack
combineCallStacks appstk fnstk = appstk <> dropCommonPrefix appstk fnstk
combineCallStacks appstk fnstk = CombineCallStacks appstk fnstk
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure how common this is, but is it worth checking for the special cases where one of the arguments is the EmptyCallStack?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good idea. That would cost very little to check for, and it would strictly reduce the memory consumption whenever that rule matches.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added special cases for EmptyCallStack. Looking at +RTS -s, any performance differences (on my use case at least) are lost in the noise. But we might as well keep it in; it only reduces total allocation by 0.01% but it might be more on other examples.

@robdockins
Copy link
Contributor

I don't know if we should do anything about it now, but I've noticed in the past that the call stack combination algorithm doesn't always produce the results I expect. My hypothesis is that this is because the algorithm I lifted from GHC assumes that lambda-lifting has been done, whereas Cryptol does not do this pass. As a result, functions nested with where sometimes get odd results.

@brianhuffman brianhuffman merged commit 33d7273 into master May 27, 2021
@RyanGlScott RyanGlScott deleted the lazy-callstack branch March 22, 2024 14:48
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants