-
Notifications
You must be signed in to change notification settings - Fork 123
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Defer CallStack
computations by representing them as data constructors.
#1202
Conversation
…ors. This avoids expensive calls to `combineCallStacks` in the common case; the actual call stack values are only computed in the case where they need to be printed out.
On a particular large example computation I have, which divides some degree-14,000 integer polynomials, here are the runtime numbers (when running without profiling):
So we're still paying a significant penalty for call stacks, but a significantly smaller one. |
@@ -92,7 +113,13 @@ combineCallStacks :: | |||
CallStack {- ^ call stack of the application context -} -> | |||
CallStack {- ^ call stack of the function being applied -} -> | |||
CallStack | |||
combineCallStacks appstk fnstk = appstk <> dropCommonPrefix appstk fnstk | |||
combineCallStacks appstk fnstk = CombineCallStacks appstk fnstk |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure how common this is, but is it worth checking for the special cases where one of the arguments is the EmptyCallStack
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good idea. That would cost very little to check for, and it would strictly reduce the memory consumption whenever that rule matches.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added special cases for EmptyCallStack
. Looking at +RTS -s
, any performance differences (on my use case at least) are lost in the noise. But we might as well keep it in; it only reduces total allocation by 0.01% but it might be more on other examples.
I don't know if we should do anything about it now, but I've noticed in the past that the call stack combination algorithm doesn't always produce the results I expect. My hypothesis is that this is because the algorithm I lifted from GHC assumes that lambda-lifting has been done, whereas Cryptol does not do this pass. As a result, functions nested with |
This avoids expensive calls to
combineCallStacks
in the common case; the actual call stack values are only computed in the case where they need to be printed out.This helps with #1201.