You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
I know v1 is old news, so I'm mostly posting this for posterity and don't mind if it gets closed as "won't fix".
QuickSpec v1, e.g. 0.9.6 from Hackage, has a pretty severe space leak, eating up GBs of RAM for very simple theories, like the following:
data Nat = Z | S Nat
z = Z
s = S
p (S x) = x
plus Z y = y
plus (S x) y = S (plus x y)
times Z y = Z
times (S x) y = plus y (times x y)
exp x Z = S Z
exp x (S y) = times x (exp x y)
max Z y = y
max (S x) Z = S x
max (S x) (S y) = S (max x y)
From what I can tell, the problem is exp building huge unary Nat values. From what I can tell, this shouldn't be a problem due to laziness: if we have x :: Nat and y :: Nat, we should be able to compare them with == or <= in constant space: if the outer constructors differ, return an appropriate Bool; if they're both S then perform a tail call to the inner thunks and garbage-collect the outer ones.
I dug into this a little, but gave up. My methodology was to build everything with profiling support, and run with +RTS -M100m -xc; this causes an out-of-memory exception when the heap reaches 100MB, and prints out a stack trace of the exception which helps to narrow down the problem. I found the following:
In Test.QuickSpec.Generate.generateTermsSatisfying there is a space leak on the following line:
quietly $ printf "%d terms, " (count sum length ts)
As far as I understand it, ts is a list built out of the nodes/leaves of the test tree up to the current depth. Getting the length forces this whole chunk of the test tree to be evaluated. Since the test tree is also used later on to do the actual exploration, it can't be garbage collected and hence leaks space.
If the shutUp option is given to this function, such messages are skipped, so the leak disappears. Unfortunately the Test.QuickSpec.Main module seems to hard-code shutUp to be False. I just deleted these lines to fix the problem (the Depth ... lines are fine; except they're sent to stdout when they should really go to stderr).
The next space leak showed up in Test.QuickSpec.TestTree.test'. It looks like the problem is partitionBy, which forces elements in order to sort them. The definition of partitionBy turns the given list into a list of pairs, with the original elements in the fst position and the processed elements in snd position; this list is sorted by the snd position, grouped by the snd position, then the fst values are extracted.
I tried changing this to avoid having these complete listsin memory at once. Rather than pairing up all elements with their processed version, I tried pushing the value function into the comparison function, so we'd take in two normal elements, send them through value and compare the results. This trades space for time: we don't have to store the results of running value on every element, but we do have to run value over and over as we compare.
This didn't seem to help; I imagine it's the input list itself that's too big. A further optimisation might be to push the evaluation of these list elements into the comparison function: accept values x and y from the list, clone their thunks, evaluate those thunks, send them through value and compare the results. This way, the thunks in the list will remain unevaluated, and the cloned thunks can be garbage-collected after each comparison. I gave up at this point because:
Current GHC doesn't support cloning thunks. I tried porting https://hackage.haskell.org/package/ghc-dup to the GHC I'm using (7.10.3), but the C-- code is performing a trick which is difficult to port:
When deepDup is called on a value, it's passed to deepDupFun.
The C-- code for performing the duplication recurses through the given thunk until it finds a call to deepDupFun.
To check whether a thunk is a call to deepDupFun, the symbol name is compared to a hard-coded constant.
GHC has changed the way it generated symbol names. Rather than using the package name, it now uses the package key.
Package keys are generated at compile time, based on a Merkle tree of the dependencies' package keys, so there's no way to hard-code it. I tried using CPP but still got errors about missing symbols in the .so file.
I'm not sure this would even solve the memory problem, e.g. if the list of thunks is too big. Maybe we'd need even more invasive changes, like sorting an array of indices which generate elements on-demand, etc.
Even if this space leak is fixed, there's probably another one waiting to be triggered :(
The text was updated successfully, but these errors were encountered:
Yikes, that's awkward that just printing the statistics causes a space leak...
One trick you could try is using observational equality instead of Ord. For example, you can define an observer function for Nat that simply converts it into an integer. I think the Nats will then not be stored in memory.
In general, you can do this trick for any datatype by giving a hash function as the observer function.
I know v1 is old news, so I'm mostly posting this for posterity and don't mind if it gets closed as "won't fix".
QuickSpec v1, e.g. 0.9.6 from Hackage, has a pretty severe space leak, eating up GBs of RAM for very simple theories, like the following:
From what I can tell, the problem is
exp
building huge unaryNat
values. From what I can tell, this shouldn't be a problem due to laziness: if we havex :: Nat
andy :: Nat
, we should be able to compare them with==
or<=
in constant space: if the outer constructors differ, return an appropriateBool
; if they're bothS
then perform a tail call to the inner thunks and garbage-collect the outer ones.I dug into this a little, but gave up. My methodology was to build everything with profiling support, and run with
+RTS -M100m -xc
; this causes an out-of-memory exception when the heap reaches 100MB, and prints out a stack trace of the exception which helps to narrow down the problem. I found the following:In
Test.QuickSpec.Generate.generateTermsSatisfying
there is a space leak on the following line:As far as I understand it,
ts
is a list built out of the nodes/leaves of the test tree up to the current depth. Getting the length forces this whole chunk of the test tree to be evaluated. Since the test tree is also used later on to do the actual exploration, it can't be garbage collected and hence leaks space.If the
shutUp
option is given to this function, such messages are skipped, so the leak disappears. Unfortunately theTest.QuickSpec.Main
module seems to hard-codeshutUp
to beFalse
. I just deleted these lines to fix the problem (theDepth ...
lines are fine; except they're sent tostdout
when they should really go tostderr
).The next space leak showed up in
Test.QuickSpec.TestTree.test'
. It looks like the problem ispartitionBy
, which forces elements in order to sort them. The definition ofpartitionBy
turns the given list into a list of pairs, with the original elements in thefst
position and the processed elements insnd
position; this list is sorted by thesnd
position, grouped by thesnd
position, then thefst
values are extracted.I tried changing this to avoid having these complete listsin memory at once. Rather than pairing up all elements with their processed version, I tried pushing the
value
function into the comparison function, so we'd take in two normal elements, send them throughvalue
and compare the results. This trades space for time: we don't have to store the results of runningvalue
on every element, but we do have to runvalue
over and over as we compare.This didn't seem to help; I imagine it's the input list itself that's too big. A further optimisation might be to push the evaluation of these list elements into the comparison function: accept values
x
andy
from the list, clone their thunks, evaluate those thunks, send them throughvalue
and compare the results. This way, the thunks in the list will remain unevaluated, and the cloned thunks can be garbage-collected after each comparison. I gave up at this point because:deepDup
is called on a value, it's passed todeepDupFun
.deepDupFun
.deepDupFun
, the symbol name is compared to a hard-coded constant..so
file.The text was updated successfully, but these errors were encountered: