-
-
Notifications
You must be signed in to change notification settings - Fork 480
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
better numerical accuracy testing #10952
Comments
Attachment: noise.patch.gz |
This comment has been minimized.
This comment has been minimized.
comment:2
Amazing speed getting this up! Shouldn't "((?:abs(?:solute)?)" be "((?:abs(?:olute)?)" ? |
comment:3
Nitpicky: This comment is now incorrect:
Also, something should be added to the docs about this new option. Perhaps here: http://sagemath.org/doc/developer/conventions.html#further-conventions-for-automated-testing-of-examples |
apply only this patch to the bin repo |
Attachment: 10952-tol-bin.patch.gz |
comment:4
Attachment: 10952-tol-doc.patch.gz Thanks for the comments, new patches attached. |
comment:6
Would this works with test of that kind as well:
or even
(test failures courtesy of my work on python-2.7) |
comment:7
Yes, it would (though you do have to explicitly annotate the test with a tolerance, and if Python 2.7 consistently gives, e.g., 3.4 then we should change the doctest rather than make it fuzzy.) |
comment:8
In trying to review this ticket, I applied 10952-tol-bin.patch in the local/bin directory of a skynet/taurus (x86_64-Linux-nehalem) build of sage-4.7.rc2. I next did 'sage -b'. Yet I get
Am I missing something? |
comment:9
No, that is correct. The example in the description is an example of something that would pass doctests (even though the true output doesn't match exactly, you're seeing what the actual output is). |
comment:10
Another nitpicky: the doc patch misspells "arithmetic". Definitely not big enough to stop a positive review, but also not big enough to be worth my effort in making a new patch right now. |
comment:11
Attachment: 10952-tol-doc.2.patch.gz Argh. Fixed. |
comment:12
Thanks! If we had a github pull system, or a system where I could edit the patch inline, I would have fixed it. |
comment:13
Replying to @jasongrout:
I have to admit that would be convenient. But wouldn't that create a privileged class of commit people who don't need review? At least that seems to be the model for the projects I'm familiar with. |
Reviewer: Jason Grout |
Author: Robert Bradshaw |
comment:14
That person would be the release manager. He would be the only one able (or supposed to) merge into the master branch. On the other hand, given our problems with finding release managers who could do the entire release process, maybe sharing the burden to commit patches to master wouldn't be a bad idea. |
comment:15
I applied 10952-tol-bin.patch in the local/bin directory of a skynet/taurus (x86_64-Linux-nehalem) build of sage-4.7.rc2. I next did 'sage -b'. I then modified a doctest to include the lines in
I then did 'sage -b' again. When I run the doctest, I get
Note that if I put the lines
in the doctest, then the doctest passes. |
comment:31
Replying to @jhpalmieri:
But that doesn't match You can of course match funny things first, i.e. use more general patterns, but you then have to check the matched expression further before passing it to The whole idea isn't that bad, but carries the danger that people simply increase the tolerance (too much) just to make doctests pass somehow, without caring where the variations originate from. |
comment:32
But Re the issue of raising tolerance, I hope that referees will keep an eye on that sort of thing. Dave Kirkby, for example, is a strong advocate of not arbitrarily raising tolerance. |
Attachment: 10952-tol-bin.2.patch.gz |
comment:33
I've fixed the issue with blanklines terminating example blocks. As for the regular expression for floats, I don't see any need to make it more complicated--I'd rather match miss-typed numbers and raise an error than silently ignore them. If you feel strongly, this could be changed. Referees should keep an eye on precision. This is back to my philosophy that a computer should run doctests and a human read the code (though we could add a patchbot plugin to flag drops in precision). It's not a new issue--we've been using "..." for quite a while now which is less flexible and more dangerous (as it can match more than just a single number). |
This comment has been minimized.
This comment has been minimized.
comment:35
How about emitting the extra code only if That's a simple |
comment:36
You're talking about the |
comment:37
Replying to @robertwb:
Added complexity? One boolean variable, initialized to false and set to This not only reduces the file size and run (import!) time but also avoids more potential name clashes. Why should one add useless code to each and every file to doctest not using |
comment:38
Either we make this variable a global (ugly) or return it on from call to doc_preparse along with the parsed docstring (also ugly). All to save a fraction of a block (~1KB) of an ephemeral file and < 1ms on an already fairly expensive operation. If we're looking to cut fat, it'd be better to avoid the quadratic-time creation of the doctest file string. The name clash is a red herring--if we avoid it only because # tol is not used, that's a potentially latent bug in my mind. The function could be renamed if need be. |
comment:39
I don't see anything wrong with more complicated regular expressions, but I like regular expressions. So leave it as is if you want. Meanwhile, the output for failures using tolerances isn't nice, compared to other sorts of failures:
Notice that the line number is missing and the traceback is present. I'm attaching a referee patch which tries to fix this. Please take a look. For me, this gives this sort of output (with a slightly different file):
Since the whole ticket had a positive review earlier, I think if you're happy with my changes, you can switch it back to that status. Finally, the issue with doctests in |
This comment has been minimized.
This comment has been minimized.
scripts repo; apply on top of other scripts patch |
comment:40
Attachment: trac_10952-ref.patch.gz Replying to @jhpalmieri:
Yes, only
should always pass. ;-)
Haven't tested your patch (nor looked at it), but the output you gave looks much better.
I won't object, though I in general don't like the idea of adding code (regardless how small the overhead or impact might be) unconditionally, i.e., if there's no need to do so. We have similar situations all around, where everybody adds "just a little", IMHO. |
comment:41
Replying to @nexttime:
Currently, it raises an error indicating that the doctest itself is bad.
Yes, it's a tradeoff between adding "just a little" python parsing overhead to the testing infrastructure or "just a little" human parsing overhead to the testing infrastructure. I'd rather put the burden on machines than developers in this case. The reviewer patches look good, certainly an improvement, so I'm setting this back to positive review unless there's something else that needs to be taken care of. |
This comment has been minimized.
This comment has been minimized.
Changed keywords from sd32 to sd32 noise noisy doctest failure error tolerance |
Merged: sage-4.7.2.alpha3 |
comment:44
I'd like to draw the attention of folks here to #12493 comment:7. Apparently one can't do In fact, I only found one occurrence in 5.0.beta3. Can that be right? This has been in Sage for months!
Anyway, even if my analysis is wrong (let's hope it's easier than that), I figure the people here can give a quick diagnosis of #12493. |
comment:45
Actually, #12493 comment:8 is even better! Optional and tol don't play well together. |
If a line contains
tol
ortolerance
, numerical results are onlyverified to the given tolerance. This may be prefixed by
abs[olute]
orrel[ative]
to specify whether to measure absolute or relative error; defaults to relative error except when the expected value is exactly zero:This can be useful when the exact output is subject to rounding error and/or processor floating point arithmetic variation.
Related:
.zero_at(epsilon)
methods, to fix noisy (and signed) zeroes; see for example zero_at() method for RDF/CDF vectors #11848.Apply
to the Sage scripts repository.
Apply
to the Sage library repository.
CC: @jasongrout @kcrisman
Component: doctest coverage
Keywords: sd32 noise noisy doctest failure error tolerance
Author: Robert Bradshaw, Rob Beezer
Reviewer: Jason Grout, Mariah Lenox, William Stein, John Palmieri
Merged: sage-4.7.2.alpha3
Issue created by migration from https://trac.sagemath.org/ticket/10952
The text was updated successfully, but these errors were encountered: