Fix slow doctests or mark # long time #35443
A test should take < 1s or else be marked # long time. When possible we fix the test to take less time, otherwise we just mark the test as long time.
Rebased to 10.0.beta9 and added more commits. This is long but it should be easy to review since it's mostly adding A few changes are either reducing the size of the test, or fixing so it takes less time and it doesn't need to be marked long time. If it is easier to review, I could either With this PR + a few more changes that I will PR separately, I have no tests taking more than ~ 5s. This saves ~10-15% of total test time (from 215s to 187s with -tp 32). Bear in mind that some tests are much faster with -tp1 than with -tp32. There are still ~ 700 tests taking more than ~ 1s, but I will stop here. |
@@ -298,6 +298,7 @@ def __init__(self, n, q, D, secret_dist='uniform', m=None):
sage: from numpy import std
sage: while abs(std([e if e <= 200 else e-401 for e in S()]) - 3.0) > 0.01:
....:     L = []  # reset L to avoid quadratic behaviour
isn't the idea of this test that by increasing the number of samples, the error bound will be hit?
I'm not sure. To be honest, I'm not sure what the role of this test is, but the previous implementation exhibits quadratic behavior, which is why this test is usually fine but sometimes very slow:
sage -t --warn-long --random-seed=110988274722243807127083377606682083581 src/sage/crypto/lwe.py
**********************************************************************
File "src/sage/crypto/lwe.py", line 300, in sage.crypto.lwe.LWE.__init__
Warning, slow doctest:
while abs(std([e if e <= 200 else e-401 for e in S()]) - 3.0) > 0.01:
add_samples()
Test ran for 16.66 s, check ran for 0.00 s
[112 tests, 17.29 s]
vs.
sage -t --warn-long 0.4 --random-seed=1 src/sage/crypto/lwe.py
**********************************************************************
File "src/sage/crypto/lwe.py", line 300, in sage.crypto.lwe.LWE.__init__
Warning, slow doctest:
while abs(std([e if e <= 200 else e-401 for e in S()]) - 3.0) > 0.01:
add_samples()
Test ran for 0.47 s, check ran for 0.00 s
[112 tests, 1.37 s]
As far as I understand, they want to show that these samples indeed have a normal distribution with standard deviation 3.0. They take 1000 samples and want the standard deviation of these to be close to 3.0. Otherwise they keep adding samples, etc. until the standard deviation of the samples is indeed close to 3.0.
However, as implemented, this becomes O(n^2) when they have to try 1000n samples.
With my change, instead of adding more samples, we take a new set of 1000 samples. In this way, trying 1000n samples is O(n). So, even if we have to try more times, this is better.
Moreover, now that this is linear instead of quadratic, it's even faster to try a sample of 100.
Summary: the new version of the test keeps computing the standard deviation of 100 samples until it's really close to 3.0.
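The cost argument above can be illustrated outside Sage. The following is a minimal sketch in plain Python, not the code from lwe.py: the function names, the `draw` callback and the `work` counter are assumptions made for illustration. It counts how many sample values each strategy feeds to the standard-deviation computation.

```python
import statistics


def accumulate_until_close(draw, target, tol, batch=1000, max_rounds=50):
    """Old strategy: grow one list and recompute the std over all of it."""
    samples = []
    work = 0  # values fed to the std computation (illustrative cost measure)
    rounds = 0
    while rounds < max_rounds:
        rounds += 1
        samples.extend(draw() for _ in range(batch))
        work += len(samples)  # round k looks at k * batch values
        if abs(statistics.pstdev(samples) - target) <= tol:
            break
    return rounds, work


def resample_until_close(draw, target, tol, batch=1000, max_rounds=50):
    """New strategy: draw a fresh batch each round and discard the old one."""
    work = 0
    rounds = 0
    while rounds < max_rounds:
        rounds += 1
        samples = [draw() for _ in range(batch)]
        work += len(samples)  # every round looks at exactly batch values
        if abs(statistics.pstdev(samples) - target) <= tol:
            break
    return rounds, work
```

If both strategies need n rounds, the first processes batch * n(n+1)/2 values in total and the second batch * n, which is the O(n^2) versus O(n) difference described above. A sampler such as `random.Random(1).gauss` wrapped in a zero-argument function can be passed as `draw` to experiment with this.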
Let me know if you are happy with my explanation. Otherwise, I'll revert and place a # long time label (although I'd be more inclined to just nuke the test).
I think this was the only non-cosmetic objection you had (and the cosmetic ones are more or less all addressed).
I think I would be more comfortable if we just get rid of the while loop, compute and print the std and mark the result as random.
That said, your solution is of course fine.
It takes several hours for me to run the test suite without NB: now that we've moved to Github, our notifications are once again being sent through SendGrid who regularly and intentionally violate the mail RFCs to delete my notifications (https://www.mail-archive.com/sage-devel@googlegroups.com/msg88600.html). Please keep that in mind if you ever want to draw my attention to a ticket. |
@mkoeppe Thanks for your review. I added your suggestions. Also a minor change to a doctest suggested by codecov (it turns out I changed one line of code in |
For me it is now taking 4786 cputime seconds, or 187 wall time (using
I'm sorry about that. EEE at work. |
sage: L.<b> = K.extension(x^2 + 26) # optional - sage.rings.number_field
sage: EL = E.change_ring(L) # optional - sage.rings.number_field
sage: iso2 = EL.isogenies_prime_degree(2); len(iso2) # optional - sage.rings.number_field
sage: pol = NumberField(pol26,'a').optimized_representation()[0].polynomial() # optional - sage.rings.number_field, long time
Also here
Oh, thanks... I can do a grep and find all of those. Is there a style guide about this? I was often unsure which column to place the first # in, how to separate labels, etc. The only rule I know is that the first # needs to have 2 spaces before it. Other than that, every convention I could think of is represented in some part of the code...
E.g. some places do # optional - A # optional - B but other places do # optional - A B, etc.
I'm not even sure about the "legal" syntax, much less about the "preferred" style.
We should definitely add a style guide for this; and a linter/fixer for these would probably also be useful (see #35401).
Both forms are correct. The style # optional - A # optional - B in many places comes from using my simple editor macros.
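A fixer of the kind mentioned here could start from something like the sketch below. It is a hypothetical helper in plain Python, not the tool discussed in #35401 and not part of Sage's doctest framework; the name `merge_optional_tags`, the regular expression and the choice of separators are all assumptions. It only collapses the `# optional - A # optional - B` form into `# optional - A B`, puts two spaces before the first `#`, and leaves other tags in place. It does not handle a `#` inside a string literal.

```python
import re

# Matches one comment segment of the form "optional - feature [feature ...]".
OPTIONAL_RE = re.compile(r"optional\s*-\s*(.+)")


def merge_optional_tags(line):
    """Collapse repeated '# optional - X' annotations on one doctest line."""
    code, sep, comment = line.partition("#")
    if not sep:
        return line  # no comment on this line
    features = []  # feature names, in first-seen order, without duplicates
    segments = []  # kept comment segments; None marks the merged tag's slot
    for segment in (s.strip() for s in comment.split("#")):
        match = OPTIONAL_RE.fullmatch(segment)
        if match:
            if not features:
                segments.append(None)
            for name in match.group(1).split():
                if name not in features:
                    features.append(name)
        elif segment:
            segments.append(segment)
    if not features:
        return line  # nothing to merge, leave the line untouched
    merged = "optional - " + " ".join(features)
    parts = [merged if s is None else s for s in segments]
    return code.rstrip() + "  # " + " # ".join(parts)
```

The merged tag stays where the first `# optional` annotation was, so the sketch takes no position on how the tags should be ordered.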
@mkoeppe I did reorder almost all These are all the exceptions:
Only the last three seem like they could be reordered, however, the |
I think I'm done here; unless something else is really necessary, I'd rather finish this PR.
Aside: this "codecov" check is quite annoying, since I don't know how to make it happy. It seems all of my latest PRs are marked as check failures because of this. A couple had actual errors, but since all of them have red crosses, it's not immediately obvious which ones.
I think we should be really serious about CI passing, and PRs should be reworked if some check fails. But it's quite frustrating to aim for a moving target when we don't know how it works. Is it possible for the codecov checks to run and report something without being taken into account in the global "pass/fail" decision for a PR? Also, maybe it's better to have a separate repo / branch where CI experiments are carried out before being pushed to develop?
sage: K = NumberField(x**2 - 29, 'a'); a = K.gen()
sage: E = EllipticCurve([1, 0, ((5 + a)/2)**2, 0, 0])
sage: sage.schemes.elliptic_curves.gal_reps_number_field._non_surjective(E) # See Section 5.10 of [Ser1972].
sage: sage.schemes.elliptic_curves.gal_reps_number_field._non_surjective(E) # See Section 5.10 of [Ser1972]. # long time
[3, 5, 29]
sage: E = EllipticCurve_from_j(1728).change_ring(K) # CM
sage: sage.schemes.elliptic_curves.gal_reps_number_field._non_surjective(E)
This one is really far right (column 130). However, it seems the comment before may be more important.
In situations like this, I have often rewritten the test as from sage.schemes.elliptic_curves.gal_reps_number_field import _non_surjective (even if this import line is very long).
I haven't followed the recent work on codecov; maybe @tobiasdiez or @kwankyu can comment on this |
I don't really have a strong preference for the order of In any case, aligning the annotations in a column certainly reduces the visual clutter and is a good thing to do when one makes changes to these lines anyway. |
From what I observed, it's actually not an issue with codecov; rather, some tests have random input and thus trigger different code paths. I've opened #35522 for this. If you experience any other problems, please open a new issue and I'll have a look.
Please have a look at the
However, if you look at the diff I added a test in lines 1046 and 1047 that would fail without the change I did to L1064. If that is not covering this line, please tell me what would cover that change. As for the other issue: maybe codecov could be run with |
Looking at https://app.codecov.io/gh/sagemath/sage/pull/35443/blob/src/sage/plot/animate.py#L1064 I think I understand what is going on here. The
Maybe an option is to run long tests at least for the files that are changed in the patch. In fact, it'd be nice to run the whole test suite in "normal" mode and the changed files a second time in "long" mode. This could also catch cases where doctesting works in "long" mode but not in "normal" mode because of a missing
In this particular case, maybe there could be a test that calls
But before worrying about that, we should be clear about what the expectation is: do we aim for 100% code coverage on normal tests? On long tests? Is this an aim for the whole codebase, or just for lines that change? Whatever the answer to those questions, IMO we must stick to it: either do not merge PRs that don't pass the coverage check (with a few reasonable exceptions), or else don't make coverage failure part of PR failure. Otherwise, we risk making the whole CI check useless.
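The two-pass idea in this comment (whole suite in normal mode, changed files again in long mode) could be sketched as follows. This is a hypothetical illustration, not an existing CI script: the function name, the `src/` prefix filter and the list of file suffixes are assumptions. The only Sage options used are `-tp`, `--all` and `--long`.

```python
# Suffixes assumed to contain doctests; adjust to the project's actual layout.
DOCTESTED_SUFFIXES = (".py", ".pyx", ".pxd", ".rst")


def doctest_commands(changed_paths, threads=8):
    """Build the command lines for a normal pass plus a long pass on changed files."""
    targets = sorted(
        {
            path
            for path in changed_paths
            if path.startswith("src/") and path.endswith(DOCTESTED_SUFFIXES)
        }
    )
    # First pass: the whole test suite in normal mode.
    commands = [["sage", "-tp", str(threads), "--all"]]
    if targets:
        # Second pass: only the files touched by the PR, including long tests.
        commands.append(["sage", "-tp", str(threads), "--long", *targets])
    return commands
```

Each returned list can be handed to `subprocess.run`. The list of changed paths would have to come from the CI system or from git, which is left out here.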
As per my previous comment, I added a small quick test that should satisfy |
Documentation preview for this PR is ready! 🎉 |
I don't think such a hybrid mode is supported (yet) by our doctest framework, or is it? Maybe we can always run all long tests in CI, or would this take too long?
As far as I understand it, Sage puts a high priority on writing tests with high coverage. Striving for 100% coverage, however, is usually not a good idea, since the additional tests you create to cover "trivial branches" create a maintenance overhead without providing real value. Maybe we should move this discussion to a new issue?
I'd be +1 on running the long tests in CI. And before running the long tests, perhaps we can run the changed files of the PR first (similar to |
📚 Description
A test is supposed to take < 1s or else be marked # long time.
Here we consider slow tests taking >> 10s. When possible we fix or change the test to take less time, otherwise we just mark the test as long time. Occasionally we create a new smaller test and keep the original one as long.
After this and #35442 the slowest tests are a few taking ~ 10s.
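To find which doctests exceed a given threshold, one can scan --warn-long output of the kind quoted in the conversation. The sketch below is a plain-Python illustration, not part of Sage: the function name and the regular expressions are assumptions based only on the log excerpts shown in this PR, and the real output format may differ in other cases.

```python
import re

# Patterns inferred from the log excerpts in this PR (an assumption).
FILE_RE = re.compile(r'^File "(?P<path>[^"]+)", line (?P<line>\d+), in \S+')
TIME_RE = re.compile(r"^Test ran for (?P<secs>\d+(?:\.\d+)?) s")


def slow_doctests(log_text, threshold=1.0):
    """Return (path, line, seconds) for each reported test at or above threshold."""
    found = []
    current = None  # location of the most recent 'File ...' header
    for raw in log_text.splitlines():
        line = raw.strip()
        match = FILE_RE.match(line)
        if match:
            current = (match.group("path"), int(match.group("line")))
            continue
        match = TIME_RE.match(line)
        if match and current is not None:
            secs = float(match.group("secs"))
            if secs >= threshold:
                found.append((current[0], current[1], secs))
            current = None  # each header is paired with one timing line
    return found
```

With the default threshold of 1.0 this corresponds to the "< 1s or else # long time" rule. A larger threshold such as 10.0 narrows the result to the very slow tests this PR targets.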
The total time to doctest all goes down from 880 to 806 seconds (using -tp 8 --all).
NOTE: there's a minor merge conflict with #35314 which I will resolve once that PR is merged.
📝 Checklist