Tolerance expressed in percentage now computes correctly. [BLD-522] #3489
Conversation
Adding tests is always good!
@auraz @valera-rozuvan @singingwolfboy This is an example of a test that did not pick up an error in the source code. Here are the details: the erroneous code in src was (with complex1: student result; complex2: instructor result):
In the original tests, the answer was 4.0 and the tolerance 10%. The range of correct answers is then [3.6, 4.4]. But 4.44 would be marked as a correct answer, since tolerance = 0.1 * max(4, 4.44) = 0.444, so the erroneous range of correct answers was set to [3.556, 4.444]. The tests only checked 4.5, not a value closer to the bounds of correctness. I modified this, but the incorrectness test arbitrarily uses 4.4000001. Should something even tighter be put in place? If so, what do you think it should be?
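The behavior described above can be sketched as follows. This is an illustrative reconstruction, not the actual capa code: the function names and exact formulas are hypothetical, chosen to match the ranges quoted in the discussion ([3.556, 4.444] for the bug, [3.6, 4.4] for the intended behavior).

```python
# Hypothetical sketch of the tolerance bug discussed above.

def compare_buggy(student, instructor, percent):
    # Buggy: the bound scales with max(|student|, |instructor|), so a
    # student answer just outside the intended range widens its own bound.
    # For 4 +- 10%, a student answer of 4.44 gives tol = 0.444.
    tol = percent / 100.0 * max(abs(student), abs(instructor))
    return abs(student - instructor) <= tol


def compare_fixed(student, instructor, percent):
    # Intended: the bound is relative to the instructor's answer only,
    # giving the range [3.6, 4.4] for 4 +- 10%.
    tol = percent / 100.0 * abs(instructor)
    return abs(student - instructor) <= tol
```

With these definitions, 4.44 is wrongly accepted by the buggy version but rejected by the fixed one.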
@jmclaus this is not a copy-paste error. This code was originally there. Please also fix https://github.com/edx/edx-platform/blob/master/common/lib/capa/capa/util.py#L33, as it is a copy-paste of https://github.com/edx/edx-platform/blob/master/common/lib/capa/capa/util.py#L36
Also, I suggest renaming complex1 and complex2 to something like student_complex and teacher_complex.
Regarding tighter tests: I can't think of anything that needs tighter testing right now.
# Mixed negative/positive range
problem = self.build_problem(answer=0, tolerance="10%")
correct_responses = ["0", "0.1", "-0.1", "0.10", "-0.10"]
incorrect_responses = ["", "-0.1000001", "0.1000001", "0"]
Why is zero in the incorrect responses?
@auraz No reason, removed.
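The answer=0 case in the snippet above is interesting: a purely relative bound would be 10% of 0, i.e. 0, and would reject everything except 0 itself, yet 0.1 is listed as correct and 0.1000001 as incorrect. That suggests the percentage fraction (0.1 for "10%") acts as an absolute bound when the answer is zero. A hedged sketch of that reading, not the real capa implementation:

```python
# Hypothetical sketch of the zero-answer behavior implied by the test
# above; inferred from the correct/incorrect response lists, not taken
# from the actual capa code.

def within_percent_tolerance(student, answer, percent):
    frac = percent / 100.0
    # 10% of 0 is 0, so fall back to the raw fraction as an absolute
    # bound when the answer is zero (assumption inferred from the test).
    tol = frac * abs(answer) if answer != 0 else frac
    return abs(student - answer) <= tol
```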
@jmclaus for some reason the Jenkins tests are not running for this PR.
@auraz
@auraz
@auraz Fixed.
@auraz It seems like a lot of current PRs do not have Jenkins tests running.
@jmclaus do you know how to run a manual build?
@olmar (since @alex is out for the week), @valera-rozuvan @polesye The tests now all pass locally; please review. The initial fix I did solved BLD-522 but was breaking other things. See http://randomascii.wordpress.com/2012/02/25/comparing-floating-point-numbers-2012-edition/
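The floating-point pitfalls behind the linked article are easy to demonstrate in Python; this is why tolerant comparison (and careful test bounds like 4.4000001) matters here. math.isclose is the Python 3.5+ stdlib helper for combined relative/absolute comparison, shown only as a point of reference, not as what this PR uses:

```python
import math

# Naive equality on floats is fragile: 0.1 + 0.2 is not exactly 0.3.
print(0.1 + 0.2 == 0.3)                             # False
print(math.isclose(0.1 + 0.2, 0.3, rel_tol=1e-9))   # True

# At large magnitudes, nearby representable doubles are far apart,
# so two ways of writing "the same" number can differ by a lot.
print(1.9e24 - 1.9 * 10**24)                        # nonzero
```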
I am not the @alex you are looking for :-) |
@jmclaus I reviewed and also tested your PR manually. It is OK for the issue described in the ticket,
but I am not sure about your concern regarding what the instructor expects in Studio. @valera-rozuvan please review also.
@olmar Thanks for the review. Sorry, I wasn't clear enough. This fix actually gives the instructor what they expect (so no concern at all): 4 +- 10% will now give the following range of correct answers: [3.6, 4.4].
👍 |
@auraz will finish code review on Monday. |
@@ -29,23 +29,28 @@ def compare_with_tolerance(complex1, complex2, tolerance=default_tolerance, rela
In [212]: 1.9e24 - 1.9*10**24
Out[212]: 268435456.0
"""
if isinstance(tolerance, str): |
Do we have a non-string tolerance anywhere?
Oh, I've found one place.
In the previous code, when tolerance did not end with '%', it went through 'evaluator(dict(), dict(), tolerance)'. Now it goes through only if it is a string. Why is that?
@auraz Is there any reason to evaluate anything other than a string?
I do not know; it should be investigated.
@auraz compare_with_tolerance is used in only one file, responsetypes.py, 5 times (if we don't count tests). Its value is either float_info.epsilon, the default value (0.001%), or whatever is specified in the XML. Do you think this might need evaluation? I guess to be on the safe side, we should let it go through evaluator(dict(), dict(), tolerance) every time it's not a percentage. Thanks. Will do.
@auraz I take this back: if you pass a float through the evaluator, the following error is raised:
Traceback (most recent call last):
common/lib/capa/capa/tests/test_util.py line 44 in test_compare_with_tolerance
result = compare_with_tolerance(109.9, 100.0, 10.0, False)
common/lib/capa/capa/util.py line 37 in compare_with_tolerance
tolerance = evaluator(dict(), dict(), tolerance)
common/lib/calc/calc/calc.py line 228 in evaluator
if math_expr.strip() == "":
AttributeError: 'float' object has no attribute 'strip'
I propose we leave things as is. I tested the code manually and all tests pass. What do you think?
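The guard under discussion can be sketched as follows. parse_tolerance and its return shape are hypothetical stand-ins, with float() standing in for evaluator(dict(), dict(), tolerance); the point is only that evaluation must be restricted to strings, since a numeric tolerance like float_info.epsilon would hit the AttributeError shown in the traceback above.

```python
# Hypothetical sketch of the string-only evaluation guard; not the
# actual capa util.py code.

def parse_tolerance(tolerance):
    """Return (value, is_relative) for a tolerance that may be a number,
    a numeric string, or a percentage string like '10%'."""
    if isinstance(tolerance, str):
        if tolerance.endswith('%'):
            # Percentage: convert to a fraction, to be applied
            # relative to the instructor's answer.
            return float(tolerance[:-1]) / 100.0, True
        # Plain string expression: evaluate it (float() stands in
        # for the real evaluator here).
        return float(tolerance), False
    # Already numeric (e.g. sys.float_info.epsilon): use as-is.
    # Passing it to a string evaluator would raise AttributeError.
    return float(tolerance), False
```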
Thank you for looking into it. I agree with you.
I also suggest adding a small test specifically for the compare_with_tolerance function, because we have no tests for float tolerances.
@auraz I added unit tests for
@jmclaus a new file is the better decision.
@auraz I put the tests in a new file called
@auraz I have addressed all your comments, then. Please finish the review. Thanks.
@auraz Good to merge, then? All tests are back to green.
👍 |
…olerance Tolerance expressed in percentage now computes correctly. [BLD-522]
@auraz @valera-rozuvan It fixes the issue; I tested it with -100 +- 10%, 100 +- 10%, 0 +- 10%, 10 +- 100%, and -10 +- 100%. I guess we should add tests.