
Tolerance expressed in percentage now computes correctly. [BLD-522] #3489

Merged
merged 1 commit into master from jmclaus/bugfix_numerical_response_tolerance on May 21, 2014

Conversation

@jmclaus commented Apr 28, 2014

@auraz @valera-rozuvan This fixes the issue; I tested it with -100 +- 10%; 100 +- 10%; 0 +- 10%; 10 +- 100%; -10 +- 100%. I guess we should add tests.

@singingwolfboy (Contributor) commented

Adding tests is always good!

@jmclaus (Author) commented Apr 29, 2014

@auraz @valera-rozuvan @singingwolfboy

This is an example of a test that did not pick up an error in the source code. Here are the details:

The erroneous code in src was (where complex1 is the student result and complex2 the instructor result):

tolerance = tolerance * max(abs(complex1), abs(complex2))

In the original tests, the answer was 4.0 and the tolerance 10%. The range of correct answers is then [3.6, 4.4].

But 4.44 would be graded as correct, because

tolerance = 0.1 * max(4, 4.44) = 0.444, so the erroneous range of correct answers became [3.556, 4.444].

The tests would only check 4.5, not a value closer to the bounds of correctness. I modified this, but the choice of 4.4000001 as an incorrect value is arbitrary. Should something even tighter be put in place? If so, what do you think it should be?
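
For illustration only, here is a small Python sketch of the arithmetic described above (not the actual capa code; the function names are made up):

# Hypothetical sketch of the bug described above, not the actual capa code.
# With the erroneous formula, the tolerance band widens as the student's answer grows.

def erroneous_range(student, instructor, tolerance_percent):
    tolerance = tolerance_percent / 100.0 * max(abs(student), abs(instructor))
    return (instructor - tolerance, instructor + tolerance)

def fixed_range(student, instructor, tolerance_percent):
    tolerance = tolerance_percent / 100.0 * abs(instructor)
    return (instructor - tolerance, instructor + tolerance)

print(erroneous_range(4.44, 4.0, 10))  # approx. (3.556, 4.444): 4.44 is wrongly accepted
print(fixed_range(4.44, 4.0, 10))      # (3.6, 4.4): 4.44 is correctly rejected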

@auraz (Contributor) commented Apr 29, 2014

Also, I suggest renaming complex1 and complex2 to student_complex and teacher_complex, or something like that.

@auraz (Contributor) commented Apr 29, 2014

Regarding tighter tests: I can't think of anything that needs to be tested more tightly right now.

@auraz auraz closed this Apr 29, 2014
@auraz auraz reopened this Apr 29, 2014
# Mixed negative/positive range
problem = self.build_problem(answer=0, tolerance="10%")
correct_responses = ["0", "0.1", "-0.1", "0.10", "-0.10"]
incorrect_responses = ["", "-0.1000001", "0.1000001", "0"]
@auraz (Contributor) commented on the diff:

Why is zero in the incorrect responses?

@jmclaus (Author) replied:

@auraz No reason, removed.

@auraz (Contributor) commented Apr 29, 2014

@jmclaus For some reason the Jenkins tests are not running for this PR.

@jmclaus (Author) commented Apr 30, 2014

@auraz Also, I suggest to rename complex1 and complex2 to student_complex and teacher_complex or like that. --> Renamed complex1 and complex2 to student_complex and instructor_complex.

@jmclaus (Author) commented Apr 30, 2014

@auraz Regarding tighter tests: I can't imagine what needs to be tested tighter right now. --> OK, I'll leave the new tests as is.

@jmclaus (Author) commented Apr 30, 2014

@auraz Please also fix https://github.com/edx/edx-platform/blob/master/common/lib/capa/capa/util.py#L33, as it is a copy-paste of https://github.com/edx/edx-platform/blob/master/common/lib/capa/capa/util.py#L36

Fixed.

@jmclaus (Author) commented Apr 30, 2014

@auraz It seems like a lot of current PRs do not have Jenkins tests running.

@auraz (Contributor) commented Apr 30, 2014

@jmclaus do you know how to run a manual build?

@jmclaus (Author) commented May 6, 2014

@olmar (since @alex is out for the week), @valera-rozuvan @polesye The tests now all pass locally; please review.

The initial fix I did solved BLD-522 but was breaking other things. compare_with_tolerance is used throughout capa/capa/responsetypes.py with a default_tolerance set to '0.001%' and expects it to be relative [tolerance = tolerance * max(abs(student_complex), abs(instructor_complex))]. But when an instructor uses, for example, 4 +- 10% in Studio, they expect it to be absolute [tolerance = tolerance * abs(instructor_complex)].

See http://randomascii.wordpress.com/2012/02/25/comparing-floating-point-numbers-2012-edition/
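
For readers following along, here is a minimal sketch of the behaviour described above, assuming a relative/absolute switch along these lines. It is not the actual util.py code; the parameter names, the default values, and the float() fallback for non-percentage strings are stand-ins:

def compare_with_tolerance(student_complex, instructor_complex,
                           tolerance='0.001%', relative_tolerance=False):
    # Sketch only: a 'N%' tolerance is a percentage; relative_tolerance
    # decides whether it scales with both answers (relative) or only with
    # the instructor's answer (absolute), as discussed in this thread.
    if isinstance(tolerance, str):
        if tolerance.endswith('%'):
            factor = float(tolerance[:-1]) * 0.01
            if relative_tolerance:
                tolerance = factor * max(abs(student_complex), abs(instructor_complex))
            else:
                # Studio's "4 +- 10%" case: a band of width 0.4 around 4, i.e. [3.6, 4.4]
                tolerance = factor * abs(instructor_complex)
        else:
            # The real code sends non-percentage strings through the calc
            # evaluator; plain float() stands in for that here.
            tolerance = float(tolerance)
    return abs(student_complex - instructor_complex) <= tolerance

print(compare_with_tolerance(4.3, 4.0, '10%'))     # True: inside [3.6, 4.4]
print(compare_with_tolerance(4.5, 4.0, '10%'))     # False: outside the band
print(compare_with_tolerance(109.9, 100.0, 10.0))  # True: plain float tolerance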

@alex commented May 6, 2014

I am not the @alex you are looking for :-)

@jmclaus (Author) commented May 6, 2014

@alex Indeed, sorry! It was @auraz.

@olmar (Contributor) commented May 8, 2014

@jmclaus I reviewed and tested your PR manually as well. It is OK for the issue described in the ticket.

@olmar (Contributor) commented May 8, 2014

But I am not sure about your concern regarding what the instructor expects in Studio. @valera-rozuvan please review as well.

@jmclaus (Author) commented May 8, 2014

@olmar Thanks for the review. Sorry, I wasn't clear enough. This fix actually gives the instructor what they expect (so no concern at all): 4 +- 10% will now give the following range of correct answers: [3.6, 4.4].

@olmar (Contributor) commented May 8, 2014

👍

@polesye (Contributor) commented May 9, 2014

@auraz will finish code review on Monday.

@@ -29,23 +29,28 @@ def compare_with_tolerance(complex1, complex2, tolerance=default_tolerance, rela
In [212]: 1.9e24 - 1.9*10**24
Out[212]: 268435456.0
"""
if isinstance(tolerance, str):
@auraz (Contributor) commented on the diff:

Do we have a non-string tolerance anywhere?

@auraz (Contributor):

Oh, I've found one place.

@auraz (Contributor):

In the previous code, when the tolerance did not end in '%', it went through evaluator(dict(), dict(), tolerance). Now it only goes through the evaluator if it is a string. Why is that?

@jmclaus (Author) replied:

@auraz Is there any reason to evaluate anything other than a string?

@auraz (Contributor):

I do not know; it should be investigated.

@jmclaus (Author) replied:

@auraz compare_with_tolerance is used in only one file, responsetypes.py, five times (not counting tests). Its value is either float_info.epsilon, the default value (0.001%), or whatever is specified in the XML. Do you think this might need evaluation? I guess to be on the safe side, we should let it go through evaluator(dict(), dict(), tolerance) every time it's not a percentage. Thanks. Will do.

@jmclaus (Author) replied:

@auraz I take this back: if you pass a float through the evaluator, the following error is raised:

Traceback (most recent call last):
    common/lib/capa/capa/tests/test_util.py line 44 in test_compare_with_tolerance
      result = compare_with_tolerance(109.9, 100.0, 10.0, False)
    common/lib/capa/capa/util.py line 37 in compare_with_tolerance
      tolerance = evaluator(dict(), dict(), tolerance)
    common/lib/calc/calc/calc.py line 228 in evaluator
      if math_expr.strip() == "":
   AttributeError: 'float' object has no attribute 'strip'

I propose we leave things as is. I tested the code manually and tests all pass. What do you think?

@auraz (Contributor):

Thank you for looking into it. I agree with you.

@auraz (Contributor) commented May 13, 2014

I suggest also adding a small test specifically for the compare_with_tolerance function, because we have no tests for float tolerances.

@jmclaus (Author) commented May 14, 2014

@auraz I added unit tests for compare_with_tolerance to test_response_types.py. Or should we create a new file called test_util.py and put these in there?

@auraz (Contributor) commented May 14, 2014

@jmclaus a new file is the better decision.

@jmclaus (Author) commented May 14, 2014

@auraz I put the tests in a new file called test_util.py. They run and pass locally.
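
As a rough idea of what such tests might look like (a hypothetical sketch: the tests actually added in this PR may differ in names and coverage, and the capa.util import path and the meaning of the fourth argument are assumptions based on the traceback quoted earlier in this thread):

import unittest

from capa.util import compare_with_tolerance


class CompareWithToleranceTest(unittest.TestCase):
    # Hypothetical sketch; the real tests in common/lib/capa/capa/tests/test_util.py may differ.

    def test_percentage_tolerance(self):
        # 4 +- 10% should accept values inside [3.6, 4.4] and reject values outside it.
        self.assertTrue(compare_with_tolerance(4.3, 4.0, '10%'))
        self.assertFalse(compare_with_tolerance(4.5, 4.0, '10%'))

    def test_float_tolerance(self):
        # Plain float tolerance, mirroring the call shown in the traceback above.
        self.assertTrue(compare_with_tolerance(109.9, 100.0, 10.0, False))
        self.assertFalse(compare_with_tolerance(111.0, 100.0, 10.0, False))


if __name__ == '__main__':
    unittest.main()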

@jmclaus (Author) commented May 15, 2014

@auraz I have addressed all your comments. Please finish the review. Thanks.

@jmclaus (Author) commented May 19, 2014

@auraz Good to merge then? Tests all back to green.

@auraz (Contributor) commented May 21, 2014

👍

jmclaus pushed a commit that referenced this pull request May 21, 2014

Tolerance expressed in percentage now computes correctly. [BLD-522]
@jmclaus jmclaus merged commit cc7987e into master May 21, 2014
@jmclaus jmclaus deleted the jmclaus/bugfix_numerical_response_tolerance branch May 21, 2014 15:47