
0.0000000000000000000000000000 is parsed completely differently than 1.0000000000000000000000000000 for no good reason #8896

Open
williamstein opened this issue May 5, 2010 · 35 comments

Comments

@williamstein
Contributor

In Sage, 0.0 and 0.00000000000000000000000000000000000000 denote exactly the same thing (a zero with the same precision). However, 1.0 and 1.00000000000000000000000000000000000000 are different:

sage: 0.0
0.000000000000000
sage: 0.00000000000000000000000000000000000000
0.000000000000000
sage: parent(0.00000000000000000000000000000000000000)
Real Field with 53 bits of precision
sage: 1.00000000000000000000000000000000000000
1.0000000000000000000000000000000000000
sage: 1.0
1.00000000000000
sage: parent(1.00000000000000000000000000000000000000)
Real Field with 130 bits of precision
sage: parent(1.0)
Real Field with 53 bits of precision
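A hedged aside on where the 130 comes from: the long literal has 39 significant digits, and converting decimal digits to bits at log2(10) ≈ 3.32 bits per digit (the usual conversion; assumed here rather than checked against Sage's source) gives:

sage: ceil(39*log(10,2))  # ceil(129.55...)
130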

Some people consider the above inconsistency a bug.
See the discussion below and at sage-devel.

Apply attachment: 8896-rebased.patch

CC: @jasongrout

Component: basic arithmetic

Keywords: sd32

Author: Robert Bradshaw

Reviewer: Mariah Lenox

Issue created by migration from https://trac.sagemath.org/ticket/8896

@robertwb
Contributor

robertwb commented May 5, 2010

comment:1

This is because 0.00000000000000000000000000000000005 is not considered to have "high precision." We should special-case 0.
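A minimal sketch of the rule being proposed (illustrative only; sigfigs is a made-up helper, not the patch's actual code): count significant digits from the first nonzero digit onward, and special-case an all-zero mantissa so that every zero counts.

def sigfigs(literal):
    mantissa = literal.lower().split('e')[0]         # drop any exponent part
    digits = mantissa.lstrip('+-').replace('.', '')  # keep only the digits
    stripped = digits.lstrip('0')
    if stripped:
        return len(stripped)  # ordinary case: count from the first nonzero digit
    return len(digits)        # special case 0.000...0: every zero counts

sigfigs('1.0')                                       # 2
sigfigs('0.00000000000000000000000000000000000000')  # 39 -> high precision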

@nexttime
Mannequin

nexttime mannequin commented May 6, 2010

comment:2

Btw, Sage does not distinguish +0.0 and -0.0 (and 0.0).

Nice how trac interprets long decimal fractions of floats. :D

@robertwb
Contributor

Attachment: 8896-highprec-zero.patch.gz

@kcrisman
Member

comment:5

Looks okay in general, but the last block

else: 
        # Must be 0.00000000000000...0 
        sigfigs = len(mantissa) 

needs to be indented, n'est-ce pas?

In the tests, there should be a brief note that the first two literals get the default minimum precision (which is what is being tested), while the new tests check that we actually get high precision.

@nexttime
Mannequin

nexttime mannequin commented May 27, 2010

comment:6

Replying to @kcrisman:

Looks okay in general, but the last block

else: 
        # Must be 0.00000000000000...0 
        sigfigs = len(mantissa) 

needs to be indented, n'est-ce pas?

No, the else belongs to the for statement.
But currently, leading zeros contribute to the precision, too: :)

sage: RealNumber(0.000000000000000000).prec()
67
sage: RealNumber(00.000000000000000000).prec()
70
sage: RealNumber(.000000000000000000).prec()
64

I'm not sure if this is intentional; it's at least uncommon.
(And a leading + also increases the precision.)

It looks as if a decimal point is counted as a significant digit.

And, sorry, the code is extremely ugly (not due to the patch) and inefficient.
Examples and parameter description can be improved as well.

I'm not quite sure what kind of strings actually get in (i.e., where, or whether, illegal syntax is caught; it seems everything is left to RealNumber._set() if not already handled by the parser). This should perhaps be documented, too.

@nexttime
Mannequin

nexttime mannequin commented May 27, 2010

comment:7

Also, is it intentional that the exponent is ignored when computing the required precision?
E.g., 1.0e-1000000000 as well as 1.0e-10000000000 evaluate to zero, because there are only 53 bits of precision.

@nexttime
Mannequin

nexttime mannequin commented May 27, 2010

comment:8

sage.rings.complex_number.create_ComplexNumber() also needs work.

(It doesn't "strip" trailing/fractional zeros, but also doesn't treat a given exponent correctly. Bases (other than 10) aren't supported; perhaps it should just call create_RealNumber() or use the same code to compute the necessary precision once this operates as desired.)

@robertwb
Contributor

apply on top of previous

@robertwb
Contributor

comment:9

Attachment: 8896-part2.patch.gz

Replying to @nexttime:

No, the else belongs to the for statement.

Exactly.

But currently, leading zeros contribute to the precision, too: :)

sage: RealNumber(0.000000000000000000).prec()
67
sage: RealNumber(00.000000000000000000).prec()
70
sage: RealNumber(.000000000000000000).prec()
64

I'm not sure if this is intentional; it's at least uncommon.

That's the point of this ticket. For 0, all zeros are leading.

(And a leading + also increases the precision.)

It looks as if a decimal point is counted as a significant digit.

Good catch, fixed. That required some adjustment to keep the input/str in sync.

And, sorry, the code is extremely ugly (not due to the patch) and inefficient.
Examples and parameter description can be improved as well.

I'm not quite sure what kind of strings actually get in (i.e., where, or whether, illegal syntax is caught; it seems everything is left to RealNumber._set() if not already handled by the parser). This should perhaps be documented, too.

Updated the docstring a bit. This is mostly for use by the preparser, though of course it gets used directly as well.

@robertwb
Contributor

comment:10

Replying to @nexttime:

Also, is it intentional that the exponent is ignored when computing the required precision?
E.g., 1.0e-1000000000 as well as 1.0e-10000000000 evaluate to zero, because there are only 53 bits of precision.

Yes, of course. The size of the exponent is completely orthogonal to the number of significant figures. I made create_ComplexNumber defer to create_RealNumber as well.
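A hedged sketch of what "defer" could look like (illustrative only; this is not the actual patch code, and create_complex is a made-up name):

# Derive one precision from both literals with the real-number logic,
# then build both parts in the same field.
def create_complex(re_str, im_str):
    prec = max(RealNumber(re_str).prec(), RealNumber(im_str).prec())
    R = RealField(prec)
    return ComplexField(prec)(R(re_str), R(im_str))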

@kcrisman
Member

comment:11

No, the else belongs to the for statement.

Exactly.

Here is the reference for this - I hadn't seen that before. Thanks for the additional docstring.
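For anyone else who hadn't seen it: Python's else on a for loop runs only when the loop finishes without hitting break, which is how the snippet above detects an all-zero mantissa (a standalone illustration, not the patch itself):

for digit in "000000":
    if digit != '0':
        break
else:
    # only reached if no nonzero digit was found, i.e. the number is 0
    print("all zeros -- the high-precision-zero case")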

@nexttime
Mannequin

nexttime mannequin commented May 27, 2010

comment:12

Just took a first glance:

  • create_ComplexNumber() now looks nice
  • in the above function, it's len<15 instead of len<=15
  • min_prec can be smaller than 53 (if specified)
  • pad should be (tested to be) non-negative
  • (min_prec perhaps too, or min_prec+pad>=0)
  • compare against min_prec+pad rather than individually for the "common" case
  • rnd description in create_RealNumber() is slightly messed up
  • IMHO leading zeros to the left of the decimal point should be stripped/ignored

More to come... ;-)

(I've only looked at the patch to the patch.)

@nexttime
Mannequin

nexttime mannequin commented May 27, 2010

comment:13

Replying to @robertwb:

Replying to @nexttime:

But currently, leading zeros contribute to the precision, too: :)

sage: RealNumber(0.000000000000000000).prec()
67
sage: RealNumber(00.000000000000000000).prec()
70
sage: RealNumber(.000000000000000000).prec()
64

I'm not sure if this is intentional; it's at least uncommon.

That's the point of this ticket. For 0, all zeros are leading.

Not really. We were (I think) only considering zeros in the fractional part.

Truncation only makes sense from the right; and you don't express precision by padding zeros to the left. Stating that the earth's diameter is about 013 million meters doesn't give more information than stating it is about 13 million meters.

Updated the docstring a bit. This is mostly for use by the preparser, though of course it gets used directly as well.

I was thinking of scientific notation syntax, too. (Also the examples do not contain that form.)

@nexttime
Mannequin

nexttime mannequin commented May 27, 2010

comment:14

Replying to @nexttime:

Not really. We were (I think) only considering zeros in the fractional part.

To be more precise, 0.0 and 00.0 denote the same "thing", i.e. have the same meaning, while 0.0 and 0.00 do not.

@jasongrout
Member

comment:15

For those interested in printing real numbers in general, I'll just plug #7682, which could use just a bit of polishing work, I believe, and makes printing real/complex numbers much more consistent and easy to adjust.

@nexttime
Mannequin

nexttime mannequin commented May 28, 2010

comment:16

Replying to @robertwb:

Replying to @nexttime:

Also, is it intentional that the exponent is ignored when computing the required precision?
E.g., 1.0e-1000000000 as well as 1.0e-10000000000 evaluate to zero, because there are only 53 bits of precision.

Yes, of course. The size of the exponent is completely orthogonal to the number of significant figures.

I don't agree either. If the "effective" exponent (the one in normalized form) exceeds the maximum, one should either increase prec accordingly or - more practically - raise an exception.

If not, this could be a pitfall. But I don't mind... ;-)

@robertwb
Contributor

comment:17

Replying to @nexttime:

Replying to @robertwb:

That's the point of this ticket. For 0, all zeros are leading.

Not really. We were (I think) only considering zeros in the fractional part.

Truncation only makes sense from the right; and you don't express precision by padding zeros to the left. Stating that the earth's diameter is about 013 million meters doesn't give more information than stating it is about 13 million meters.

Which has the same information as saying the Earth's diameter is about 0.000013 trillion meters. Whether or not a digit is significant is not a function of whether it is on the left or right of the decimal.

Would you then say that 0.00e9 and 00.0e8 and .000e10 have different precisions?

@robertwb
Contributor

comment:18

Replying to @nexttime:

Just took a first glance:

  • create_ComplexNumber() now looks nice
  • in the above function, it's len<15 instead of len<=15

Ah, oops.

  • min_prec can be smaller than 53 (if specified)

Yes, but I'm not following what you're trying to say here.

  • pad should be (tested to be) non-negative
  • (min_prec perhaps too, or min_prec+pad>=0)
  • compare against min_prec+pad rather than individually for the "common" case

I don't care if pad is negative (I'll fix the docstring). Same with min_prec; I just pass whatever I get on to the RealField constructor, which will raise a perfectly fine exception if the precision is not valid (as defined by MPFR_MIN_PREC and MPFR_MAX_PREC).

  • rnd description in create_RealNumber() is slightly messed up

Ah, I'll fix that.

E.g., 1.0e-1000000000 as well as 1.0e-10000000000 evaluate to zero, because there are only 53 bits of precision.

Yes, of course. The size of the exponent is completely orthogonal to the number of significant figures.

I don't agree either. If the "effective" exponent (the one in normalized form) exceeds the maximum, one should either increase prec accordingly or - more practically - raise an exception.

If not, this could be a pitfall. But I don't mind... ;-)

Well, I'd say the above has only two significant digits of precision, no matter what the exponent, which is all this function tries to deduce. The fact that it's subnormal is outside the scope of this ticket--if an exception should be raised, it's not here. (And in this case you can't raise the precision enough, due to memory/library limitations.)

@nexttime
Mannequin

nexttime mannequin commented May 28, 2010

comment:19

Replying to @robertwb:

Replying to @nexttime:

  • min_prec can be smaller than 53 (if specified)

Yes, but I'm not following what you're trying to say here.

  • pad should be (tested to be) non-negative
  • (min_prec perhaps too, or min_prec+pad>=0)
  • compare against min_prec+pad rather than individually for the "common" case

I don't care if pad is negative (I'll fix the docstring). Same with min_prec; I just pass whatever I get on to the RealField constructor, which will raise a perfectly fine exception if the precision is not valid (as defined by MPFR_MIN_PREC and MPFR_MAX_PREC).

If you test for min_prec+pad==53 and ..., more inputs fall into the simple common case (RR).
(Using RR, i.e. a 53-bit mantissa, if the sum is less than 53 is perhaps not desired.)

@nexttime
Mannequin

nexttime mannequin commented May 28, 2010

comment:20

Replying to @nexttime:

(Using RR, i.e. a 53-bit mantissa, if the sum is less than 53 is perhaps not desired.)

Must have been some spot on the screen that covered min_... ;-)

Of course it's pretty ok to return RR in that case, too. (If someone really wants less precision, he can use RealField(prec)("...") directly.)
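A sketch of the dispatch being discussed (names illustrative, not Sage's internals): anything whose deduced precision doesn't exceed 53 bits lands in the common case RR.

def choose_field(sigfigs, min_prec=53, pad=0):
    bits = ceil(sigfigs * log(10, 2))  # ~3.32 bits per decimal digit
    prec = max(bits, min_prec + pad)
    return RR if prec <= 53 else RealField(prec)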

@robertwb
Contributor

comment:21

Attachment: 8896-doctests.patch.gz

Doctest fixes for 4.6.1.

@robertwb
Contributor

Folded and rebased on 4.7. Apply only this patch.

@sagetrac-mariah
Mannequin

sagetrac-mariah mannequin commented Jun 6, 2011

comment:22

Attachment: 8896-rebased.patch.gz

Patch fixes problem. Ran 'make testlong'. All tests passed. Positive review!

@sagetrac-mariah
Mannequin

sagetrac-mariah mannequin commented Jun 6, 2011

Reviewer: Mariah Lenox

@sagetrac-mariah
Mannequin

sagetrac-mariah mannequin commented Jun 6, 2011

Author: Robert Bradshaw

@robertwb
Contributor

robertwb commented Jun 6, 2011

comment:23

Thanks!


@jdemeyer

jdemeyer commented Jun 8, 2011

comment:25

In my opinion, this ticket is a huge can of worms and a bad idea. I don't see any mathematically consistent reason why 0.0 should be treated differently than 0.0000000000000.

I really think the current behaviour of Sage is what makes the most sense (mathematically), so I am not in favour of this ticket. Of course, if the majority thinks this patch is a good idea then I'm all for it.

One thing about the patch that is very unclear is why zero is treated differently from other numbers. Consider:

sage: (0.0).prec()
53
sage: (0.1).prec()
53
sage: (000000000000000000000.0).prec()
77
sage: (000000000000000000000.1).prec()
53
sage: (0.000000000000000000000).prec()
77
sage: (0.000000000000000000001).prec()
53

@robertwb
Contributor

robertwb commented Jun 8, 2011

comment:26

Replying to @jdemeyer:

In my opinion, this ticket is a huge can of worms and a bad idea. I don't see any mathematically consistent reason why 0.0 should be treated differently than 0.0000000000000.

It's exactly the same reason that 1.0 is treated differently than 1.0000000000000000000000. The (value-preserving) trailing zeros indicate higher precision.

I really think the current behaviour of Sage is what makes the most sense (mathematically), so I am not in favour of this ticket. Of course, if the majority thinks this patch is a good idea then I'm all for it.

One thing about the patch that is very unclear is why zero is treated differently from other numbers.

Because otherwise there's no way to write a high-precision zero (in fact I can't think of any other interpretation of a large number of trailing zeros).

Consider:

sage: (0.0).prec()
53
sage: (0.1).prec()
53
sage: (000000000000000000000.0).prec()
77
sage: (000000000000000000000.1).prec()
53
sage: (0.000000000000000000000).prec()
77
sage: (0.000000000000000000001).prec()
53

@kcrisman
Member

kcrisman commented Jun 8, 2011

comment:27

Another argument is that the BDFL suggested this change fairly strongly when opening this ticket :)

More seriously, I do have a question here.
Robert's reasoning seems sound in the following example - we want a high-precision zero, and this is the best way to acquire it.

> sage: (0.000000000000000000000).prec()
> 77
> sage: (0.000000000000000000001).prec()
> 53

But I agree with Jeroen that the following seems problematic. Didn't you say yourself (robertwb) that leading zeros before the decimal point shouldn't make a difference? So in the case here maybe we should have the default precision...

> sage: (000000000000000000000.0).prec()
> 77
> sage: (000000000000000000000.1).prec()
> 53

Keep in mind that I have very little invested in either outcome for now, so take these only as observations.

@jdemeyer

jdemeyer commented Jun 8, 2011

comment:28

Replying to @robertwb:

Replying to @jdemeyer:

In my opinion, this ticket is a huge can of worms and a bad idea. I don't see any mathematically consistent reason why 0.0 should be treated differently than 0.0000000000000.

It's exactly the same reason that 1.0 is treated differently than 1.0000000000000000000000. The (value-preserving) trailing zeros indicate higher precision.

It is absolutely not the same, because for non-zero numbers only the digits from the first non-zero digit onward contribute to the precision. Indeed, 0.437 and 0.00000000000437 have exactly the same precision.

Because otherwise there's no way to write a high-precision zero (in fact I can't think of any other interpretation of a large number of trailing zeros).

Well, you can always do:

sage: zero = RealField(1000)(0)
sage: zero.prec()
1000

Another way to demonstrate the problem with this patch is that truncating decimal numbers (i.e. removing digits from the right) can increase the precision, which is quite illogical:

sage: (0.000000000000000000001234567890123456789).prec()
67
sage: (0.00000000000000000000123456789012345678).prec()
64
sage: (0.000000000000000000001).prec()
53
sage: (0.00000000000000000000).prec()
74
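(Relating this to the illustrative sigfigs sketch earlier in the thread, which only approximates Sage's actual counting: sigfigs('0.437') == sigfigs('0.00000000000437') == 3, while the zero special case gives sigfigs('0.00000000000000000000') == 21; that jump at zero is exactly the point of contention.)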

@robertwb
Contributor

robertwb commented Jun 8, 2011

comment:29

Replying to @jdemeyer:

Replying to @robertwb:

Replying to @jdemeyer:

In my opinion, this ticket is a huge can of worms and a bad idea. I don't see any mathematically consistent reason why 0.0 should be treated differently than 0.0000000000000.

It's exactly the same reason that 1.0 is treated differently than 1.0000000000000000000000. The (value-preserving) trailing zeros indicate higher precision.

It is absolutely not the same, because for non-zero numbers only the digits from the first non-zero digit onward contribute to the precision. Indeed, 0.437 and 0.00000000000437 have exactly the same precision.

This hinges on the ambiguity of the term "trailing zero." Sage interprets trailing zeros after the decimal point as additional precision. To me, trailing zeros are those not followed by a non-zero digit; to you, trailing zeros must also follow a non-zero digit.

Because otherwise there's no way to write a high-precision zero (in fact I can't think of any other interpretation of a large number of trailing zeros).

Well, you can always do:

sage: zero = RealField(1000)(0)
sage: zero.prec()
1000

Yeah, but that's a pain, and to me the fact that 0.00000000000000000000000000000000 is not high precision is more surprising, as it has many significant digits.

Another way to demonstrate the problem with this patch is that truncating decimal numbers (i.e. removing digits from the right) can increase the precision, which is quite illogical:

sage: (0.000000000000000000001234567890123456789).prec()
67
sage: (0.00000000000000000000123456789012345678).prec()
64
sage: (0.000000000000000000001).prec()
53
sage: (0.00000000000000000000).prec()
74

It also changes the relative value by a huge amount :)

Is it worth taking this discussion to sage-devel?

@jdemeyer

jdemeyer commented Jun 8, 2011

comment:30

Replying to @robertwb:

to me the fact that 0.00000000000000000000000000000000 is not high precision is more surprising, as it has many significant digits.

I would say that it does have a lot of digits but that none of them is a significant digit.


@jdemeyer jdemeyer removed this from the sage-4.7.1 milestone Aug 2, 2011
@williamstein
Contributor Author

Changed keywords from none to sd32
