Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

regression caused by PR #13276: echo 0.6 show 0.59999999999999998, unlike all other languages #13362

Closed
timotheecour opened this issue Feb 8, 2020 · 4 comments

Comments

@timotheecour
Copy link
Member

timotheecour commented Feb 8, 2020

  • the recently merged printing float values will have one more digit. #13276 introduced the regression /cc @krux02

  • that PR commented out some tests and changed some numbers for preexisting tests in tests/system/tostring.nim to deal with the fact that numbers now print weirdly; this will likely break other people's tests in the wild

  • pretty much no other language I know of does that, and for good reason:

python, D, C, C++, go, node js, matlab, octave, java...

yet they can handle serialization/deserialization roundtrip just fine, and aren't any less correct; printing 0.6 as 0.6 does not imply any less precision !

  • this is bad for interop with other languages; eg it can blow up size of json messages, etc; it affects AST rendering, doc generation and countless other things

Example

echo 0.6

Current Output

0.59999999999999998

Expected Output

0.6

Additional Information

that PR aimed to fix #13196 but IMO there should be a better fix that doesn't introduce this regression. EDIT: indeed, see #13364

Indeed, other languages I've tried don't have that serialization/deserialization round-trip issue;

and pretty much all languages I'm familiar with print float literals like 0.4 as simply 0.4 (using default printing command):

  • python
a=0.4
print(a) # 0.4
assert(str(a) == '0.4')
import json
x = 0.12345678901234567890123456789
json.loads(json.dumps(x)) == x
  • D
import std.stdio;
import std.json;
import std.conv;

void main(){
  auto a = 0.4;
  writeln(a);
  assert(a.to!string == "0.4");
  auto x = 0.12345678901234567890123456789;
  auto x2 = parseJSON(JSONValue(x).to!string).floating();
  assert(x2 == x);
}
  • C and C++ (I only checked printing, not json serialization/deserialization)
#include <iostream>
#include <stdio.h>
int main (int argc, char *argv[]) {
  double a = 0.4;
  std::cout << a << std::endl; // (C++, C) prints as 0.4
  printf("%g\n", a); // (C) prints as 0.4
  return 0;
}
  • go
package main
import "fmt"
func main() {
	var a = 0.4
	fmt.Println(a) # prints 0.4
}
  • matlab / octave: ditto
  • node js: ditto

consequences

  • this may break other people's tests that relied on stringifications of simple looking numbers like echo (0.4, "foo")

  • this will blow up size of json files which impacts performance etc

  import std/json
  var s = @[0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9]
  echo s
  var j = %* s
  echo j

now shows:

@[0.10000000000000001, 0.20000000000000001, 0.29999999999999999, 0.40000000000000002, 0.5, 0.59999999999999998, 0.69999999999999996, 0.80000000000000004, 0.90000000000000002]
[0.10000000000000001,0.20000000000000001,0.29999999999999999,0.40000000000000002,0.5,0.59999999999999998,0.69999999999999996,0.80000000000000004,0.90000000000000002]
  • this affects nim doc runnableExamples, eg:
  proc bar*()=
    runnableExamples:
      var x = 0.6
      doAssert x == 0.6

shows as:

image

  • this affects the AST as well, eg:
  import macros
  macro dbg(a): untyped =
    echo a.repr
    echo a.treeRepr
  dbg:
    let a = 0.6

now prints:

let a = 0.59999999999999998
StmtList
  LetSection
    IdentDefs
      Ident "a"
      Empty
      FloatLit 0.59999999999999998

proposal

@timotheecour timotheecour changed the title regression caused by recent PR #13276 (printing float): float litterals + printing broken; eg $0.6 is now 0.59999999999999998 regression caused by recent PR #13276 (printing float): echo 0.6 show 0.59999999999999998 (+ other issues) Feb 8, 2020
@timotheecour timotheecour changed the title regression caused by recent PR #13276 (printing float): echo 0.6 show 0.59999999999999998 (+ other issues) regression caused by PR #13276: echo 0.6 show 0.59999999999999998 (+ other issues), unlike all other languages Feb 8, 2020
@timotheecour timotheecour changed the title regression caused by PR #13276: echo 0.6 show 0.59999999999999998 (+ other issues), unlike all other languages regression caused by PR #13276: echo 0.6 show 0.59999999999999998, unlike all other languages Feb 8, 2020
@kaushalmodi
Copy link
Contributor

kaushalmodi commented Feb 8, 2020

The amount of research you did for this regression issue is amazing! 👏

@krux02
Copy link
Contributor

krux02 commented Feb 8, 2020

[0.10000000000000001,0.20000000000000001,0.29999999999999999,0.40000000000000002,0.5,0.59999999999999998,0.69999999999999996,0.80000000000000004,0.90000000000000002]

this will blow up size of json files

Correction, this will blow up size for json, if the value comes from a handwritten decimal number. To my experience the majority of json number data doesn't come from floating point numbers entered manually in decimal.

this may break other people's tests that relied on stringifications of simple looking numbers

True. And I am sorry for that.

Regarding the output of [0.10000000000000001,0.20000000000000001,0.29999999999999999,0.40000000000000002,0.5,0.59999999999999998,0.69999999999999996,0.80000000000000004,0.90000000000000002] instead of [0.1,0.2,0.3,0.4,0.5,0.6,0.7,0.8,0.9], I personally think this is an improvement. It is a very common mistake from beginners to think 0.1 * 10 == 1, which is not the case. When echo 0.1 does print 0.10000000000000001 instead of 0.1 it might be clear that 0.1 * 10 isn't 1. In other words I think printing 0.1 as 0.1 precisely is a lie from the language, because the internal value isn't 0.1 precisely.

regarding printf("%g\n", a), this one doesn't survive the roundtrip.

#include <math.h>
#include <stdio.h>

int main() {
  double f = 0.1;
  for(int i = 0; i < 100; ++i) {
    printf("%g\n", f);
    f = nextafter(f, 100);
  }
}

output: only 0.1, nothing else.
same with C++

#include <cmath>
#include <iostream>

using namespace std;

int main() {
  double f = 0.1;
  for(int i = 0; i < 100; ++i) {
    cout << f << endl;
    f = nextafter(f, 100);
  }
}

@timotheecour
Copy link
Member Author

timotheecour commented Feb 8, 2020

I personally think this is an improvement. It is a very common mistake from beginners to think 0.1 * 10 == 1, which is not the case

beginners need to learn about FP semantics. Just like in any language.

regarding printf("%g\n", a), this one doesn't survive the roundtrip.

there's no roundtrip here. A language's default echo command shouldn't be used for serialization; the roundtrip is for serialization (such as json/protobuf etc).
In any case #13365 (ryu) is the way forward, and would give you the "best of both worlds":

  • short representation
  • rountrip safe
  • performant
  • and would even allow echo itself to be rountrip-safe (like print in python), although this is a separate issue from doing it for things explicitly meant for serialization

since #13276 has been reverted, I'm gonna close this issue; followup work is to implement #13365

@krux02
Copy link
Contributor

krux02 commented Feb 9, 2020

beginners need to learn about FP semantics. Just like in any language.

That works best if the language is honest about FP

there's no roundtrip here.

Whatever...

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants