`std.fmt.parseFloat` does not parse 9007199254740993.0 correctly #11169

andrewrk · 2022-03-15T05:19:43Z

Implementation is here: https://github.com/ziglang/zig/blob/master/lib/std/fmt/parse_float.zig

--- a/lib/std/fmt/parse_float.zig
+++ b/lib/std/fmt/parse_float.zig
@@ -425,4 +425,6 @@ test "fmt.parseFloat" {
             try expect(approxEqAbs(T, try parseFloat(T, "2.71828182845904523536"), @as(T, 2.718281828459045), epsilon));
         }
     }
+
+    try expectEqual(try parseFloat(f128, "9007199254740993.0"), 9007199254740993.0);
 }

$ ./zig test ../lib/std/fmt/parse_float.zig 
Test [1/1] test "parseFloat"... expected 9.007199254740994e+15, found 9.007199254740992e+15
Test [1/1] test "parseFloat"... FAIL (TestExpectedEqual)

This is causing a behavior test failure in the self-hosted compiler, which relies on std.fmt.parseFloat:

zig/test/behavior/floatop.zig

Line 679 in 9eceba2

try expect(9007199254740992.0 + 1.0 == 9007199254740993.0);

The text was updated successfully, but these errors were encountered:

curtiswilkinson · 2022-03-15T06:50:42Z

I've done a little bit of digging:

This code was ported from https://github.com/grzegorz-kraszewski/stringtofloat

When I run that in C on our example float string I'm getting the 4 at the end that we're seeing with the zig port (vs the 2 from stage1, which I believe uses a port of musl for float parsing?).

int main() {
  printf("%fn\n", str2dbl("9007199254740993.0"));
  return 0;
};

This could potentially be what the important comment in the C reference implementation is talking about: https://github.com/grzegorz-kraszewski/stringtofloat/blob/master/str2dbl.c#L7

/* The code works in "round towards zero" mode. This is different from  */
/* GCC standard library strtod(), which uses "round half to even" rule. */
/* Therefore it cannot be used as a direct drop-in replacement, as in   */
/* some cases results will be different on the least significant bit of */
/* mantissa. Read more in the README.md file.

andrewrk · 2022-03-15T06:52:45Z

Here's the best path forward in my opinion:

ditch grzegorz-kraszewski/stringtofloat, forget about it
port musl's f128 parsing code. I already did half the work here: parse_f128.c, it just needs to be ported to zig

danielchasehooper · 2022-03-17T21:26:46Z

I'm working on this now. Unable to assign it to myself.

matu3ba · 2022-03-17T22:30:29Z

@danielchasehooper I have a wip though I should have used more vim macros for porting to get it over finish line:
https://gist.github.com/matu3ba/040ef3ad23f53213ad277999eb847586

Feel free to continue or rip out what you need.

The previous float-parsing method was lacking in a lot of areas. This commit introduces a state-of-the art implementation that is both accurate and fast to std. Code is derived from working repo https://github.com/tiehuis/zig-parsefloat. This includes more test-cases and performance numbers that are present in this commit. * Accuracy The primary testing regime has been using test-data found at https://github.com/tiehuis/parse-number-fxx-test-data. This is a fork of upstream with support for f128 test-cases added. This data has been verified against other independent implementations and represents accurate round-to-even IEEE-754 floating point semantics. * Performance Compared to the existing parseFloat implementation there is ~5-10x performance improvement using the above corpus. (f128 parsing excluded in below measurements). ** Old $ time ./test_all_fxx_data 3520298/5296694 succeeded (1776396 fail) ________________________________________________________ Executed in 28.68 secs fish external usr time 28.48 secs 0.00 micros 28.48 secs sys time 0.08 secs 694.00 micros 0.08 secs ** This Implementation $ time ./test_all_fxx_data 5296693/5296694 succeeded (1 fail) ________________________________________________________ Executed in 4.54 secs fish external usr time 4.37 secs 515.00 micros 4.37 secs sys time 0.10 secs 171.00 micros 0.10 secs Further performance numbers can be seen using the https://github.com/tiehuis/simple_fastfloat_benchmark/ repository, which compares against some other well-known string-to-float conversion functions. A breakdown can be found here: https://github.com/tiehuis/zig-parsefloat/blob/0d9f020f1a37ca88bf889703b397c1c41779f090/PERFORMANCE.md#commit-b15406a0d2e18b50a4b62fceb5a6a3bb60ca5706 In summary, we are within 20% of the C++ reference implementation and have about ~600-700MB/s throughput on a Intel I5-6500 3.5Ghz. * F128 Support Finally, f128 is now completely supported with full accuracy. This does use a slower path which is possible to improve in future. * Behavioural Changes There are a few behavioural changes to note. - `parseHexFloat` is now redundant and these are now supported directly in `parseFloat`. - We implement round-to-even in all parsing routines. This is as specified by IEEE-754. Previous code used different rounding mechanisms (standard was round-to-zero, hex-parsing looked to use round-up) so there may be subtle differences. Closes ziglang#2207. Fixes ziglang#11169.

andrewrk added bug Observed behavior contradicts documented or intended behavior contributor friendly This issue is limited in scope and/or knowledge of Zig internals. standard library This issue involves writing Zig code for the standard library. labels Mar 15, 2022

andrewrk added this to the 0.10.0 milestone Mar 15, 2022

andrewrk added the frontend Tokenization, parsing, AstGen, Sema, and Liveness. label Mar 15, 2022

andrewrk mentioned this issue Mar 15, 2022

self-hosted compiler: ship it! #89

Closed

49 tasks

danielchasehooper mentioned this issue Mar 21, 2022

Add support for f128 values in std.fmt.parseFloat. #11254

Closed

tiehuis mentioned this issue May 2, 2022

add new float-parser based on eisel-lemire algorithm #11566

Merged

andrewrk closed this as completed in #11566 May 3, 2022

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

`std.fmt.parseFloat` does not parse 9007199254740993.0 correctly #11169

`std.fmt.parseFloat` does not parse 9007199254740993.0 correctly #11169

andrewrk commented Mar 15, 2022

curtiswilkinson commented Mar 15, 2022

andrewrk commented Mar 15, 2022

danielchasehooper commented Mar 17, 2022

matu3ba commented Mar 17, 2022 •

edited

Loading

std.fmt.parseFloat does not parse 9007199254740993.0 correctly #11169

std.fmt.parseFloat does not parse 9007199254740993.0 correctly #11169

Comments

andrewrk commented Mar 15, 2022

curtiswilkinson commented Mar 15, 2022

andrewrk commented Mar 15, 2022

danielchasehooper commented Mar 17, 2022

matu3ba commented Mar 17, 2022 • edited Loading

`std.fmt.parseFloat` does not parse 9007199254740993.0 correctly #11169

`std.fmt.parseFloat` does not parse 9007199254740993.0 correctly #11169

matu3ba commented Mar 17, 2022 •

edited

Loading