Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add new float-parser based on eisel-lemire algorithm #11566

Merged
merged 5 commits into from
May 3, 2022

Conversation

tiehuis
Copy link
Member

@tiehuis tiehuis commented May 2, 2022

Pushed this a bit earlier than I intended since progress in other areas was blocked by #11169. I have run a few local tests and modified the one use of parseHexFloat in stage2 but haven't run all tests yet. Further I would be interested in differences between stage1 and stage 2 behavior in regards to rounding. May not impact any tests.

Note that there is one failing test-case for f16 that I haven't fixed. Specifically it is this entry | f16: 0000 33000000 3E60000000000000 2.98023223876953125E-8: found 0x1. I don't consider this a blocker.

For more information, see this repo: https://github.com/tiehuis/zig-parsefloat

commit message

The previous float-parsing method was lacking in a lot of areas. This
commit introduces a state-of-the art implementation that is both
accurate and fast to std.

Code is derived from working repo https://github.com/tiehuis/zig-parsefloat.
This includes more test-cases and performance numbers that are present
in this commit.

  • Accuracy

The primary testing regime has been using test-data found at
https://github.com/tiehuis/parse-number-fxx-test-data. This is a fork of
upstream with support for f128 test-cases added. This data has been
verified against other independent implementations and represents
accurate round-to-even IEEE-754 floating point semantics.

  • Performance

Compared to the existing parseFloat implementation there is ~5-10x
performance improvement using the above corpus. (f128 parsing excluded
in below measurements).

** Old

$ time ./test_all_fxx_data
3520298/5296694 succeeded (1776396 fail)

________________________________________________________
Executed in   28.68 secs    fish           external
   usr time   28.48 secs    0.00 micros   28.48 secs
   sys time    0.08 secs  694.00 micros    0.08 secs

** This Implementation

$ time ./test_all_fxx_data
5296693/5296694 succeeded (1 fail)

________________________________________________________
Executed in    4.54 secs    fish           external
   usr time    4.37 secs  515.00 micros    4.37 secs
   sys time    0.10 secs  171.00 micros    0.10 secs

Further performance numbers can be seen using the
https://github.com/tiehuis/simple_fastfloat_benchmark/ repository, which
compares against some other well-known string-to-float conversion
functions. A breakdown can be found here:

https://github.com/tiehuis/zig-parsefloat/blob/0d9f020f1a37ca88bf889703b397c1c41779f090/PERFORMANCE.md#commit-b15406a0d2e18b50a4b62fceb5a6a3bb60ca5706

In summary, we are within 20% of the C++ reference implementation and
have about ~600-700MB/s throughput on a Intel I5-6500 3.5Ghz.

  • F128 Support

Finally, f128 is now completely supported with full accuracy. This does
use a slower path which is possible to improve in future.

  • Behavioural Changes

There are a few behavioural changes to note.

  • parseHexFloat is now redundant and these are now supported directly
    in parseFloat.
  • We implement round-to-even in all parsing routines. This is as
    specified by IEEE-754. Previous code used different rounding
    mechanisms (standard was round-to-zero, hex-parsing looked to use
    round-up) so there may be subtle differences.

Closes #2207.
Fixes #11169.

@@ -405,8 +37,8 @@ test "fmt.parseFloat" {
try expect(approxEqAbs(T, try parseFloat(T, "3.141"), 3.141, epsilon));
try expect(approxEqAbs(T, try parseFloat(T, "-3.141"), -3.141, epsilon));

try expectEqual(try parseFloat(T, "1e-700"), 0);
try expectEqual(try parseFloat(T, "1e+700"), std.math.inf(T));
try expectEqual(try parseFloat(T, "1e-5000"), 0);
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Modified since we now support f128 with correct bounds for inf so the previous value is now in range and doesn't overflow.

try testing.expectEqual(try parseFloat(f128, "-0x1p-16494"), -math.floatTrueMin(f128));

// NOTE: We are performing round-to-even. Previous behavior was round-up.
// try testing.expectEqual(try parseFloat(f128, "0x1.edcb34a235253948765432134674fp-1"), 0x1.edcb34a235253948765432134674fp-1);
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a specific example of where I presume the rounding difference is being a factor since we are off by one here. Haven't done a close verification but once stage2 is the base compiler and we have consistent rounding across all implementations these errors should cease.

.mantissa_explicit_bits = std.math.floatMantissaBits(T),
.infinite_power = 0x1f,
// Eisel-Lemire
.smallest_power_of_ten = -26, // TODO: refine, fails one test
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This note refers to one failing test case for f16: | f16: 0000 33000000 3E60000000000000 2.98023223876953125E-8: found 0x1

return ((q *% (152170 + 65536)) >> 16) + 63;
}

const U128 = struct {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't need this type and should use u128 directly. I will likely follow up with this change after measuring performance.

return null;
}

// TODO: x86 (no SSE/SSE2) requires x87 FPU to be setup correctly with fldcw
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is noted in the algorithm's paper. Can see an example implementation in the rust tree: https://github.com/rust-lang/rust/blob/6b6c1ffacc5df738b3560369746d87499adbeae1/library/core/src/num/dec2flt/fpu.rs

Skipped implementation for the moment since I do not have a test setup with this, but should be easy to implement.

@tiehuis
Copy link
Member Author

tiehuis commented May 2, 2022

I expect adding support for f80 in future should be fairly straight-forward (slow-method) since there should be no major internal changes now required except for handling the explicit bit.

@tiehuis
Copy link
Member Author

tiehuis commented May 2, 2022

Will review test failures tomorrow. I suspect rounding differences but TBD.

@andrewrk
Copy link
Member

andrewrk commented May 2, 2022

I had a look at the failures in your branch - most of the problems are solved by 9af4cad which I just pushed to master. There was one remaining problem which is test "quad hex float literal parsing accurate" from test/behavior/math.zig. This one could probably use your expertise.

@tiehuis
Copy link
Member Author

tiehuis commented May 2, 2022

So this specific test-case looks incorrect. 0x1.edcb34a235253948765432134674fp-1 should be represented as 0x1.edcb34a235253948765432134675p-1, rounding up and not 0x1.edcb34a235253948765432134674p-1 which is what is incorrectly parsed by the current hex float parser. This is obvious given the last truncated digit that is used in the rounding process (f).

I've included a small script as well to see the difference between the f128 values which bound the true value, in decimal form:

from decimal import *
import math

getcontext().prec = 64

af = "0x1.edcb34a235253948765432134674fp-1"
accurate = Decimal(0.1) * (Decimal(1) +
    Decimal(0xedcb34a235253948765432134674f) /
    Decimal(0x100000000000000000000000000000))


lf = "0x1.edcb34a235253948765432134674p-1"
lo = Decimal(0.1) * (Decimal(1) +
    Decimal(0xedcb34a235253948765432134674) /
    Decimal(0x10000000000000000000000000000))

hf = "0x1.edcb34a235253948765432134675p-1"
hi = Decimal(0.1) * (Decimal(1) +
    Decimal(0xedcb34a235253948765432134675) /
    Decimal(0x10000000000000000000000000000))

print(hf, abs(accurate-hi))
print(lf, abs(accurate-lo))
('0x1.edcb34a235253948765432134675p-1', Decimal('1.2037062152420224749791039664E-36'))
('0x1.edcb34a235253948765432134674p-1', Decimal('1.80555932286303371246865594964E-35'))

The issue looks to have been introduced here: 7f024d6

Specifically the check if (guardBit and (exactly_half or more_than_half)) {. It looks like this should have been if ((guardBit and exactly_half) or more_than_half) { based on the comments about how the rounding process was meant to function. Modifying the existing parser with these changes makes them agree. The current rounding behavior appears to be round-up, only for odd values.

The "fix" for this PR (that I will test later) should be modifying the test-case to use the correctly rounded bit representation:

try expect(@bitCast(u128, f) == 0x3ffeedcb34a235253948765432134675);

tiehuis added 3 commits May 3, 2022 16:46
The previous float-parsing method was lacking in a lot of areas. This
commit introduces a state-of-the art implementation that is both
accurate and fast to std.

Code is derived from working repo https://github.com/tiehuis/zig-parsefloat.
This includes more test-cases and performance numbers that are present
in this commit.

* Accuracy

The primary testing regime has been using test-data found at
https://github.com/tiehuis/parse-number-fxx-test-data. This is a fork of
upstream with support for f128 test-cases added. This data has been
verified against other independent implementations and represents
accurate round-to-even IEEE-754 floating point semantics.

* Performance

Compared to the existing parseFloat implementation there is ~5-10x
performance improvement using the above corpus. (f128 parsing excluded
in below measurements).

** Old

    $ time ./test_all_fxx_data
    3520298/5296694 succeeded (1776396 fail)

    ________________________________________________________
    Executed in   28.68 secs    fish           external
       usr time   28.48 secs    0.00 micros   28.48 secs
       sys time    0.08 secs  694.00 micros    0.08 secs

** This Implementation

    $ time ./test_all_fxx_data
    5296693/5296694 succeeded (1 fail)

    ________________________________________________________
    Executed in    4.54 secs    fish           external
       usr time    4.37 secs  515.00 micros    4.37 secs
       sys time    0.10 secs  171.00 micros    0.10 secs

Further performance numbers can be seen using the
https://github.com/tiehuis/simple_fastfloat_benchmark/ repository, which
compares against some other well-known string-to-float conversion
functions. A breakdown can be found here:

https://github.com/tiehuis/zig-parsefloat/blob/0d9f020f1a37ca88bf889703b397c1c41779f090/PERFORMANCE.md#commit-b15406a0d2e18b50a4b62fceb5a6a3bb60ca5706

In summary, we are within 20% of the C++ reference implementation and
have about ~600-700MB/s throughput on a Intel I5-6500 3.5Ghz.

* F128 Support

Finally, f128 is now completely supported with full accuracy. This does
use a slower path which is possible to improve in future.

* Behavioural Changes

There are a few behavioural changes to note.

 - `parseHexFloat` is now redundant and these are now supported directly
   in `parseFloat`.
 - We implement round-to-even in all parsing routines. This is as
   specified by IEEE-754. Previous code used different rounding
   mechanisms (standard was round-to-zero, hex-parsing looked to use
   round-up) so there may be subtle differences.

Closes ziglang#2207.
Fixes ziglang#11169.
This is only to get tests running again. The root issue should be fixed
in stage1 so rounding is consistent between stages.
@andrewrk
Copy link
Member

andrewrk commented May 3, 2022

Looks like the failure is this one:

/workspace/lib/std/fmt/parse_float.zig:43:9: 0x5887a7 in fmt.parse_float.test "std-aarch64-linux-none-Debug-bare-multi fmt.parseFloat" (test)
        try expectEqual(@bitCast(Z, try parseFloat(T, "nAn")), @bitCast(Z, std.math.nan(T)));
        ^

You should be able to reproduce it like this:

zig test lib/std/std.zig -target aarch64-linux -femit-bin=test
qemu-aarch64 ./test

@tiehuis
Copy link
Member Author

tiehuis commented May 3, 2022

Current CI failure. Missing a few new compiler_rt functions for f128 to u128 conversion.

ld.lld: error: undefined symbol: __floatuntikf
>>> referenced by convert_fast.zig:111 (/workspace/lib/std/fmt/parse_float/convert_fast.zig:111)
>>>               /workspace/zig-cache/o/914dcaca478e0d36f82aff97c6159633/test.o:(fmt.parse_float.convert_fast.convertFast.11039)
>>> referenced by convert_fast.zig:123 (/workspace/lib/std/fmt/parse_float/convert_fast.zig:123)
>>>               /workspace/zig-cache/o/914dcaca478e0d36f82aff97c6159633/test.o:(fmt.parse_float.convert_fast.convertFast.11039)
>>> did you mean: __floatundikf
>>> defined in: /root/.cache/zig/o/c421352bc02d4a423a559e888dbda3f7/libcompiler_rt.a(/root/.cache/zig/o/c421352bc02d4a423a559e888dbda3f7/compiler_rt.o)
thread 61118 panic: attempt to unwrap error: LLDReportedFailure
/workspace/lib/std/fs.zig:1285:9: 0x326f366 in std.fs.Dir.makeDir (zig1)
        try os.mkdirat(self.fd, sub_path, default_new_dir_mode);
        ^
/workspace/lib/std/os.zig:2708:19: 0x326f59d in std.os.mkdiratZ (zig1)
        .EXIST => return error.PathAlreadyExists,
                  ^
/workspace/lib/std/os.zig:2668:9: 0x326f45c in std.os.mkdirat (zig1)
        return mkdiratZ(dir_fd, &sub_dir_path_c, mode);
        ^
/workspace/lib/std/fs.zig:1285:9: 0x326f366 in std.fs.Dir.makeDir (zig1)
        try os.mkdirat(self.fd, sub_path, default_new_dir_mode);
        ^
/workspace/lib/std/os.zig:2708:19: 0x326f59d in std.os.mkdiratZ (zig1)
        .EXIST => return error.PathAlreadyExists,
                  ^
/workspace/lib/std/os.zig:2668:9: 0x326f45c in std.os.mkdirat (zig1)
        return mkdiratZ(dir_fd, &sub_dir_path_c, mode);
        ^
/workspace/lib/std/fs.zig:1285:9: 0x326f366 in std.fs.Dir.makeDir (zig1)
        try os.mkdirat(self.fd, sub_path, default_new_dir_mode);
        ^
/workspace/lib/std/os.zig:2708:19: 0x326f59d in std.os.mkdiratZ (zig1)
        .EXIST => return error.PathAlreadyExists,
                  ^
/workspace/lib/std/os.zig:2668:9: 0x326f45c in std.os.mkdirat (zig1)
        return mkdiratZ(dir_fd, &sub_dir_path_c, mode);
        ^
/workspace/lib/std/fs.zig:1285:9: 0x326f366 in std.fs.Dir.makeDir (zig1)
        try os.mkdirat(self.fd, sub_path, default_new_dir_mode);
        ^
/workspace/src/link/Elf.zig:1795:29: 0x33a59f7 in link.Elf.linkWithLLD (zig1)
                            return error.LLDReportedFailure;
                            ^
/workspace/src/link/Elf.zig:946:9: 0x3392100 in link.Elf.flush (zig1)
        return self.linkWithLLD(comp, prog_node);
        ^
/workspace/src/link.zig:602:21: 0x338f044 in link.File.flush (zig1)
            .elf => return @fieldParentPtr(Elf, "base", base).flush(comp, prog_node),
                    ^
/workspace/src/Compilation.zig:2189:5: 0x335c257 in Compilation.flush (zig1)
    try comp.bin_file.flush(comp, prog_node); // This is needed before reading the error flags.
    ^
/workspace/src/Compilation.zig:2158:13: 0x33519e5 in Compilation.update (zig1)
            try comp.flush(main_progress_node);
            ^
/workspace/src/main.zig:3085:5: 0x32dd12d in main.updateModule (zig1)
    try comp.update();
    ^
/workspace/src/main.zig:2776:21: 0x32361aa in main.buildOutputType (zig1)
        else => |e| return e,
                    ^
/workspace/src/main.zig:225:9: 0x31ee258 in main.mainArgs (zig1)
        return buildOutputType(gpa, arena, args, .zig_test);
        ^
/workspace/lib/std/os.zig:2708:19: 0x326f59d in std.os.mkdiratZ (zig1)
        .EXIST => return error.PathAlreadyExists,
                  ^
/workspace/lib/std/os.zig:2668:9: 0x326f45c in std.os.mkdirat (zig1)
        return mkdiratZ(dir_fd, &sub_dir_path_c, mode);
        ^
/workspace/lib/std/fs.zig:1285:9: 0x326f366 in std.fs.Dir.makeDir (zig1)
        try os.mkdirat(self.fd, sub_path, default_new_dir_mode);
        ^
/workspace/lib/std/os.zig:2708:19: 0x326f59d in std.os.mkdiratZ (zig1)
        .EXIST => return error.PathAlreadyExists,
                  ^
/workspace/lib/std/os.zig:2668:9: 0x326f45c in std.os.mkdirat (zig1)
        return mkdiratZ(dir_fd, &sub_dir_path_c, mode);
        ^
/workspace/lib/std/fs.zig:1285:9: 0x326f366 in std.fs.Dir.makeDir (zig1)
        try os.mkdirat(self.fd, sub_path, default_new_dir_mode);
        ^
/workspace/lib/std/os.zig:2708:19: 0x326f59d in std.os.mkdiratZ (zig1)
        .EXIST => return error.PathAlreadyExists,
                  ^
/workspace/lib/std/os.zig:2668:9: 0x326f45c in std.os.mkdirat (zig1)
        return mkdiratZ(dir_fd, &sub_dir_path_c, mode);
        ^
/workspace/lib/std/fs.zig:1285:9: 0x326f366 in std.fs.Dir.makeDir (zig1)
        try os.mkdirat(self.fd, sub_path, default_new_dir_mode);
        ^
/workspace/lib/std/os.zig:2708:19: 0x326f59d in std.os.mkdiratZ (zig1)
        .EXIST => return error.PathAlreadyExists,
                  ^
/workspace/lib/std/os.zig:2668:9: 0x326f45c in std.os.mkdirat (zig1)
        return mkdiratZ(dir_fd, &sub_dir_path_c, mode);
        ^
/workspace/lib/std/fs.zig:1285:9: 0x326f366 in std.fs.Dir.makeDir (zig1)
        try os.mkdirat(self.fd, sub_path, default_new_dir_mode);
        ^
/workspace/lib/std/os.zig:2708:19: 0x326f59d in std.os.mkdiratZ (zig1)
        .EXIST => return error.PathAlreadyExists,
                  ^
/workspace/lib/std/os.zig:2668:9: 0x326f45c in std.os.mkdirat (zig1)
        return mkdiratZ(dir_fd, &sub_dir_path_c, mode);
        ^
???:?:?: 0x31efa42 in ??? (???)
/workspace/src/stage1.zig:48:43: 0x31edaa9 in main (zig1)
        stage2.mainArgs(gpa, arena, args) catch unreachable;
                                          ^
???:?:?: 0x8715a30 in ??? (???)
test...The following command terminated unexpectedly:
/workspace/_debug/staging/bin/zig test /workspace/lib/std/std.zig --test-name-prefix std-powerpc-linux-none-Debug-bare-multi  --cache-dir /workspace/zig-cache --global-cache-dir /root/.cache/zig --name test -fno-single-threaded -target powerpc-linux-none -mcpu ppc --test-cmd qemu-ppc --test-cmd-bin -I /workspace/test --zig-lib-dir /workspace/lib --enable-cache 
error: the following build command failed with exit code 6:
/workspace/zig-cache/o/318f55ba7ce810cb55d4861bb3f11572/build /workspace/_debug/staging/bin/zig /workspace /workspace/zig-cache /root/.cache/zig test-std -fqemu -fwasmtime

@tiehuis
Copy link
Member Author

tiehuis commented May 3, 2022

Reproduce locally with zig test lib/std/std.zig --zig-lib-dir lib -target powerpc-linux-none -mcpu ppc --test-cmd qemu-ppc --test-cmd-bin

PPC __floatuntikf is equivalent to X86 __floatuntitf.
@andrewrk andrewrk merged commit 6317da8 into ziglang:master May 3, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

std.fmt.parseFloat does not parse 9007199254740993.0 correctly robust float parsing in the standard library
2 participants