add new float-parser based on eisel-lemire algorithm #11566
Conversation
```
@@ -405,8 +37,8 @@ test "fmt.parseFloat" {
    try expect(approxEqAbs(T, try parseFloat(T, "3.141"), 3.141, epsilon));
    try expect(approxEqAbs(T, try parseFloat(T, "-3.141"), -3.141, epsilon));

    try expectEqual(try parseFloat(T, "1e-700"), 0);
    try expectEqual(try parseFloat(T, "1e+700"), std.math.inf(T));
    try expectEqual(try parseFloat(T, "1e-5000"), 0);
```
Modified since we now support f128 with correct bounds for inf, so the previous value is now in range and doesn't overflow.
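As a back-of-the-envelope check (mine, not from the PR): f128's 15-bit exponent gives it a far wider range than 1e±700, which is why the old boundary values no longer overflow:

```python
from decimal import Decimal, getcontext

getcontext().prec = 40

# IEEE binary128: 15-bit exponent (max unbiased exponent 16383), 112-bit mantissa.
f128_max = (Decimal(2) - Decimal(2) ** -112) * Decimal(2) ** 16383
f128_true_min = Decimal(2) ** -16494  # smallest subnormal

print(f128_max)       # ~1.19e+4932: "1e+700" is finite in f128, no overflow
print(f128_true_min)  # ~6.48e-4966: "1e-5000" still underflows to 0
```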
```
try testing.expectEqual(try parseFloat(f128, "-0x1p-16494"), -math.floatTrueMin(f128));

// NOTE: We are performing round-to-even. Previous behavior was round-up.
// try testing.expectEqual(try parseFloat(f128, "0x1.edcb34a235253948765432134674fp-1"), 0x1.edcb34a235253948765432134674fp-1);
```
This is a specific example where I presume the rounding difference is a factor, since we are off by one here. I haven't done a close verification, but once stage2 is the base compiler and we have consistent rounding across all implementations, these errors should cease.
```
.mantissa_explicit_bits = std.math.floatMantissaBits(T),
.infinite_power = 0x1f,
// Eisel-Lemire
.smallest_power_of_ten = -26, // TODO: refine, fails one test
```
This note refers to one failing test case for f16:

`f16: 0000 33000000 3E60000000000000 2.98023223876953125E-8: found 0x1`
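For what it's worth, my own reading of this case (an assumption, not verified against the parser internals): the input is exactly 2^-25, an exact halfway tie between zero (bits 0x0000) and the smallest f16 subnormal 2^-24 (bits 0x0001), so a round-up tie-break produces exactly the observed 0x1:

```python
from fractions import Fraction

v = Fraction("2.98023223876953125E-8")
# The smallest f16 subnormal is 2**-24 (bits 0x0001); the input is exactly
# half of it, i.e. an exact tie between 0x0000 and 0x0001.
assert v == Fraction(1, 2 ** 25)
# Round-to-even resolves the tie to the even candidate (0x0000, i.e. zero);
# a round-up tie-break yields 0x0001 -- the "found 0x1" in the entry above.
```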
```
    return ((q *% (152170 + 65536)) >> 16) + 63;
}

const U128 = struct {
```
We don't need this type and should use `u128` directly. I will likely follow up with this change after measuring performance.
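As context for the exponent-estimate line at the top of this hunk (my own annotation): (152170 + 65536) / 2^16 ≈ log2(10), so `((q *% (152170 + 65536)) >> 16) + 63` is a fast integer approximation of floor(log2(10^q)) + 63, the standard Eisel-Lemire power estimate. A sketch verifying the identity over the decimal-exponent range I assume the fast f64 path uses:

```python
def floor_log2_pow10(q: int) -> int:
    # Exact floor(log2(10**q)) using only integer arithmetic.
    if q >= 0:
        return (10 ** q).bit_length() - 1
    # For q < 0: floor(log2(10**q)) == -ceil(log2(10**-q)).
    return -((10 ** -q - 1).bit_length())

# (152170 + 65536) / 2**16 ~= 3.32193 ~= log2(10). Python's >> floors on
# negative ints, matching the arithmetic shift in the Zig code.
for q in range(-342, 309):  # assumed exponent range for the fast f64 path
    assert ((q * (152170 + 65536)) >> 16) + 63 == floor_log2_pow10(q) + 63
print("estimate matches floor(log2(10**q)) + 63 over the assumed range")
```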
```
    return null;
}

// TODO: x86 (no SSE/SSE2) requires x87 FPU to be setup correctly with fldcw
```
This is noted in the algorithm's paper. You can see an example implementation in the Rust tree: https://github.com/rust-lang/rust/blob/6b6c1ffacc5df738b3560369746d87499adbeae1/library/core/src/num/dec2flt/fpu.rs

I skipped the implementation for the moment since I do not have a test setup for this, but it should be easy to implement.
I expect adding support for …
Will review test failures tomorrow. I suspect rounding differences, but TBD.
I had a look at the failures in your branch - most of the problems are solved by 9af4cad, which I just pushed to master. There was one remaining problem, which is …
So this specific test-case looks incorrect. I've included a small script as well to see the difference between the f128 values which bound the true value, in decimal form:

```python
from decimal import Decimal, getcontext

getcontext().prec = 64

# The trailing "p-1" is a binary exponent, i.e. a factor of 2**-1 (exact in Decimal).
half = Decimal("0.5")

af = "0x1.edcb34a235253948765432134674fp-1"
accurate = half * (Decimal(1) +
                   Decimal(0xedcb34a235253948765432134674f) /
                   Decimal(0x100000000000000000000000000000))

lf = "0x1.edcb34a235253948765432134674p-1"
lo = half * (Decimal(1) +
             Decimal(0xedcb34a235253948765432134674) /
             Decimal(0x10000000000000000000000000000))

hf = "0x1.edcb34a235253948765432134675p-1"
hi = half * (Decimal(1) +
             Decimal(0xedcb34a235253948765432134675) /
             Decimal(0x10000000000000000000000000000))

print(hf, abs(accurate - hi))
print(lf, abs(accurate - lo))
```
The issue looks to have been introduced here: 7f024d6, specifically the check there. The "fix" for this PR (which I will test later) should be modifying the test-case to use the correctly rounded bit representation:
This is only to get tests running again. The root issue should be fixed in stage1 so rounding is consistent between stages.
Looks like the failure is this one:
You should be able to reproduce it like this:
Current CI failure. Missing a few new compiler_rt functions for f128 to u128 conversion.
Reproduce locally with …
PPC `__floatuntikf` is equivalent to x86 `__floatuntitf` (PowerPC names the IEEE binary128 mode `kf` where x86 uses `tf`).
Pushed this a bit earlier than I intended since progress in other areas was blocked by #11169. I have run a few local tests and modified the one use of `parseHexFloat` in stage2, but haven't run all tests yet. Further, I would be interested in differences between stage1 and stage2 behavior with regard to rounding. It may not impact any tests.
Note that there is one failing test-case for f16 that I haven't fixed. Specifically, it is this entry:

`f16: 0000 33000000 3E60000000000000 2.98023223876953125E-8: found 0x1`

I don't consider this a blocker. For more information, see this repo: https://github.com/tiehuis/zig-parsefloat
commit message
The previous float-parsing method was lacking in a lot of areas. This
commit introduces to std a state-of-the-art implementation that is both
accurate and fast.
Code is derived from working repo https://github.com/tiehuis/zig-parsefloat.
This includes more test-cases and performance numbers than are present
in this commit.
* Accuracy

The primary testing regime has been using test-data found at
https://github.com/tiehuis/parse-number-fxx-test-data. This is a fork of
upstream with support for f128 test-cases added. This data has been
verified against other independent implementations and represents
accurate round-to-even IEEE-754 floating point semantics.
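As an illustration of the corpus layout (inferred from the failing line quoted earlier in this thread: whitespace-separated f16/f32/f64 bit patterns followed by the input string; the f128 column added by the fork is omitted here), the f64 column can be cross-checked against CPython's correctly rounded float():

```python
import struct

# One corpus line quoted earlier in this thread (f128 column omitted).
line = "0000 33000000 3E60000000000000 2.98023223876953125E-8"
f16_bits, f32_bits, f64_bits, text = line.split()

expected = struct.unpack(">d", bytes.fromhex(f64_bits))[0]
assert expected == float(text)  # CPython's float() is correctly rounded
print(text, "->", expected)
```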
* Performance

Compared to the existing parseFloat implementation, there is a ~5-10x
performance improvement using the above corpus (f128 parsing excluded
in the measurements below).
** Old

    $ time ./test_all_fxx_data
    3520298/5296694 succeeded (1776396 fail)

    ________________________________________________________
    Executed in   28.68 secs   fish           external
       usr time   28.48 secs    0.00 micros   28.48 secs
       sys time    0.08 secs  694.00 micros    0.08 secs
** This Implementation

    $ time ./test_all_fxx_data
    5296693/5296694 succeeded (1 fail)

    ________________________________________________________
    Executed in    4.54 secs   fish           external
       usr time    4.37 secs  515.00 micros    4.37 secs
       sys time    0.10 secs  171.00 micros    0.10 secs
Further performance numbers can be seen using the
https://github.com/tiehuis/simple_fastfloat_benchmark/ repository, which
compares against some other well-known string-to-float conversion
functions. A breakdown can be found here:
https://github.com/tiehuis/zig-parsefloat/blob/0d9f020f1a37ca88bf889703b397c1c41779f090/PERFORMANCE.md#commit-b15406a0d2e18b50a4b62fceb5a6a3bb60ca5706
In summary, we are within 20% of the C++ reference implementation and
have about ~600-700MB/s throughput on an Intel i5-6500 @ 3.5GHz.
* F128 Support

Finally, f128 is now completely supported with full accuracy. This does
use a slower path, which can be improved in the future.
* Behavioural Changes

There are a few behavioural changes to note.

- `parseHexFloat` is now redundant; hex floats are now supported
  directly in `parseFloat`.
- We implement round-to-even in all parsing routines, as specified by
  IEEE-754. Previous code used different rounding mechanisms (standard
  parsing was round-to-zero, hex-parsing looked to use round-up) so
  there may be subtle differences (see the sketch below).
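To make the rounding change concrete, here is a small illustration (mine, not from the commit) using a decimal string that is an exact halfway case in f64:

```python
import math

# Exactly 1 + 2**-53: an exact tie between 1.0 and the next f64 up.
tie = "1.00000000000000011102230246251565404236316680908203125"

print(float(tie))                # 1.0: round-to-even picks the even mantissa
print(math.nextafter(1.0, 2.0))  # 1.0000000000000002: what a round-up
                                 # tie-break (the old hex path) would give
```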
Closes #2207.
Fixes #11169.