number_converter is slow #422
Comments
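Thanks for looking into this. My original focus was on correctness over performance, but with such significant gains possible from small changes, this is probably worth looking more into. I always liked the nlohmann/json codebase as they have high quality code and many of the same problems as xlnt. Looks like their approach for this problem was basically to convert the decimal separator into a comma depending on the locale and then use strtod. Here's the PR nlohmann/json#450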
For performance (at the cost of an additional library requirement), you
could look at the Boost Qi library, which does fast text to double/float parsing.
I also found this interesting library recently:
https://github.com/ulfjack/ryu
It converts from a float to text (shortest accurate representation).
…On Sun, 17 Nov 2019 at 20:26, Thomas Fussell ***@***.***> wrote:
Thanks for looking into this. My original focus was on correctness over
performance, but with such significant gains possible from small changes,
this is probably worth looking more into. I always liked the nlohmann/json
codebase as they have high quality code and many of the same problems as
xlnt. Looks like their approach for this problem was basically to convert
the decimal separator into a comma depending on the locale and then use
strtod. Here's the PR nlohmann/json#450
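A minimal sketch of the idea described above (rewrite the '.' to whatever separator the current C locale expects, then call strtod). This is not the code from the linked PR; `parse_with_locale_separator` is a hypothetical name, and the sketch assumes a single-character decimal separator.

```cpp
#include <clocale>
#include <cstdlib>
#include <string>

// Hypothetical helper, not part of xlnt or nlohmann/json.
// The input always uses '.' as the decimal separator; it is rewritten to the
// separator of the current C locale so that strtod accepts it.
double parse_with_locale_separator(std::string text)
{
    const std::lconv *conv = std::localeconv();
    // Assumption: the separator is one char; fall back to '.' otherwise.
    const char point = (conv && conv->decimal_point && conv->decimal_point[0] != '\0')
        ? conv->decimal_point[0]
        : '.';

    if (point != '.')
    {
        for (char &c : text)
        {
            if (c == '.') c = point;
        }
    }

    return std::strtod(text.c_str(), nullptr);
}
```

The cost is one localeconv call and one pass over the string per conversion, which is still likely much cheaper than constructing an istringstream.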
@tfussell @paulharris The other option that is part of std is C++17's to_chars / from_chars, but only MSVC has implemented it so far AFAIK. Hence that would still require the preprocessor to detect support (luckily, the standard feature-test macros are well documented). I will test std::from_chars and see if the trickery is worthwhile, but I don't expect the gain to be on the same order as dropping istringstream (it could be quite noticeable if we factor in having to scan for commas though...)
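A rough sketch of what that preprocessor detection could look like. `parse_double` is a hypothetical name, not xlnt's number_converter, and the sketch assumes that a standard library only defines `__cpp_lib_to_chars` once its floating-point overloads are available, which is worth verifying per toolchain.

```cpp
#include <charconv>
#include <locale>
#include <sstream>
#include <string>

// Hypothetical helper, not xlnt's actual number_converter.
double parse_double(const std::string &text)
{
#if defined(__cpp_lib_to_chars)
    // std::from_chars ignores the global locale and always expects '.'.
    double value = 0.0;
    const char *first = text.data();
    const char *last = first + text.size();
    std::from_chars(first, last, value); // error code ignored in this sketch
    return value;
#else
    // Fallback for toolchains without <charconv> floating-point support:
    // stream parsing pinned to the classic "C" locale.
    std::istringstream stream(text);
    stream.imbue(std::locale::classic());
    double value = 0.0;
    stream >> value;
    return value;
#endif
}
```

Note that from_chars is stricter than strtod (no leading whitespace, no leading '+'), so the two branches are not exact drop-in replacements for each other.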
Pushed the suggested change in #421 as it cleans up quite a hacky section.
Testing MSVC's std::from_chars dropped another 1-2%, which would indicate to me that you're unlikely to see massive gains. As it happens, xlnt has issues when the locale is not set to use '.' as the decimal point. There are a handful of locations during parsing where xml::parser::attribute<double> is used, which internally just uses stringstream without imbuing the "C" locale, so...
and fixing those pushed the bench maybe 1-2% the other way using the "de-DE" (German) locale. Should probably add a test so that doesn't break again... (EDIT: test added, test removed, CI doesn't know about locales, and I can't be bothered right now)
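To illustrate the stringstream behaviour without depending on which locales are installed (the CI problem mentioned above), here is a small sketch that emulates a comma-decimal locale with a custom facet. `comma_decimal` and `parse_with_locale` are hypothetical names; this is not the xml::parser code.

```cpp
#include <locale>
#include <sstream>
#include <string>

// Hypothetical facet that emulates a locale using ',' as the decimal point,
// so no system locale such as "de-DE" needs to be installed.
struct comma_decimal : std::numpunct<char>
{
    char do_decimal_point() const override { return ','; }
};

// Parses text with a stream imbued with the given locale.
double parse_with_locale(const std::string &text, const std::locale &loc)
{
    std::istringstream stream(text);
    stream.imbue(loc);
    double value = 0.0;
    stream >> value;
    return value;
}

// The locale takes ownership of the facet and deletes it when done.
std::locale make_comma_locale()
{
    return std::locale(std::locale::classic(), new comma_decimal);
}
```

With the classic locale, `parse_with_locale("1.5", std::locale::classic())` yields 1.5. With `make_comma_locale()` the same input stops at the '.', so the fractional part is lost, which is the failure mode when a stream picks up a non-"C" global locale.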
I think this is fixed now. Going to close it.
Related to #421
It was noticed during benchmarking that, at least with MSVC 16 (VS 2019), number_converter::stold had a very significant impact on load times (20% throughput was gained by switching to strtod). number_converter internally uses istringstream.
strtod will break when the set locale uses ',' as the decimal point, so it isn't an option, but it shows that there is a need for an alternative to the current string->double routine. strtod_l is a widely supported extension (POSIX 2002 (?)) that would fit the requirements.
The best solution to this would be an external library which either just does locale-independent double parsing (note: this is a hard problem, many libs have issues), or wraps all the platform nastiness away. A mediocre solution would be to detect known platforms and #ifdef in the appropriate strtod_l machinery, leaving the slow number_converter as the fallback for unknown toolchains (as has been done in the linked PR).
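A sketch of what the "mediocre" #ifdef approach might look like. This is not the code from the linked PR; `parse_double_c_locale` is a hypothetical name, and the platform list (MSVC, glibc, macOS, FreeBSD) is an assumption about where the `_l` variants are available.

```cpp
#include <cstdlib>
#include <locale>
#include <sstream>
#include <string>

#if defined(_MSC_VER)
#include <locale.h>
#elif defined(__GLIBC__) || defined(__APPLE__) || defined(__FreeBSD__)
#include <locale.h>
#if defined(__APPLE__) || defined(__FreeBSD__)
#include <xlocale.h>
#endif
#endif

// Hypothetical helper: parse a double using the "C" locale regardless of
// the global locale. The locale object is created once and never freed.
double parse_double_c_locale(const std::string &text)
{
#if defined(_MSC_VER)
    static const _locale_t c_locale = _create_locale(LC_ALL, "C");
    return _strtod_l(text.c_str(), nullptr, c_locale);
#elif defined(__GLIBC__) || defined(__APPLE__) || defined(__FreeBSD__)
    static const locale_t c_locale =
        newlocale(LC_ALL_MASK, "C", static_cast<locale_t>(0));
    return strtod_l(text.c_str(), nullptr, c_locale);
#else
    // Unknown toolchain: slow but portable stream-based fallback.
    std::istringstream stream(text);
    stream.imbue(std::locale::classic());
    double value = 0.0;
    stream >> value;
    return value;
#endif
}
```

Error handling (a null locale handle, unparsed trailing characters) is omitted here and would need to be added in real code.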