-
Notifications
You must be signed in to change notification settings - Fork 4.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Suboptimal code and possible loss of precision in Stopwatch.GetElapsedTime(long, long)
#109685
Comments
Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch |
Stopwatch.GetElapsedTime(long, long)
Stopwatch.GetElapsedTime(long, long)
Tagging subscribers to this area: @dotnet/area-system-runtime |
Your micro-benchmark numbers show improvement of less than 1 nanoseconds. I do not think that less than 1 nanosecond improvement is worth the added complexity for this API. |
I have used this API in tight measurement loops before. Although the gains are <1ns, they are proportionally significant to baseline. While I'm sure it can be different on different processors, isn't What about adding a conditional to check if they are equal and special casing that scenario to just use subtraction? It should be eliminated by the JIT due to the static readonly promotion to const. Very little extra complexity for what could be used as a latency sensitive API? EDIT |
It would not be eliminated for AOT, so the proposed change would be an improvement for JIT and regression for AOT (in some cases at least). |
I should certainly hope not! |
I accidentally measured reciprocal throughput instead of latency. |
If you change Stopwatch.GetElapsedTime, please consider changing TimeProvider.GetElapsedTime as well.
|
Description
Stopwatch.GetElapsedTime
currently uses double-precision floating-point multiplication in order to convert the units of time.runtime/src/libraries/System.Private.CoreLib/src/System/Diagnostics/Stopwatch.cs
Line 27 in 5db0ce0
runtime/src/libraries/System.Private.CoreLib/src/System/Diagnostics/Stopwatch.cs
Lines 133 to 134 in 5db0ce0
This may result in a codegen that looks like this:
The bunch of double-precision floating-point instructions needed for conversion can be replaced with one of the following methods:
TimeSpan.TicksPerSecond
is equal toStopwatch.Frequency
TimeSpan.TicksPerSecond
is an integer multiple ofStopwatch.Frequency
Stopwatch.Frequency
is an integer multiple ofTimeSpan.TicksPerSecond
Stopwatch.Frequency
is not an integer multiple ofTimeSpan.TicksPerSecond
Math.BigMul
, so it might be slower than current method in certain environment.It could improve not only performance, but also precision for long durations (if we're allowed).$$2^{53}$$ (which is about 28.5 years), the lower bits of the ticks are unnecessarily rounded. Even if the rounding doesn't really matter for almost 100% of the applications of this API (because almost nobody wants to measure more than a decade with this API anyway), the performance improvements for trivial cases are worth doing.
Due to the
long
todouble
conversion, if the absolute value of the ticks is greater thanConfiguration
Regression?
Unknown
Data
Benchmark Code
Benchmark Disassembly
.NET 9.0.0 (9.0.24.47305), X64 RyuJIT AVX-512F+CD+BW+DQ+VL+VBMI
.NET 9.0.0 (9.0.24.47305), X64 RyuJIT AVX-512F+CD+BW+DQ+VL+VBMI
.NET 9.0.0 (9.0.24.47305), X64 RyuJIT AVX-512F+CD+BW+DQ+VL+VBMI
.NET 9.0.0 (9.0.24.47305), X64 RyuJIT AVX-512F+CD+BW+DQ+VL+VBMI
.NET 9.0.0 (9.0.24.47305), X64 RyuJIT AVX-512F+CD+BW+DQ+VL+VBMI
.NET 9.0.0 (9.0.24.47305), X64 RyuJIT AVX-512F+CD+BW+DQ+VL+VBMI
Analysis
The text was updated successfully, but these errors were encountered: