Improve DateTime{Offset} formatting further in a variety of cases #84963

Merged
merged 4 commits into from
Apr 19, 2023


stephentoub
Member

  • Utf8Formatter special-cases 'G' / the default format. By moving that specialized implementation into DateTimeFormat, DateTime{Offset}.ToString/TryFormat can benefit from it as well. This was the last custom formatting routine in Utf8Formatter. As part of deleting two Utf8Formatter.Date.* files, I also changed the TryFormatDateTimeL shim function to go directly to TryFormatR rather than going through the wrapping TryFormat function.

  • The "s" and "u" formats are also reasonably popular and have a fixed pattern that's not sensitive to culture. By writing custom routines for those, we can not only speed them up, but also restructure the calling code to avoid needing the FormatIntoBuilder for some default/G cases.

  • The above requires being able to check whether a provider is invariant, but DateTimeFormatInfo.InvariantInfo and CultureInfo.InvariantCulture.DateTimeFormat actually returned different instances. I made them the same instance, such that we can now just compare against DateTimeFormatInfo.InvariantInfo to handle the 99.9% case of checking whether a DTFI represents the invariant culture. This also makes access to DTFI.InvariantInfo faster, as it's now just returning a static readonly field rather than a lazily-initialized volatile field.

  • For "U", we were allocating a new DateTimeFormatInfo and GregorianCalendar (and supporting strings) on every formatting, resulting in ~1K of allocation. I fixed it to only allocate when necessary, which is rare.

  • For formats without a fast path, TryFormat ends up formatting into a ValueListBuilder and then copying from that into the destination span. We can save on a copy by seeding the ValueListBuilder with the destination span itself.

  • Removed some bounds checking in ParseRepeatPattern and various DateTimeFormatInfo.GetXx helpers.

  • Removed a % from 'h' handling in FormatCustomized.

  • Plus a few renames and changing DateTimeOffsetPattern to not access a lazily-initialized property on every iteration of the search loop.

  • And Tarek noticed one place we were appending an unchecked char that could have been non-ASCII, so I fixed that to check appropriately. Fixes #84763 (Some DateTimeTest format tests fail on the upcoming Fedora 39).

  • Added a bunch of test cases.
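To make the "s" fast path and the invariant check from the list above concrete, here is a minimal sketch. It is not the code from this PR: `SortableFormatSketch`, `TryFormatSortable`, `WriteDigits`, and `IsInvariant` are hypothetical names, and the layout is simply the documented "s" pattern (`yyyy-MM-ddTHH:mm:ss`) written directly into the destination span.

```csharp
using System;
using System.Globalization;

public static class SortableFormatSketch
{
    // Hypothetical helper, not the runtime's implementation: writes the fixed
    // 19-character "s" layout (yyyy-MM-ddTHH:mm:ss) straight into the span.
    public static bool TryFormatSortable(DateTime value, Span<char> destination, out int charsWritten)
    {
        const int Length = 19;
        if (destination.Length < Length)
        {
            charsWritten = 0;
            return false;
        }

        WriteDigits(value.Year, destination.Slice(0, 4));
        destination[4] = '-';
        WriteDigits(value.Month, destination.Slice(5, 2));
        destination[7] = '-';
        WriteDigits(value.Day, destination.Slice(8, 2));
        destination[10] = 'T';
        WriteDigits(value.Hour, destination.Slice(11, 2));
        destination[13] = ':';
        WriteDigits(value.Minute, destination.Slice(14, 2));
        destination[16] = ':';
        WriteDigits(value.Second, destination.Slice(17, 2));

        charsWritten = Length;
        return true;
    }

    // Fills the whole span with the zero-padded decimal digits of value.
    private static void WriteDigits(int value, Span<char> destination)
    {
        for (int i = destination.Length - 1; i >= 0; i--)
        {
            destination[i] = (char)('0' + value % 10);
            value /= 10;
        }
    }

    // Reference comparison against the shared invariant instance. Passing
    // CultureInfo.InvariantCulture is only expected to match on runtimes
    // where both properties return the same object, as this PR describes.
    public static bool IsInvariant(IFormatProvider provider) =>
        ReferenceEquals(DateTimeFormatInfo.GetInstance(provider), DateTimeFormatInfo.InvariantInfo);
}
```

Because the pattern has a fixed length and contains only ASCII digits and separators, a routine of this shape needs a single length check and no pattern parsing, which is presumably where most of the savings for "s" and "u" come from.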

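using System;
using System.Globalization;
using BenchmarkDotNet.Attributes;

// The class name is illustrative; only the members below come from the PR description.
public class DateTimeFormatBenchmarks
{
    private DateTime _dt = DateTime.Now;
    private char[] _chars = new char[100];

    // A null format selects the default ("G") formatting path.
    [Params(null, "r", "s", "u", "U", "G")]
    public string Format { get; set; }

    [Benchmark] public string DT_ToString() => _dt.ToString(Format);
    [Benchmark] public string DT_ToStringInvariant() => _dt.ToString(Format, CultureInfo.InvariantCulture);
    [Benchmark] public bool DT_TryFormat() => _dt.TryFormat(_chars, out _, Format);
    [Benchmark] public bool DT_TryFormatInvariant() => _dt.TryFormat(_chars, out _, Format, CultureInfo.InvariantCulture);
}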

| Method | Toolchain | Format | Mean | Ratio | Allocated |
|---|---|---|---:|---:|---:|
| DT_ToString | \main\corerun.exe | ? | 172.66 ns | 1.00 | 64 B |
| DT_ToString | \pr\corerun.exe | ? | 160.81 ns | 0.89 | 64 B |
| DT_ToStringInvariant | \main\corerun.exe | ? | 132.39 ns | 1.00 | 64 B |
| DT_ToStringInvariant | \pr\corerun.exe | ? | 28.41 ns | 0.21 | 64 B |
| DT_TryFormat | \main\corerun.exe | ? | 157.31 ns | 1.00 | - |
| DT_TryFormat | \pr\corerun.exe | ? | 142.89 ns | 0.90 | - |
| DT_TryFormatInvariant | \main\corerun.exe | ? | 116.10 ns | 1.00 | - |
| DT_TryFormatInvariant | \pr\corerun.exe | ? | 20.18 ns | 0.17 | - |
| DT_ToString | \main\corerun.exe | G | 160.39 ns | 1.00 | 64 B |
| DT_ToString | \pr\corerun.exe | G | 157.47 ns | 0.98 | 64 B |
| DT_ToStringInvariant | \main\corerun.exe | G | 128.69 ns | 1.00 | 64 B |
| DT_ToStringInvariant | \pr\corerun.exe | G | 123.65 ns | 0.96 | 64 B |
| DT_TryFormat | \main\corerun.exe | G | 152.03 ns | 1.00 | - |
| DT_TryFormat | \pr\corerun.exe | G | 139.09 ns | 0.92 | - |
| DT_TryFormatInvariant | \main\corerun.exe | G | 113.62 ns | 1.00 | - |
| DT_TryFormatInvariant | \pr\corerun.exe | G | 106.79 ns | 0.94 | - |
| DT_ToString | \main\corerun.exe | U | 1,354.17 ns | 1.00 | 1280 B |
| DT_ToString | \pr\corerun.exe | U | 279.74 ns | 0.20 | 96 B |
| DT_ToStringInvariant | \main\corerun.exe | U | 1,263.04 ns | 1.00 | 1272 B |
| DT_ToStringInvariant | \pr\corerun.exe | U | 251.18 ns | 0.20 | 88 B |
| DT_TryFormat | \main\corerun.exe | U | 1,317.94 ns | 1.00 | 1184 B |
| DT_TryFormat | \pr\corerun.exe | U | 260.80 ns | 0.20 | - |
| DT_TryFormatInvariant | \main\corerun.exe | U | 1,208.62 ns | 1.00 | 1184 B |
| DT_TryFormatInvariant | \pr\corerun.exe | U | 227.11 ns | 0.19 | - |
| DT_ToString | \main\corerun.exe | r | 33.33 ns | 1.00 | 80 B |
| DT_ToString | \pr\corerun.exe | r | 34.75 ns | 1.04 | 80 B |
| DT_ToStringInvariant | \main\corerun.exe | r | 27.20 ns | 1.00 | 80 B |
| DT_ToStringInvariant | \pr\corerun.exe | r | 27.21 ns | 1.00 | 80 B |
| DT_TryFormat | \main\corerun.exe | r | 22.89 ns | 1.00 | - |
| DT_TryFormat | \pr\corerun.exe | r | 24.20 ns | 1.06 | - |
| DT_TryFormatInvariant | \main\corerun.exe | r | 22.52 ns | 1.00 | - |
| DT_TryFormatInvariant | \pr\corerun.exe | r | 23.82 ns | 1.06 | - |
| DT_ToString | \main\corerun.exe | s | 142.13 ns | 1.00 | 64 B |
| DT_ToString | \pr\corerun.exe | s | 24.37 ns | 0.17 | 64 B |
| DT_ToStringInvariant | \main\corerun.exe | s | 139.41 ns | 1.00 | 64 B |
| DT_ToStringInvariant | \pr\corerun.exe | s | 24.31 ns | 0.17 | 64 B |
| DT_TryFormat | \main\corerun.exe | s | 130.19 ns | 1.00 | - |
| DT_TryFormat | \pr\corerun.exe | s | 16.07 ns | 0.12 | - |
| DT_TryFormatInvariant | \main\corerun.exe | s | 121.87 ns | 1.00 | - |
| DT_TryFormatInvariant | \pr\corerun.exe | s | 16.39 ns | 0.13 | - |
| DT_ToString | \main\corerun.exe | u | 144.65 ns | 1.00 | 64 B |
| DT_ToString | \pr\corerun.exe | u | 24.73 ns | 0.17 | 64 B |
| DT_ToStringInvariant | \main\corerun.exe | u | 139.60 ns | 1.00 | 64 B |
| DT_ToStringInvariant | \pr\corerun.exe | u | 25.51 ns | 0.18 | 64 B |
| DT_TryFormat | \main\corerun.exe | u | 131.17 ns | 1.00 | - |
| DT_TryFormat | \pr\corerun.exe | u | 15.95 ns | 0.12 | - |
| DT_TryFormatInvariant | \main\corerun.exe | u | 124.86 ns | 1.00 | - |
| DT_TryFormatInvariant | \pr\corerun.exe | u | 16.96 ns | 0.14 | - |
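
The "s" and "u" rows look nearly identical with and without an explicit invariant provider, which fits the description of those formats as culture-insensitive. A small check of that property might look like the following; `CultureInsensitiveFormats` and `SameForAllCultures` are hypothetical names, and the culture names are arbitrary examples that assume culture data is available (that is, the process is not running in invariant-globalization mode).

```csharp
using System;
using System.Globalization;

public static class CultureInsensitiveFormats
{
    // Returns true when every listed culture formats the value exactly as the
    // invariant culture does for the given standard format specifier.
    public static bool SameForAllCultures(DateTime value, string format, params string[] cultureNames)
    {
        string expected = value.ToString(format, CultureInfo.InvariantCulture);
        foreach (string name in cultureNames)
        {
            string actual = value.ToString(format, CultureInfo.GetCultureInfo(name));
            if (actual != expected)
            {
                return false;
            }
        }
        return true;
    }
}
```

For example, `SameForAllCultures(new DateTime(2023, 4, 19, 13, 6, 42), "s", "fr-FR", "de-DE", "ja-JP")` should return `true`, with every culture producing `2023-04-19T13:06:42`; the same value under "u" gives `2023-04-19 13:06:42Z`.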

@stephentoub requested a review from tarekgh on Apr 18, 2023
@dotnet-issue-labeler (bot) added the needs-area-label label on Apr 18, 2023
@stephentoub
Member Author

Some failures in the new test cases due to timezone disparity... will fix tomorrow...

@tmds
Member

tmds commented Apr 18, 2023

@stephentoub when I run this against Fedora 38, I'm seeing two types of failures.

Unexpected values, like:

      System.Tests.DateTimeTests.TryFormat_MatchesExpected(dateTime: 8053-01-08T13:06:42.6940446, format: "U", provider: , expected: "Wednesday, 08 January 8053 18:06:42") [FAIL]
                                         ↓ (pos 28)
        Expected: ···y, 08 January 8053 18:06:42
        Actual:   ···y, 08 January 8053 13:06:42
                                         ↑ (pos 28)
        Stack Trace:
          /home/tester/runtime/src/libraries/System.Runtime/tests/System/DateTimeTests.cs(2924,0): at System.Tests.DateTimeTests.TryFormat_MatchesExpected(DateTime dateTime, String format, IFormatProvider provider, String expected)
             at InvokeStub_DateTimeTests.TryFormat_MatchesExpected(Object, Object, IntPtr*)
             at System.Reflection.MethodInvoker.Invoke(Object obj, IntPtr* args, BindingFlags invokeAttr)
      System.Tests.DateTimeTests.TryFormat_MatchesExpected(dateTime: 3274-08-12T22:17:19.8696931, format: "U", provider: , expected: "Monday, 13 August 3274 02:17:19") [FAIL]
        Assert.Equal() Failure
                  ↓ (pos 0)
        Expected: Monday, 13 August 3274 02:17:19
        Actual:   Sunday, 12 August 3274 22:17:19
                  ↑ (pos 0)
        Stack Trace:
          /home/tester/runtime/src/libraries/System.Runtime/tests/System/DateTimeTests.cs(2924,0): at System.Tests.DateTimeTests.TryFormat_MatchesExpected(DateTime dateTime, String format, IFormatProvider provider, String expected)
             at InvokeStub_DateTimeTests.TryFormat_MatchesExpected(Object, Object, IntPtr*)
             at System.Reflection.MethodInvoker.Invoke(Object obj, IntPtr* args, BindingFlags invokeAttr)
        Assert.Equal() Failure

And also FormatExceptions, like:

        ---- System.FormatException : String '' was not recognized as a valid DateTime.
        Stack Trace:
          /home/tester/runtime/src/libraries/System.Runtime/tests/System/DateTimeOffsetTests.cs(984,0): at System.Tests.DateTimeOffsetTests.ParseExact_ToStringThenParseExactRoundtrip_Success(String standardFormat)
      System.Tests.DateTimeOffsetTests.ParseExact_ToStringThenParseExactRoundtrip_Success(standardFormat: "T") [FAIL]
             at InvokeStub_DateTimeOffsetTests.ParseExact_ToStringThenParseExactRoundtrip_Success(Object, Object, IntPtr*)
             at System.Reflection.MethodInvoker.Invoke(Object obj, IntPtr* args, BindingFlags invokeAttr)
          ----- Inner Stack Trace -----
          /home/tester/runtime/src/libraries/System.Private.CoreLib/src/System/Globalization/DateTimeParse.cs(126,0): at System.DateTimeParse.ParseExactMultiple(ReadOnlySpan`1 s, String[] formats, DateTimeFormatInfo dtfi, DateTimeStyles style, TimeSpan& offset)
          /home/tester/runtime/src/libraries/System.Private.CoreLib/src/System/DateTimeOffset.cs(776,0): at System.DateTimeOffset.ParseExact(String input, String[] formats, IFormatProvider formatProvider, DateTimeStyles styles)
          /home/tester/runtime/src/libraries/System.Runtime/tests/System/DateTimeOffsetTests.cs(972,0): at System.Tests.DateTimeOffsetTests.ParseExact_ToStringThenParseExactRoundtrip_Success(String standardFormat)

@stephentoub
Member Author

> @stephentoub when I run this against Fedora 38, I'm seeing two types of failures

@tmds, yes, that's #84963 (comment)

@stephentoub
Member Author

> And also FormatExceptions, like

That's unexpected. You see those with this PR but not with main? Can you debug what the parser is choking on? This PR really doesn't change the parser, and CI doesn't show such failures.

@tmds
Member

tmds commented Apr 18, 2023

> You see those with this PR but not with main? Can you debug what the parser is choking on? This PR really doesn't change the parser, and CI doesn't show such failures.

I see those on main also, but only on our CI server. I haven't been able to reproduce them on my machine.
I will look into it further.

@stephentoub
Member Author

> I see those on main also

Ok, thanks, so not related to this PR.

@tarekgh
Member

tarekgh commented Apr 19, 2023

> Some failures in the new test cases due to timezone disparity... will fix tomorrow...

This looks correct. I see most of the failures are with the "U" format which will depend on the time zone.
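
The time zone dependence follows from how "U" is documented: the value is converted to UTC first and then formatted with the full date/time ("F") pattern, so an expected string computed on a machine in one zone will not match on a machine in another. The sketch below shows that relationship; `UniversalFullFormat` and its two methods are hypothetical names used only for illustration.

```csharp
using System;
using System.Globalization;

public static class UniversalFullFormat
{
    // Formats with "U": the runtime converts the value to UTC before formatting.
    public static string FormatU(DateTime value) =>
        value.ToString("U", CultureInfo.InvariantCulture);

    // The same result spelled out: convert explicitly, then use the "F" pattern.
    public static string FormatFAfterUtcConversion(DateTime value) =>
        value.ToUniversalTime().ToString("F", CultureInfo.InvariantCulture);
}
```

For a `DateTime` whose `Kind` is `Local` or `Unspecified`, both methods depend on the machine's local time zone, which is consistent with the 13:06:42 versus 18:06:42 mismatch in the failures reported above. Only a value with `DateTimeKind.Utc` formats identically everywhere.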

Member

@tarekgh tarekgh left a comment


Modulo fixing the tests, LGTM.

Successfully merging this pull request may close these issues.

Some DateTimeTest format tests fail on the upcoming Fedora 39