Improve perf and scalability of Regex's cache #542

Merged: 1 commit merged into dotnet:master from the alternateregexcaching branch on Dec 6, 2019

Conversation

@stephentoub (Member)

`Regex` maintains a cache used for the static methods on `Regex`, e.g. `Regex.IsMatch`. The cache is implemented as an LRU cache, which maintains a linked list and a dictionary of the cached instances. The linked list maintains the order in which the cached instances were last accessed, making it cheap to expunge older items from the cache. However, that comes at a significant cost: unless the item is the very first one in the linked list, all reads on the cache require taking a global lock, because the linked list needs to be mutated to move the found node to the beginning. That lock has both throughput and scalability implications.
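For context, here is a minimal sketch of that classic LRU read path (illustrative only, not the actual System.Text.RegularExpressions source; the key type is a stand-in for the real composite key): even a cache hit has to mutate the list, so every read contends on the lock.

```csharp
using System.Collections.Generic;
using System.Text.RegularExpressions;

internal static class LruCacheSketch
{
    private static readonly object s_lock = new object();
    private static readonly Dictionary<string, LinkedListNode<Regex>> s_map =
        new Dictionary<string, LinkedListNode<Regex>>();
    private static readonly LinkedList<Regex> s_list = new LinkedList<Regex>();

    public static Regex? Lookup(string key)
    {
        // Even a pure read must take the global lock, because a hit has to
        // move the node to the head of the list to record "most recently used".
        lock (s_lock)
        {
            if (s_map.TryGetValue(key, out LinkedListNode<Regex>? node))
            {
                s_list.Remove(node);
                s_list.AddFirst(node);
                return node.Value;
            }
            return null;
        }
    }
}
```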

This PR changes the cache from using a `Dictionary<>` and a linked list to instead using a `ConcurrentDictionary<>` and a `List<>`. Rather than making all accesses more expensive in order to make drops less expensive, it makes all reads much cheaper and more scalable, at the expense of making drops more expensive. Since dropping from the cache means we're already paying the expensive cost of creating/parsing/compiling/etc. a new Regex instance, this is a better trade-off, especially since any frequent dropping suggests the consuming app or library needs to revisit its Regex strategy, either using `Regex.CacheSize` to increase the cache size appropriately, or doing its own caching (e.g. creating the Regex instance it needs and storing it into a field for all future use).
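As a usage note (not part of this change), those two mitigations look like this; the class name and pattern are made up for illustration, but `Regex.CacheSize` (default 15) and reusing a constructed `Regex` instance are both real, documented options:

```csharp
using System.Text.RegularExpressions;

internal static class RegexConsumers
{
    // Option 1: if an app legitimately pushes many distinct patterns through
    // the static APIs (Regex.IsMatch, Regex.Match, ...), raise the cache size.
    static RegexConsumers() => Regex.CacheSize = 64;

    // Option 2 (usually preferable): construct the Regex once, store it in a
    // field, and reuse the instance, bypassing the static-method cache entirely.
    private static readonly Regex s_identifier =
        new Regex(@"^[A-Za-z_][A-Za-z0-9_]*$", RegexOptions.Compiled);

    public static bool IsIdentifier(string value) => s_identifier.IsMatch(value);
}
```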

The new scheme uses a `ConcurrentDictionary<Key,Node>`, a `List<Node>`, and a fast-path field storing the most recently used Regex instance (just as the existing implementation did). On lookups, if the fast-path field has the matching value, it's just returned. Otherwise, the dictionary is consulted, and if the item is found, the fast-path field is updated. No locking at all is employed, and only a few volatile read/writes are used to update a "last access stamp" that's used to indicate importance if/when items do need to be expunged. On additions, we do still take a global lock and add to the cache. If this puts us over our cache size, we pick an item from the list and remove it. If the list is small, we just examine all of the items looking for the oldest. If the list is larger, we examine a random subset of it; we may not get rid of the absolute oldest item, but it'll be old enough.
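A simplified sketch of that read path and eviction policy follows. Field names, the key type (a string standing in for pattern + options + culture), the sampling threshold, and the use of `Interlocked` for the stamp are illustrative; the actual implementation is in the PR diff.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Text.RegularExpressions;
using System.Threading;

internal static class RegexCacheSketch
{
    private sealed class Node
    {
        public Node(string key, Regex regex) { Key = key; Regex = regex; }
        public readonly string Key;
        public readonly Regex Regex;
        public volatile int LastAccessStamp;   // recency hint for eviction
    }

    private static volatile Node? s_lastAccessed;                      // fast path
    private static readonly ConcurrentDictionary<string, Node> s_cache =
        new ConcurrentDictionary<string, Node>();
    private static readonly List<Node> s_cacheList = new List<Node>(); // guarded by s_lock
    private static readonly object s_lock = new object();
    private static int s_stamp;

    public static Regex? Lookup(string key)
    {
        Node? last = s_lastAccessed;
        if (last != null && last.Key == key)
        {
            return last.Regex;                 // hit on the most recent entry: no lock
        }

        if (s_cache.TryGetValue(key, out Node? node))
        {
            node.LastAccessStamp = Interlocked.Increment(ref s_stamp);
            s_lastAccessed = node;             // still no lock on the read path
            return node.Regex;
        }

        return null;                           // caller builds the Regex and calls Add
    }

    public static void Add(string key, Regex regex)
    {
        lock (s_lock)                          // additions are rare and already expensive
        {
            if (!s_cache.ContainsKey(key))
            {
                var node = new Node(key, regex);
                node.LastAccessStamp = Interlocked.Increment(ref s_stamp);
                s_cache[key] = node;
                s_cacheList.Add(node);
                s_lastAccessed = node;

                if (s_cacheList.Count > Regex.CacheSize)
                {
                    Evict();
                }
            }
        }
    }

    private static void Evict()
    {
        // Small list: scan everything for the oldest entry. Larger list: scan a
        // random sample; the victim may not be the absolute oldest, just old enough.
        int count = s_cacheList.Count;
        int samples = Math.Min(count, 30);
        Random? random = samples == count ? null : new Random();

        int victim = 0;
        int oldest = int.MaxValue;
        for (int i = 0; i < samples; i++)
        {
            int index = random?.Next(count) ?? i;
            if (s_cacheList[index].LastAccessStamp < oldest)
            {
                oldest = s_cacheList[index].LastAccessStamp;
                victim = index;
            }
        }

        s_cache.TryRemove(s_cacheList[victim].Key, out _);
        s_cacheList.RemoveAt(victim);
    }
}
```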

cc: @danmosemsft, @eerhardt, @ViktorHofer

Results from running master (old) vs this PR (new) on the RegexCache* tests from the dotnet/performance repo:

| Method | Toolchain | total | unique | cacheSize | Mean | Ratio | Gen 0 |
|---|---|---:|---:|---:|---:|---:|---:|
| IsMatch | new | 40000 | 7 | 0 | 55.792 ms | 0.96 | 24750.0000 |
| IsMatch | old | 40000 | 7 | 0 | 57.454 ms | 1.00 | 25000.0000 |
| IsMatch_Multithreading | new | 40000 | 7 | 0 | 31.867 ms | 0.98 | 25000.0000 |
| IsMatch_Multithreading | old | 40000 | 7 | 0 | 33.059 ms | 1.00 | 25000.0000 |
| IsMatch | new | 40000 | 1600 | 15 | 112.867 ms | 0.64 | 38000.0000 |
| IsMatch | old | 40000 | 1600 | 15 | 176.183 ms | 1.00 | 39000.0000 |
| IsMatch_Multithreading | new | 40000 | 1600 | 15 | 54.960 ms | 0.50 | 38000.0000 |
| IsMatch_Multithreading | old | 40000 | 1600 | 15 | 109.076 ms | 1.00 | 39000.0000 |
| IsMatch | new | 40000 | 1600 | 800 | 87.606 ms | 0.51 | 20000.0000 |
| IsMatch | old | 40000 | 1600 | 800 | 174.088 ms | 1.00 | 22000.0000 |
| IsMatch_Multithreading | new | 40000 | 1600 | 800 | 50.726 ms | 0.41 | 20000.0000 |
| IsMatch_Multithreading | old | 40000 | 1600 | 800 | 123.640 ms | 1.00 | 22000.0000 |
| IsMatch | new | 40000 | 1600 | 3200 | 13.444 ms | 0.94 | - |
| IsMatch | old | 40000 | 1600 | 3200 | 14.247 ms | 1.00 | - |
| IsMatch_Multithreading | new | 40000 | 1600 | 3200 | 5.500 ms | 0.42 | - |
| IsMatch_Multithreading | old | 40000 | 1600 | 3200 | 13.180 ms | 1.00 | - |
| IsMatch | new | 400000 | 1 | 15 | 41.607 ms | 1.00 | - |
| IsMatch | old | 400000 | 1 | 15 | 41.512 ms | 1.00 | - |
| IsMatch_Multithreading | new | 400000 | 1 | 15 | 40.066 ms | 0.90 | 18000.0000 |
| IsMatch_Multithreading | old | 400000 | 1 | 15 | 44.558 ms | 1.00 | 33500.0000 |
| IsMatch | new | 400000 | 7 | 15 | 66.953 ms | 0.93 | - |
| IsMatch | old | 400000 | 7 | 15 | 71.789 ms | 1.00 | - |
| IsMatch_Multithreading | new | 400000 | 7 | 15 | 46.878 ms | 0.52 | 12000.0000 |
| IsMatch_Multithreading | old | 400000 | 7 | 15 | 90.335 ms | 1.00 | 9000.0000 |

@stephentoub (Member, Author)

The CI failures here are strange; lots of EventSource tests failing with, e.g.

    BasicEventSourceTests.TestsWrite.Test_Write_T_ETW [FAIL]
      Assert.Equal() Failure
                � (pos 0)
      Expected: 
      Actual:   System.Collections.Concurrent.ConcurrentCúúú
                � (pos 0)
      Stack Trace:
        /_/src/libraries/System.Diagnostics.Tracing/tests/BasicEventSourceTest/TestUtilities.cs(50,0): at BasicEventSourceTests.TestUtilities.CheckNoEventSourcesRunning(String message)
        /_/src/libraries/System.Diagnostics.Tracing/tests/BasicEventSourceTest/TestsWrite.cs(456,0): at BasicEventSourceTests.TestsWrite.Test_Write_T(Listener listener)
        /_/src/libraries/System.Diagnostics.Tracing/tests/BasicEventSourceTest/TestsWrite.Etw.cs(29,0): at BasicEventSourceTests.TestsWrite.Test_Write_T_ETW()

My assumption is that a) there was some kind of change in coreclr recently that is now causing a discrepancy with the tests, and b) this is now showing up after @safern's live/live change went in last night, but I'm not sure why I don't see similar failures on other PRs, nor why the "Actual" string above looks corrupted ("System.Collections.Concurrent.ConcurrentCúúú"). Regardless, I put up #565 to add this EventSource to the test's exempted list. @noahfalk, ideas?

@safern (Member) commented Dec 5, 2019

> but I'm not sure why I don't see similar failures on other PRs.

Does this repro locally with and without your change?

@stephentoub stephentoub closed this Dec 5, 2019
@stephentoub stephentoub reopened this Dec 5, 2019
@eerhardt (Member) left a comment


This looks good to me. Just some clarifying questions to help me understand.

@stephentoub (Member, Author) commented Dec 6, 2019

> Does this repro locally with and without your change?

No. And CI passed now.

@stephentoub stephentoub merged commit d49fc9e into dotnet:master Dec 6, 2019
@stephentoub stephentoub deleted the alternateregexcaching branch December 6, 2019 14:48
@noahfalk (Member) commented Jan 3, 2020

@stephentoub - sorry for the very late reply; I've been on vacation all December, and GitHub doesn't have a nice out-of-office feature. I could imagine that your usage of `ConcurrentDictionary` caused the ConcurrentCollectionsEventSource to get lazily created in a bunch of tests that previously never initialized it, which in turn caused it to get flagged by the test code that asserts no unexpected EventSources have been created. Adding it to the exclusion list of known BCL EventSources was the right move. As for why the string was showing up corrupted, that I can't explain. I think it's much more likely to be some issue relating to xunit or the console display, given that the string comparison you added at line 39 would only work if eventSource.Name contained the expected string data at that point.

@stephentoub (Member, Author)

Thanks, Noah.

@danmoseley (Member)

Maybe úúú is meant to be an ellipsis with some special period, and the console codepage is corrupting it.

@stephentoub stephentoub mentioned this pull request Jan 7, 2020
@stephentoub stephentoub added the tenet-performance (Performance related issue) label Jan 12, 2020
@stephentoub stephentoub added this to the 5.0 milestone Jan 12, 2020
@ghost ghost locked as resolved and limited conversation to collaborators Dec 11, 2020