[API Proposal]: Provide optimized read-only collections #67209
Tagging subscribers to this area: @dotnet/area-system-collections
How does it compare to the builders of the immutable collections?
Immutable collections are a bit different, since they have been designed with persistent updates in mind. As such, lookup in, say, ImmutableDictionary<TKey, TValue> is O(log n) rather than the O(1) you get from a hash-based dictionary. cc @krwq
I find it very common to have read-only dictionaries. This is more than 95% of my usage (I'm sure it varies by codebase). It could be worth creating specialized collections for that: build once, use many times. This could provide additional performance due to a number of factors.
Another related change would be to use a simpler mapping function from hash code to bucket. A while ago it was the slow modulo. This was replaced with a faster solution. It could be replaced with power-of-two table sizes and a bitwise AND operation. While this was not deemed appropriate in the "mainstream" collections, it could be now.

Frankly, I'm surprised that the framework internally uses the publicly-exposed general-purpose collections so much. I'd kind of expect that specialized, possibly unsafe collections could pay off in aggregate. But maybe it is smart to dogfood on the public collections.

I've personally written a "fast" dictionary. I copied the framework code and then optimized it, achieving major gains which paid off in my application under specialized circumstances. This was many years ago, so the framework now contains many of these optimizations and then some. But the basic idea remains useful.
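The bucket-mapping idea can be sketched as follows (hypothetical helper names, not framework code). With a power-of-two table size, reducing a hash code to a bucket index becomes a single AND against an all-ones mask:

```csharp
// Sketch of hash-code-to-bucket mapping strategies (illustrative only).
// With a power-of-two table size, (size - 1) is an all-ones bit mask,
// so the modulo can be replaced with a single bitwise AND.
static int GetBucketPowerOfTwo(int hashCode, int powerOfTwoSize) =>
    hashCode & (powerOfTwoSize - 1);

// The general-purpose approach: modulo against an arbitrary table size.
// Casting to uint first avoids negative results for negative hash codes.
static int GetBucketModulo(int hashCode, int size) =>
    (int)((uint)hashCode % (uint)size);
```

The AND version is cheaper per lookup, but power-of-two sizes make the table more sensitive to poorly distributed hashes, which is one reason the mainstream collections have historically preferred prime-sized tables.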
@GSPP AFAIK there is another negative consequence from the standard
@GSPP Please check out the implementation I linked to, which takes advantage of what you suggest. A benefit I wasn't expecting when I started fiddling with this code is the optimizations that become possible when you know the kind of key you're dealing with, as opposed to relying on fully generic comparers.

In particular, when dealing with string keys, my code goes through some effort to reduce the overhead of computing the hash code. Specifically, since I know all the keys up front, I look through the keys to find the shortest possible slice of text across the strings which produces distinct substrings. I do this using "left-aligned" and "right-aligned" comparisons, which helps with strings that have similar prefixes or similar suffixes. Running my tests, I found that in a great many cases I can cut the number of characters to hash down to 2 characters per string, rather than the normal mode of hashing the entirety of the strings. This delivers a substantial perf boost at runtime.
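A simplified sketch of the left-aligned variant of this idea (not the actual FrozenCollections code; the helper name is invented): scan for the shortest slice, described by a start index and length, whose substrings are distinct across all known keys, so that later hashing only needs to touch that slice.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class SliceFinder
{
    // Returns the shortest left-aligned (index, count) slice whose
    // substrings are unique across all keys, or null if none exists
    // (e.g. duplicate keys), in which case full-string hashing is needed.
    public static (int Index, int Count)? FindUniqueSlice(IReadOnlyList<string> keys)
    {
        int maxLen = keys.Min(k => k.Length);
        for (int count = 1; count <= maxLen; count++)
        {
            for (int index = 0; index + count <= maxLen; index++)
            {
                var seen = new HashSet<string>(StringComparer.Ordinal);
                if (keys.All(k => seen.Add(k.Substring(index, count))))
                    return (index, count);
            }
        }
        return null;
    }
}
```

For keys like "January", "February", "March", a single character at index 0 already distinguishes them, so only one character per key would need to be hashed at lookup time.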
@geeknoid your idea of optimizing the hashcode calculation for strings would be more useful IMHO if it was offered as a custom IEqualityComparer<string>.
Currently equality (and other) comparers are static and maintain no state. You can't implement this sort of optimization that way; at minimum you need to store the minimum length to hash/compare, if not a custom hash collection. Or potentially oddball things like not even using hashing for small collections. Which means:

- The comparer would (potentially) have to invalidate its cached state on every add... but there's no method to call to do that. This is essentially a running collection-modification problem.
- It's relatively expensive to calculate the optimization, and it's really only useful when you know the collection isn't going to be modified for the duration of its use.
@theodorzoulias As @Clockwork-Muse points out, what enables me to perform these kinds of comparer-level optimizations stems from the fact that the collections are frozen, and as such we're willing to burn cycles up front doing all this extra processing in order to make later reads faster. The pattern doesn't really work for mutable collections (at least given how the mutable collections are currently implemented and the extension points they expose). You could certainly create some wrappers for the mutable collections which track all adds/removes and update some state kept in a stateful comparer, and that might give you some benefit, but that'd be a completely unrelated implementation to what I have here.
@geeknoid I was thinking about something like this:

```csharp
class FastHashStringComparer : IEqualityComparer<string>
{
    // Would study the trainingValues (e.g. find a slice that distinguishes
    // them) and remember the result for use when hashing.
    public FastHashStringComparer(IEnumerable<string> trainingValues) { }

    public bool Equals(string? x, string? y) => string.Equals(x, y);
    public int GetHashCode(string obj) => obj.GetHashCode(); // placeholder
}
```

The implementation of this comparer would study the provided trainingValues. I could then use this comparer to create (for example) a lookup for my list of items, like this:

```csharp
List<Item> items = GetItems();
FastHashStringComparer comparer = new(items.Select(x => x.Category));
ILookup<string, Item> lookup = items.ToLookup(x => x.Category, comparer);
```

Querying this lookup should be faster compared to querying a lookup backed by the default string comparer. Or I could use it with a Dictionary<string, TValue>.
@theodorzoulias I see, that's a good idea. What you'd need I think is a factory method:

```csharp
public static IEqualityComparer<string> FastComparer(IEnumerable<string> trainingValues);
```

With a bit of refactoring, it should be possible to build this on top of the comparers I've already got. There is a bit of incest between my comparers and collections at this point that would have to be removed so the comparers could stand alone.
@geeknoid in retrospect, if someone used the
Taking this idea in a slightly different direction, it could also make sense to use perfect hashing for read-only hash tables, meaning you select a hash function based on the known keys (or their minimal prefixes/suffixes) to ensure there are no collisions. That should improve the performance of finding the correct bucket even further.
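A toy sketch of that direction (invented helper, not a production perfect-hashing algorithm): since all keys are known up front, one can search for a hash seed under which they map to distinct buckets, eliminating collision handling from the lookup path.

```csharp
using System.Collections.Generic;

static class PerfectHashSearch
{
    // Tries seeds for a simple FNV-1a-style hash until every key falls
    // into a distinct bucket of a table of the given size; returns the
    // seed, or null if none was found within the attempt budget.
    public static int? FindPerfectSeed(string[] keys, int tableSize, int maxAttempts = 100_000)
    {
        for (int seed = 1; seed <= maxAttempts; seed++)
        {
            var used = new HashSet<int>();
            bool perfect = true;
            foreach (string key in keys)
            {
                uint h = (uint)seed;
                foreach (char c in key)
                    h = (h ^ c) * 16777619u; // FNV-1a prime mixing step
                if (!used.Add((int)(h % (uint)tableSize))) { perfect = false; break; }
            }
            if (perfect) return seed;
        }
        return null;
    }
}
```

The catch is that the hash itself must stay cheap: if a collision-free function costs more per call than the occasional collision it avoids, the net effect is a loss.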
@svick My original intent with my little project was in fact to explore perfect hashing, and I think it still has some potential. However, the dumb brute-force model I used ends up leading to a trivial number of collisions for the 99% case, so I stopped there. The dominant cost in the frozen dictionary implementation is in fact the hash and equality functions. So if a perfect hashing function requires doing more work, it can easily offset the savings of never having hash collisions. Still, it would be worth exploring further.
I've updated the top post with my recommendation for the APIs we add here, and marked this as ready for review.
How does the prototype compare to the regular dictionary/set implementation wrt lookup performance? |
```csharp
public abstract Enumerator GetEnumerator();

public struct Enumerator : IEnumerator<KeyValuePair<TKey, TValue>>
{
    public readonly KeyValuePair<TKey, TValue> Current { get; }
    public bool MoveNext();
}
```

How does the derived type implement the enumerator in this case? (sorry if this was already discussed) Edit: Nvm, I see.
The Enumerator has an internal constructor that takes a TKey[] and a TValue[]. As noted in the original post, the abstractness in these types is very much not about external extensibility. It's about the factory we implement being able to do work to figure out the optimal concrete type to use.
Favorably. How favorably depends on a) the exact inputs and b) how extensive we want to be (now and over time) with the various customized implementations that can be used behind the scenes. But, for example, with my current implementation (which will still benefit from further tuning and optimization), and using the
If I change the benchmark to use the original references so that reference comparison does work, then:
I fully expect this can be improved further; the abstract nature here makes it straightforward for us to find common patterns we care about and improve them. With another benchmark that's creating a set of 1024 random ints and then looking up all of them in a HashSet and a FrozenSet, I get:
```csharp
namespace System.Collections.Frozen
{
    public static class FrozenDictionary
    {
        public static FrozenDictionary<TKey, TValue> ToFrozenDictionary<TKey, TValue>(
            this IEnumerable<KeyValuePair<TKey, TValue>> source, IEqualityComparer<TKey>? comparer = null)
            where TKey : notnull;
        public static FrozenDictionary<TKey, TSource> ToFrozenDictionary<TSource, TKey>(
            this IEnumerable<TSource> source, Func<TSource, TKey> keySelector, IEqualityComparer<TKey>? comparer = null)
            where TKey : notnull;
        public static FrozenDictionary<TKey, TElement> ToFrozenDictionary<TSource, TKey, TElement>(
            this IEnumerable<TSource> source, Func<TSource, TKey> keySelector, Func<TSource, TElement> elementSelector, IEqualityComparer<TKey>? comparer = null)
            where TKey : notnull;
    }

    public static class FrozenSet
    {
        public static FrozenSet<T> ToFrozenSet<T>(
            this IEnumerable<T> source, IEqualityComparer<T>? comparer = null);
    }

    public abstract class FrozenDictionary<TKey, TValue> :
        IDictionary<TKey, TValue>, IReadOnlyDictionary<TKey, TValue>, IDictionary
        where TKey : notnull
    {
        internal FrozenDictionary(); // not designed for external extensibility

        public static FrozenDictionary<TKey, TValue> Empty { get; }
        public IEqualityComparer<TKey> Comparer { get; }
        public int Count { get; }
        public ImmutableArray<TKey> Keys { get; }
        public ImmutableArray<TValue> Values { get; }
        public bool ContainsKey(TKey key);
        public bool TryGetValue(TKey key, [MaybeNullWhen(false)] out TValue value);
        public ref readonly TValue this[TKey key] { get; }
        public ref readonly TValue GetValueRefOrNullRef(TKey key);
        public void CopyTo(KeyValuePair<TKey, TValue>[] destination, int destinationIndex);
        public void CopyTo(Span<KeyValuePair<TKey, TValue>> destination);
        public Enumerator GetEnumerator();

        public struct Enumerator : IEnumerator<KeyValuePair<TKey, TValue>>
        {
            public readonly KeyValuePair<TKey, TValue> Current { get; }
            public bool MoveNext();
        }
    }

    public abstract class FrozenSet<T> : ISet<T>, ICollection,
#if NET5_0_OR_GREATER
        IReadOnlySet<T>
#else
        IReadOnlyCollection<T>
#endif
    {
        internal FrozenSet(); // not designed for external extensibility

        public static FrozenSet<T> Empty { get; }
        public IEqualityComparer<T> Comparer { get; }
        public int Count { get; }
        public ImmutableArray<T> Items { get; }
        public bool Contains(T item);
        public bool TryGetValue(T equalValue, out T actualValue);
        public void CopyTo(Span<T> destination);
        public void CopyTo(T[] destination, int destinationIndex);
        public bool IsProperSubsetOf(IEnumerable<T> other);
        public bool IsProperSupersetOf(IEnumerable<T> other);
        public bool IsSubsetOf(IEnumerable<T> other);
        public bool IsSupersetOf(IEnumerable<T> other);
        public bool Overlaps(IEnumerable<T> other);
        public bool SetEquals(IEnumerable<T> other);
        public Enumerator GetEnumerator();

        public struct Enumerator : IEnumerator<T>
        {
            public readonly T Current { get; }
            public bool MoveNext();
        }
    }
}
```
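Based on the surface area above, consumption would look something like this (a sketch against the proposed API; names mirror the proposal and may differ from what ultimately ships):

```csharp
using System.Collections.Frozen;
using System.Collections.Generic;

// Build once at startup, then perform read-only lookups for the
// lifetime of the process.
var statusNames = new Dictionary<int, string>
{
    [200] = "OK",
    [404] = "Not Found",
    [500] = "Internal Server Error",
};

FrozenDictionary<int, string> frozen = statusNames.ToFrozenDictionary();

bool found = frozen.TryGetValue(404, out string? name);

// For large value types, the byref accessor avoids copying the value:
ref readonly string ok = ref frozen.GetValueRefOrNullRef(200);
```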
How does the implementor customize the behavior? There is not a virtual method to override.
This is not intended for external extensibility. The only reason the type is abstract is so that our factory methods can provide different concrete implementations based on the supplied input/comparers. |
```csharp
public abstract partial class FrozenDictionary<TKey, TValue>
{
    public ImmutableArray<TKey> Keys { get; }
    public ImmutableArray<TValue> Values { get; }
}

public abstract partial class FrozenSet<T>
{
    public ImmutableArray<T> Items { get; }
}
```

Isn't the return type ImmutableArray<T> too much of an implementation detail to bake into the API?
It's not an abstraction. It's basically saying that a dense, fastest possible, ref-indexable, span-able, etc. mechanism for enumerating each of the keys and values is so important that it's part of the API. If this could return T[] and have it be read-only, it would.
It's a set/dictionary, which usually is not designed for indexable access. I'm not sure whether a set/dictionary with the capacity for fast indexable access, or a "pure" set/dictionary with a possibly faster and more flexible implementation of its other methods, is more desirable.

Storing the keys/values/items in a dedicated array could slow down lookups (see, for example, lines 21 to 42 of the implementation at commit 7a025a6): the eventual access to `_values[index]` could be another cache miss after the possible cache miss of accessing the hash table and `_keys[index]`. A related suggestion was also made in #77799 (comment).

For these specialized implementations, a dedicated array storing the keys/values/items is not really useful. Besides, if the user really needs fast indexable access, they can easily create a dedicated array on their own, although at the cost of some extra memory.
There's no requirement that the type only store the keys/values once. If it proved more valuable in the future in a given implementation to store the values colocated with the hashed data, that could be done in addition to maintaining a separate array (which could be lazily-created/stored only if these properties were accessed, and if they weren't needed by a scenario, they wouldn't be materialized). @geeknoid, do you want to comment on your needs for having key enumeration and value enumeration and indexing be as fast as possible?
This could create performance traps if the implementations change over time. If, e.g., 80% of potential users prefer fast lookup to fast indexing while 20% prefer fast indexing to fast lookup, then we probably should not use the explicit return type ImmutableArray<T>.
@tfenise I think the criteria are different here. These collections are intended for long-lived state used repeatedly in the life of a service. The general model is that you spend the cycles and the memory up front in order to get the best read-only perf you can. It's understood that it will take longer to create these collections than normal ones, and it will often take more memory too. That's OK; the tradeoff is that reads are faster. Given this model, every attempt is being made to favor predictable high-performance access to the state. The point of these collections is speed, after all. I've seen many scenarios in our systems where dictionaries and sets are iterated in some way, and the ImmutableArray is the most efficient way to do this.

The pattern of spending more time during startup to save runtime is pretty fundamental to how you want to write high-scale services. You want to do any kind of setup before you receive any traffic, such that you don't experience blips and lower SLAs for the first batch of requests hitting your service. This is why lazily allocated objects in DI are a problem, for example; we ideally want all the things to be initialized and steady before accepting the first request.

If the implementations of the collections change over time, then I think it means that an explicit array of keys and values should be allocated up front during construction in order to satisfy the needs of this API contract, rather than lazily allocating them. Lazy allocation would mean there's an extra conditional in the path, and it means there's a perf blip on first access.
There are applications where fast startup is important like PowerShell (it initializes some large dictionaries at startup). |
The slower creation may only slow down the startup, but the memory cost persists throughout the life of the service and might be a bigger concern. Memory cost might also slow down the application due to CPU cache concerns. Besides, these collections may also be consumed by client applications, where startup performance is still important, like iSazonov has mentioned with PowerShell. In addition, if one is willing to sacrifice memory for faster enumeration, it's always possible and easy to create a dedicated array for the items, but a frozen dictionary/set that consumes less memory or has better lookup performance but provides possibly slower enumeration is more difficult to invent.
If the goal is to provide fast enumeration, not fast indexable read or conversion to an ImmutableArray<T>, a plain enumerator would suffice. A typical dictionary/set may not provide fast indexable read. If fast indexable read is really needed, it's still solvable with a custom indexer. What restricts the implementation the most is the ability to convert to an ImmutableArray<T>.
I've mentioned somewhere that the idea of serializing the state of the frozen collections would be ideal. You could have a source generator that emits serialized state which is loaded into a frozen collection in no time at startup.
We can use SG already today to build custom code. By the way, is there any interest/need in implementing SG to do things like gperf, but only for the C# world?
EDITED on 10/30/2022 by @stephentoub
Here's what I recommend we do:
A few notes for discussion:

- I've proposed putting these types in System.Collections.Immutable.dll. These types are immutable, differing from the existing immutable collections primarily in that the existing ones are primarily "persistent" and are designed with mutable-like members that return a new instance with the mutation rather than changing the original. Including these in S.C.Immutable.dll has the added bonus that we ship that assembly downlevel, and we'd like for these collections to be widely available.
- I've used the System.Collections.Immutable namespace, but we could choose something else, like System.Collections.Frozen. Note that the public surface area of these types depends on ImmutableArray<T>.
- As with ReadOnlySpan<T>, the indexer here returns a ref readonly T rather than just a T. There aren't safety concerns here as the underlying storage is immutable. For advanced usage, this also exposes a GetValueRefOrNullRef that mirrors the same method for Dictionary<,> we expose from CollectionsMarshal (but there, we put it on CollectionsMarshal because a growing dictionary can make the ref erroneous, and that's not a problem here).
- Unlike Dictionary<,> and HashSet<>, these types are not meant to be extended arbitrarily. However, because they optimize for the exact set of inputs frozen into the collection, it's important to be able to customize the implementation based on the inputs. Thus, the collections are abstract, with a ToFrozenDictionary/Set factory that produces the right internally-derived implementation. Developers consuming the types only concern themselves with the FrozenDictionary<,>/FrozenSet<> types and the associated extension methods, and the abstractness is just an implementation detail that's required to be public.
- I've put the ToFrozenDictionary/Set extension methods on a new FrozenCollectionExtensions type. We could instead introduce two new non-generic FrozenDictionary/Set types, with one extension method each (which is closer to how the existing immutable collections work), or we could put these onto some existing type in the same namespace.
- These types do not implement IImmutableDictionary/Set. While it'd be possible, with the "mutating" methods constructing ImmutableDictionary/Set, it doesn't seem valuable, and also seems potentially misleading.
- Dictionary<,> implements IDictionary and SortedSet<> implements ICollection, so I chose to also implement those on FrozenDictionary<,> and FrozenSet<>, respectively.

Background and motivation
There are many scenarios where collections are initialized during startup, or during major state changes in the application, and are not mutated afterwards. Taking advantage of this pattern can lead to substantial improvements in overall access performance of the collections.
Additionally, the overwhelming majority of dictionaries and sets use strings and int as key types. Providing collections that are optimized for these types can also contribute to substantial performance gains relative to the canonical Dictionary<K,V> and HashSet types.
API Proposal
Please see https://github.com/geeknoid/FrozenCollections for an API and implementation which delivers substantial performance gains relative to the built-in .NET collections for long-lived read-only collections. The implementation spends extra time during the construction phase of these collections in order to increase lookup performance. The README file shows current performance gains using .NET 6.
The Freezer class provides a number of extension methods which make it a snap to turn an existing dictionary or set into a frozen dictionary or set.
In addition to the freezing functionality, the frozen dictionary API also provides the GetByRef method which makes it possible to get a byref to a value in a frozen dictionary, making dictionaries holding large value types efficient.
API Usage

```csharp
var d = new Dictionary<string, int>(...);
var frozen = d.Freeze();
```
Alternative Designs
No response
Risks
No response