Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[API Proposal]: Provide optimized read-only collections #67209

Closed
Tracked by #77391
geeknoid opened this issue Mar 27, 2022 · 32 comments · Fixed by #77799
Closed
Tracked by #77391

[API Proposal]: Provide optimized read-only collections #67209

geeknoid opened this issue Mar 27, 2022 · 32 comments · Fixed by #77799
Assignees
Labels
api-approved API was approved in API review, it can be implemented area-System.Collections
Milestone

Comments

@geeknoid
Copy link
Member

geeknoid commented Mar 27, 2022

EDITED on 10/30/2022 by @stephentoub

Here's what I recommend we do:

namespace System.Collections.Immutable
{
    public static partial class FrozenCollectionExtensions
    {
        public static FrozenDictionary<TKey, TValue> ToFrozenDictionary<TKey, TValue>(
            this IEnumerable<KeyValuePair<TKey, TValue>> source, IEqualityComparer<TKey>? comparer = null)
            where TKey : notnull;

        public static FrozenSet<T> ToFrozenSet<T>(
            this IEnumerable<T> source, IEqualityComparer<T>? comparer = null);
    }

    public abstract class FrozenDictionary<TKey, TValue> :
        IDictionary<TKey, TValue>, IReadOnlyDictionary<TKey, TValue>, IDictionary
        where TKey : notnull
    {
        internal FrozenDictionary(); // not designed for external extensibility

        public static FrozenDictionary<TKey, TValue> Empty { get; }

        public IEqualityComparer<TKey> Comparer { get; }
        public abstract int Count { get; }

        public abstract ImmutableArray<TKey> Keys { get; }
        public abstract ImmutableArray<TValue> Values { get; }

        public bool ContainsKey(TKey key);
        public bool TryGetValue(TKey key, [MaybeNullWhen(false)] out TValue value);
        public ref readonly TValue this[TKey key] { get; }
        public abstract ref readonly TValue GetValueRefOrNullRef(TKey key);

        public void CopyTo(KeyValuePair<TKey, TValue>[] destination, int destinationIndex);
        public void CopyTo(Span<KeyValuePair<TKey, TValue>> destination);

        public abstract Enumerator GetEnumerator();
        public struct Enumerator : IEnumerator<KeyValuePair<TKey, TValue>>
        {
            public readonly KeyValuePair<TKey, TValue> Current { get; }
            public bool MoveNext();
        }
    }

    public abstract class FrozenSet<T> : ISet<T>, ICollection,
#if NET5_0_OR_GREATER
        IReadOnlySet<T>
#else
        IReadOnlyCollection<T>
#endif
    {
        internal FrozenSet(); // not designed for external extensibility

        public static FrozenSet<T> Empty { get; }

        public IEqualityComparer<T> Comparer { get; }
        public abstract int Count { get; }

        public abstract ImmutableArray<T> Items { get; }

        public bool Contains(T item);

        public void CopyTo(Span<T> destination);
        public void CopyTo(T[] destination, int destinationIndex);

        public abstract bool IsProperSubsetOf(IEnumerable<T> other);
        public abstract bool IsProperSupersetOf(IEnumerable<T> other);
        public abstract bool IsSubsetOf(IEnumerable<T> other);
        public abstract bool IsSupersetOf(IEnumerable<T> other);
        public abstract bool Overlaps(IEnumerable<T> other);
        public abstract bool SetEquals(IEnumerable<T> other);

        public abstract Enumerator GetEnumerator();
        public struct Enumerator : IEnumerator<T>
        {
            public readonly T Current { get; }
            public bool MoveNext();
        }
    }
}

A few notes for discussion:

  • Assembly. I've prototyped this to live in System.Collections.Immutable.dll. These types are immutable, differing from the existing immutable collections primarily in that the existing ones are primarily "persistent" and are designed with mutable-like members that return a new instance with the mutation rather than changing the original. Including these in S.C.Immutable.dll has the added bonus that we ship that assembly downlevel, and we'd like for these collections to be widely available.
  • Namespace. I've similarly included the types in the System.Collections.Immutable namespace, but we could choose something else, like System.Collections.Frozen. Note that the public surface area of these types depends on ImmutableArray<T>.
  • Ref-returning Members. One of the design goals here is to enable all means of reading to be efficient; that includes when the values stored in a dictionary are large structs. Thus, as with ReadOnlySpan<T>, the indexer here returns a ref readonly T rather than just a T. There aren't safety concerns here as the underlying storage is immutable. For advanced usage, this also exposes a GetValueRefOrNullRef that mirrors the same method for Dictionary<,> we expose from CollectionsMarshal (but there, we put it on CollectionsMarshal because a growing dictionary can make the ref erroneous, and that's not a problem here).
  • Abstract base types. As with types like Dictionary<,> and HashSet<>, these types are not meant to be extended arbitrarily. However, because they optimize for the exact set of inputs frozen into the collection, it's important to be able to customize the implementation based on the inputs. Thus, the collections are abstract, with a ToFrozenDictionary/Set factory that produces the right internally-derived implementation. Developers consuming the types only concern themselves with the FrozenDictionary<,>/FrozenSet<,> types and the associated extension methods, and the abstractness is just an implementation detail that's required to be public.
  • Extension methods. I've put the two ToFrozenDictionary/Set extension methods on a new FrozenCollectionExtensions type. We could instead introduce two new non-generic FrozenDictionary/Set types, with one extension method each (which is closer to how the existing immutable collections work), or we could put these on to some existing type in the same namespace.
  • Immutable interfaces. I did not make these types implement IImmutableDictionary/Set. While it'd be possible, with the "mutating" methods constructing ImmutableDictionary/Set, it doesn't seem valuable, and also seems potentially misleading.
  • Non-generic interfaces. Dictionary<,> implements IDictionary and SortedSet<> implements ICollection, so I chose to also implement those on FrozenDictionary<,> and FrozenSet<>, respectively.

Background and motivation

There are many scenarios where collections are initialized during startup, or during major state changes in the application, and are not mutated afterwards. Taking advantage of this pattern can lead to substantial improvements in overall access performance of the collections.

Additionally, the overwhelming majority of dictionaries and sets use strings and int as key types. Providing collections that are optimized for these types can also contribute to substantial performance gains relative to the canonical Dictionary<K,V> and HashSet types.

API Proposal

Please see https://github.com/geeknoid/FrozenCollections for an API and implementation which delivers substantial performance gains relative to the built-in .NET collections for long-live read-only collections. The implementation spends extra time during the construction phase of these collections in order to increase lookup performance. The README file shows current performance gains using .NET 6.

The Freezer class provides a number of extension methods which make it a snap to turn an existing dictionary or set into a frozen dictionary or set.

In addition to the freezing functionality, the frozen dictionary API also provides the GetByRef method which makes it possible to get a byref to a value in a frozen dictionary, making dictionaries holding large value types efficient.

API Usage

var d = new Dictionary<string, int>(...);
var frozen = d.Freeze();

Alternative Designs

No response

Risks

No response

@geeknoid geeknoid added the api-suggestion Early API idea and discussion, it is NOT ready for implementation label Mar 27, 2022
@ghost
Copy link

ghost commented Mar 28, 2022

Tagging subscribers to this area: @dotnet/area-system-collections
See info in area-owners.md if you want to be subscribed.

Issue Details

Background and motivation

There are many scenarios where collections are initialized during startup, or during major state changes in the application, and are not mutated afterwards. Taking advantage of this pattern can lead to substantial improvements in overall access performance of the collections.

Additionally, the overwhelming majority of dictionaries and sets use strings and int as key types. Providing collections that are optimized for these types can also contribute to substantial performance gains relative to the canonical Dictionary<K,V> and HashSet types.

API Proposal

Please see https://github.com/geeknoid/FrozenCollections for an API and implementation which delivers substantial performance gains relative to the built-in .NET collections for long-live read-only collections. The implementation spends extra time during the construction phase of these collections in order to increase lookup performance. The README file shows current performance gains using .NET 6.

The Freezer class provides a number of extension methods which make it a snap to turn an existing dictionary or set into a frozen dictionary or set.

In addition to the freezing functionality, the frozen dictionary API also provides the GetByRef method which makes it possible to get a byref to a value in a frozen dictionary, making dictionaries holding large value types efficient.

API Usage

var d = new Dictionary<string, int>(...);
var frozen = d.Freeze();

Alternative Designs

No response

Risks

No response

Author: geeknoid
Assignees: -
Labels:

api-suggestion, area-System.Collections

Milestone: -

@huoyaoyuan
Copy link
Member

How does it compare to builder of immutable collections?

@eiriktsarpalis
Copy link
Member

How does it compare to builder of immutable collections?

Immutable collections are a bit different, since they have been designed with persistent updates in mind. As such lookup in, say, ImmutableDictionary is O(log N), regardless how it was constructed. It would be interesting to see benchmarks comparing System.Collections.Generic with Immutable and optimized read-only collections.

cc @krwq

@eiriktsarpalis eiriktsarpalis added this to the Future milestone Mar 28, 2022
@eiriktsarpalis eiriktsarpalis added api-needs-work API needs work before it is approved, it is NOT ready for implementation wishlist Issue we would like to prioritize, but we can't commit we will get to it yet labels Mar 28, 2022
@GSPP
Copy link

GSPP commented Apr 13, 2022

I find it very common to have read-only dictionaries. This is more than 95% of my usage (I'm sure it varies by codebase). It could be worth creating specialized collections for that. Build once and use many times.

This could provide additional performance due to a number of factors:

  • Free list management can be removed.
  • The version field can be removed.
  • Optimized building from an IEnumerable, list or array. Potentially save on version updates, pre-size better and maybe algorithms are faster for adding if there are no free lists to consider. Maybe a faster bulk creation mechanism can be devised.
  • The collections could be a struct.

Another related change would be to use a simpler mapping function from hash code to bucket. A while ago it was the slow modulo. This was replaced with a faster solution. It could be replaced with power-of-two table sizes and a binary and operation. While this was not deemed appropriate in the "mainstream" collections, it could now be.

Frankly, I'm surprised that the framework internally uses the publicly-exposed general-purpose collections so much. I'd kind of expect that specialized, possibly unsafe collections could pay off in aggregate. But maybe it is smart to dogfood on the public collections.

I've personally written me a "fast" dictionary. I copied the framework code and then optimized it, achieving major gains which payed off in my application under specialized circumstances. This was many years ago, so the framework now contains many of these optimizations and then some. But the basic idea remains useful.

@theodorzoulias
Copy link
Contributor

Frankly, I'm surprised that the framework internally uses the publicly-exposed general-purpose collections so much. I'd kind of expect that specialized, possibly unsafe collections could pay off in aggregate. But maybe it is smart to dogfood on the public collections.

@GSPP AFAIK there is another negative consequence from the standard Dictionary<K,V> being used extensively by the framework internally. It impedes the approval of API proposals for this class, for reasons related with memory bloat. For example for this reason we can't have the extremely handy and efficient Dictionary.GetOrAdd method. Btw the suggested workaround is potentially corruptive, and so it's dangerous to use. Adding a public freezable dictionary class wouldn't solve this problem. The solution would be to avoid dogfooding popular public components, and use internal components instead!

@geeknoid
Copy link
Member Author

@GSPP Please check out the implementation I linked to, which takes advantage of what you suggest.

A benefit I wasn't expecting when starting fiddling with this code is the optimizations that become possible when you know the kind of key you're dealing with, as opposed to relying on fully generic comparers. In particular, when dealing with string keys, my code goes through some effort to reduce the overhead of computing the hash code. Specifically, since I know all the keys up front, I look through the keys to find the shortest possible slice of text across the strings which produces distinct substrings. I do this using "left-aligned" and "right-aligned" comparisons, which helps with strings that have similar prefixes or similar suffixes. Running my tests, I found that in a great many cases, I can cut down the number of characters to hash down to 2 characters per string, rather than the normal mode of hashing the entirety of the strings. This delivers a substantial perf boost at runtime.

@theodorzoulias
Copy link
Contributor

@geeknoid your idea of optimizing the hashcode calculation for strings would be more useful IMHO if it was offered as a custom IEqualityComparer<string> implementation. This would allow it to be used anywhere hashing is required, like in HashSet<T>s, in LINQ GroupBy operations etc. By baking it into the implementation of a specific collection, the opportunities for exploiting this functionality are limited.

@Clockwork-Muse
Copy link
Contributor

custom IEqualityComparer<string> implementation.

Currently equality (and other) comparers are static and maintain no state. You can't implement this sort of optimization that way - at minimum you need to store the minimum length to hash/compare, if not a custom hash collection. Or potentially oddball things like not even using hashing for small collections. Which means:

HashSet<T>

would (potentially) have to invalidate it on every add.... but there's no method to call to do that.

LINQ GroupBy

This is essentially a running collection modification

the opportunities for exploiting this functionality are limited.

It's relatively expensive to calculate the optimization, and it's really only useful when you know the collection isn't going to be modified for the duration of its use....

@geeknoid
Copy link
Member Author

@theodorzoulias As @Clockwork-Muse points out, what enabled me to perform these kinds of comparer-level optimization stems from the fact the collections are frozen and as such we're willing to burn cycles up front doing all this extra processing, in order to make later reads faster. The pattern doesn't really work for mutable collections (at least given how the mutable collections are currently implemented and the extension points they expose). You could certainly create some wrappers for the mutable collections which track all adds/removes and update some state kept in a stateful comparer and that might give you some benefit, but that'd be completely unrelated implementation to what I have here.

@theodorzoulias
Copy link
Contributor

theodorzoulias commented Apr 14, 2022

@geeknoid I was thinking about something like this:

class FastHashStringComparer : IEqualityComparer<string>
{
    public FastHashStringComparer(IEnumerable<string> trainingValues) { }
}

The implementation of this comparer would study the provided trainingValues, and it would decide what slice of the string values provides enough entropy for generating hashcodes of sufficient quality.

I could then use this comparer to create (for example) a lookup for my list of items, like this:

List<Item> items = GetItems();
FastHashStringComparer comparer = new(items.Select(x => x.Category));
ILookup<string, Item> lookup = items.ToLookup(x => x.Category, comparer);

Querying this lookup should be faster compared to querying a lookup backed by the default string comparer.

Or I could use it with a LinkedDictionary<K,V> that I found in a third-party NuGet package, to cover some special need. Lots of options.

@geeknoid
Copy link
Member Author

@theodorzoulias I see, that's a good idea. What you'd need I think is a factory method:

public static IEqualityComparer FastComparer(IEnumerable trainingValues);

With a bit of refactoring, it should be possible to build this on top of the comparers I've already got. There is a bit of incest between my comparers and collections at this point, that would have to be removed so the comparers could stand-alone.

@theodorzoulias
Copy link
Contributor

theodorzoulias commented Apr 15, 2022

@geeknoid in retrospect if someone used the FastComparer with a mutable dictionary, and some of the actual values didn't match the pattern of the trainingValues, the performance could be terrible. And the FastComparer would have no way to do something about it, because it wouldn't know if the GetHashCode is called for a value added or for a value queried. Checking defensively the pattern of every string value passed to the GetHashCode, would defeat the purpose of being fast. So now I am seeing why restricting this functionality to a specific readonly dictionary type makes sense.

@svick
Copy link
Contributor

svick commented Apr 15, 2022

Taking this idea in a slightly different direction, it could also make sense to use perfect hashing for read-only hash tables, meaning you select a hash function based on the known keys (or their minimal prefixes/suffixes) to ensure there are no collisions. That should improve the performance of finding the correct bucket even further.

@geeknoid
Copy link
Member Author

@svick My original intent with my little project was to in fact explore perfect hashing, and I think it still has some potential. However, the dumb brute force model I used ends up leading to a trivial number of collisions for the 99% case, so I stopped there. The dominant cost in the frozen dictionary implementation is in fact the hash and equality functions. So if a perfect hashing function requires doing more work, it can easily offset the savings of never having hash collisions.

Still, it would be worth exploring further.

@stephentoub stephentoub modified the milestones: Future, 8.0.0 Oct 11, 2022
@stephentoub stephentoub self-assigned this Oct 26, 2022
@stephentoub stephentoub added api-ready-for-review API is ready for review, it is NOT ready for implementation and removed api-needs-work API needs work before it is approved, it is NOT ready for implementation wishlist Issue we would like to prioritize, but we can't commit we will get to it yet api-suggestion Early API idea and discussion, it is NOT ready for implementation labels Oct 26, 2022
@stephentoub
Copy link
Member

I've updated the top post with my recommendation for the APIs we add here, and marked this as ready for review.
#67209

@stephentoub stephentoub added the blocking Marks issues that we want to fast track in order to unblock other important work label Oct 31, 2022
@eiriktsarpalis
Copy link
Member

I've prototyped this to live in System.Collections.Immutable.dll

How does the prototype compare to the regular dictionary/set implementation wrt lookup performance?

@MihaZupan
Copy link
Member

MihaZupan commented Nov 1, 2022

public abstract Enumerator GetEnumerator();
public struct Enumerator : IEnumerator<KeyValuePair<TKey, TValue>>
{
    public readonly KeyValuePair<TKey, TValue> Current { get; }
    public bool MoveNext();
}

How does the derived type implement the enumerator in this case? (sorry if this was already discussed)

Edit: Nvm, I see // not designed for external extensibility now.

@stephentoub
Copy link
Member

How does the derived type implement the enumerator in this case?

The Enumerator has an internal constructor that takes a TKey[] and a TValue[].

As noted in the original post, the abstractness in these types is very much not about external extensibility. It's about the factory we implement being able to do work to figure out the optimal concrete type to use.

@stephentoub
Copy link
Member

How does the prototype compare to the regular dictionary/set implementation wrt lookup performance?

Favorably. How favorably depends on a) the exact inputs and b) how extensive we want to be (now and over time) with the various customized implementations that can be used behind the scenes. But, for example, with my current implementation (which will still benefit from further tuning and optimization), and using the switch from dotnet/roslyn#56374 as an example, here's what I get when foreach'ing over a copy of every input in the set (so that raw reference comparison fails) and doing a lookup for it:

Method Mean Error StdDev Ratio
Switch 242.9 ns 4.67 ns 5.56 ns 1.00
Switch_Opt 116.1 ns 2.30 ns 2.15 ns 0.48
HashSetLookup 319.8 ns 6.41 ns 6.29 ns 1.31
FrozenSetLookup 161.0 ns 2.41 ns 2.01 ns 0.66

If I change the benchmark to use the original references so that reference comparison does work, then:

Method Mean Error StdDev Ratio
Switch 245.7 ns 4.91 ns 5.04 ns 1.00
Switch_Opt 117.5 ns 2.27 ns 2.33 ns 0.48
HashSetLookup 252.7 ns 3.61 ns 3.20 ns 1.03
FrozenSetLookup 102.9 ns 1.77 ns 1.48 ns 0.42

I fully expect this can be improved further; the abstract nature here makes it straightforward for us to find common patterns we care about and improve them.

With another benchmark that's creating a set of 1024 random ints and then looking up all of them in a HashSet and a FrozenSet, I get:

Method Mean Error StdDev Ratio
HashSetLookup 7.879 us 0.0530 us 0.0496 us 1.00
FrozenSetLookup 3.478 us 0.0193 us 0.0180 us 0.44

@bartonjs
Copy link
Member

bartonjs commented Nov 1, 2022

  • Based on discussion the abstracts were removed from everything except the type.
  • We split the extension methods into types that were the output type, but non-generic, to facilitate static calls.
  • We decided a new namespace (System.Collections.Frozen) was better than Immutable.
  • We copied in the Enumerable.ToDictionary shapes as overloads of ToFrozenDictionary
  • We added TryGetValue to FrozenSet to match this recent addition to HashSet.
  • There was a question of should FrozenSet's Contains take the parameter as in, and the conclusion was no.
namespace System.Collections.Frozen
{
    public static class FrozenDictionary
    {
        public static FrozenDictionary<TKey, TValue> ToFrozenDictionary<TKey, TValue>(
            this IEnumerable<KeyValuePair<TKey, TValue>> source, IEqualityComparer<TKey>? comparer = null)
            where TKey : notnull;

        public static FrozenDictionary<TKey, TSource> ToFrozenDictionary<TSource, TKey>(
            this IEnumerable<TSource> source, Func<TSource, TKey> keySelector, IEqualityComparer<TKey>? comparer = null)
            where TKey : notnull;

        public static FrozenDictionary<TKey, TElement> ToFrozenDictionary<TSource, TKey, TElement>(
            this IEnumerable<TSource> source, Func<TSource, TKey> keySelector, Func<TSource, TElement> elementSelector, IEqualityComparer<TKey>? comparer = null)
            where TKey : notnull;
    }

    public static class FrozenSet
    {
        public static FrozenSet<T> ToFrozenSet<T>(
            this IEnumerable<T> source, IEqualityComparer<T>? comparer = null);
    }

    public abstract class FrozenDictionary<TKey, TValue> :
        IDictionary<TKey, TValue>, IReadOnlyDictionary<TKey, TValue>, IDictionary
        where TKey : notnull
    {
        internal FrozenDictionary(); // not designed for external extensibility

        public static FrozenDictionary<TKey, TValue> Empty { get; }

        public IEqualityComparer<TKey> Comparer { get; }
        public int Count { get; }

        public ImmutableArray<TKey> Keys { get; }
        public ImmutableArray<TValue> Values { get; }

        public bool ContainsKey(TKey key);
        public bool TryGetValue(TKey key, [MaybeNullWhen(false)] out TValue value);

        public ref readonly TValue this[TKey key] { get; }
        public ref readonly TValue GetValueRefOrNullRef(TKey key);

        public void CopyTo(KeyValuePair<TKey, TValue>[] destination, int destinationIndex);
        public void CopyTo(Span<KeyValuePair<TKey, TValue>> destination);

        public Enumerator GetEnumerator();
        public struct Enumerator : IEnumerator<KeyValuePair<TKey, TValue>>
        {
            public readonly KeyValuePair<TKey, TValue> Current { get; }
            public bool MoveNext();
        }
    }

    public abstract class FrozenSet<T> : ISet<T>, ICollection,
#if NET5_0_OR_GREATER
        IReadOnlySet<T>
#else
        IReadOnlyCollection<T>
#endif
    {
        internal FrozenSet(); // not designed for external extensibility

        public static FrozenSet<T> Empty { get; }

        public IEqualityComparer<T> Comparer { get; }
        public int Count { get; }

        public ImmutableArray<T> Items { get; }

        public bool Contains(T item);
        public bool TryGetValue(T equalValue, out T actualValue);

        public void CopyTo(Span<T> destination);
        public void CopyTo(T[] destination, int destinationIndex);

        public bool IsProperSubsetOf(IEnumerable<T> other);
        public bool IsProperSupersetOf(IEnumerable<T> other);
        public bool IsSubsetOf(IEnumerable<T> other);
        public bool IsSupersetOf(IEnumerable<T> other);
        public bool Overlaps(IEnumerable<T> other);
        public bool SetEquals(IEnumerable<T> other);

        public Enumerator GetEnumerator();
        public struct Enumerator : IEnumerator<T>
        {
            public readonly T Current { get; }
            public bool MoveNext();
        }
    }
}

@bartonjs bartonjs added api-approved API was approved in API review, it can be implemented and removed api-ready-for-review API is ready for review, it is NOT ready for implementation blocking Marks issues that we want to fast track in order to unblock other important work labels Nov 1, 2022
@jaredpar
Copy link
Member

jaredpar commented Nov 1, 2022

@bartonjs

Based on discussion the abstracts were removed from everything except the type.

How does the implementor customize the behavior? There is not a protected ctor that takes the elements, none of the members are virtual, ... Feels like something is missing (or maybe my brain isn't working right now).

@stephentoub
Copy link
Member

This is not intended for external extensibility. The only reason the type is abstract is so that our factory methods can provide different concrete implementations based on the supplied input/comparers.

@tfenise
Copy link
Contributor

tfenise commented Nov 2, 2022

public abstract partial class FrozenDictionary<TKey, TValue>
{
    public ImmutableArray<TKey> Keys { get; }
    public ImmutableArray<TValue> Values { get; }
}

public abstract partial class FrozenSet<T>
{
    public ImmutableArray<T> Items { get; }
}

Isn't the return type ImmutableArray<> leaky abstraction? I mean, what if one day an optimized implementation doesn't store the keys, values, or items in a dedicated array?

@stephentoub
Copy link
Member

stephentoub commented Nov 2, 2022

Isn't the return type ImmutableArray<> leaky abstraction?

It's not an abstraction. It's basically saying that a dense, fastest possible, ref-indexable, span-able, etc. mechanism for enumerating each of the keys and values is so important that it's part of the API. If this could return T[] and have it be read-only, it would.

@tfenise
Copy link
Contributor

tfenise commented Nov 2, 2022

It's a set/dictionary, which usually is not designed for indexable access. I'm not sure whether a set/dictionary with the capacity of fast indexable access, or a "pure" set/dictionary with possibly faster and more flexible implementation for other methods is more desirable.

Storing the keys/values/items in a dedicated array could slow down lookups. For example,

private protected override ref readonly TValue GetValueRefOrNullRefCore(TKey key)
{
IEqualityComparer<TKey> comparer = Comparer;
int hashCode = comparer.GetHashCode(key);
_hashTable.FindMatchingEntries(hashCode, out int index, out int endIndex);
while (index <= endIndex)
{
if (hashCode == _hashTable.HashCodes[index])
{
if (comparer.Equals(key, _keys[index]))
{
return ref _values[index];
}
}
index++;
}
return ref Unsafe.NullRef<TValue>();
}

The eventual access to _values[index] could be another cache miss after the possible cache miss of accessing the hash table and _keys[index].

A suggestion from #77799 (comment) is

Dictionaries and sets with low cardinality are common. We should look at adding specialized implementation for 1 and 2 entry collections. This saves memory and speeds things up.

For these specialized implementations, a dedicated array storing the keys/values/items is not really useful.

Besides, if the user really needs fast indexable access, they can create a dedicated array on their own easily, although at the cost of some extra memory.

@stephentoub
Copy link
Member

stephentoub commented Nov 2, 2022

There's no requirement that the type only store the keys/values once. If it proved more valuable in the future in a given implementation to store the values colocated with the hashed data, that could be done in addition to maintaining a separate array (which could be lazily-created/stored only if these properties were accessed, and if they weren't needed by a scenario, they wouldn't be materialized).

@geeknoid, do you want to comment on your needs for having key enumeration and value enumeration and indexing be as fast as possible?

@tfenise
Copy link
Contributor

tfenise commented Nov 2, 2022

(which could be lazily-created/stored only if these properties were accessed, and if they weren't needed by a scenario, they wouldn't be materialized)

This could create performance traps. For example, suppose in the initial version of FrozenDictionary, Keys as an ImmutableArray<> is provided unconditionally and we give the guidance to use FrozenDictionary.Keys whenever to enumerate the keys, even to users who do not really need best enumerating performance. Suppose in the second version of FrozenDictionary, Keys as an ImmutableArray<> is lazily created. Then those users would be paying the unnecessary cost of extra memory or worst-case O(n) time complexity to begin the enumeration, and would need to change their code to enumerating the FrozenDictionary itself to avoid that.

If e.g., 80% of potential users prefer fast lookup to fast indexing while 20% prefer fast indexing to fast lookup, then we probably should not use the explicit return type ImmutableArray<> and thus allow the implementation to be more flexible, although the 20% may have to create their own, extra dedicated array if needed. If 80% prefer fast indexing to fast lookup while 20% prefer fast lookup to fast indexing, then we probably should use the explicit return type ImmutableArray<>, although if one day the implementation is to store the keys/values collocated and maintain the ImmutableArray<> separately, the 20% would have to pay the cost of extra memory/complexity for rather few benefits.

@ghost ghost added the in-pr There is an active PR which will close this issue when it is merged label Nov 3, 2022
@geeknoid
Copy link
Member Author

geeknoid commented Nov 4, 2022

@tfenise I think the criteria is different here.

These collections are intended for long-lived state used repeatedly in the life of a service. The general model is that you spend the cycles and the memory up front in order to get the best read-only perf you can. It's understood that it will take longer to create these collections than normal ones, and it will often take more memory too. That's OK, the tradeoff is that reads are faster.

Given this model, every attempt is being made to favor predictable high performance access to the state. The point of these collections is speed after all. I've seen many scenarios in our systems where dictionaries and sets are iterated in some way, and the ImmutableArray is the most efficient way to do this.

The pattern of spending more time during startup to save runtime is pretty fundamental to how you want to write high-scale services. You want to do any kind of setup before you receive any traffic such that you don't experience blips and lower SLAs for the first batch of requests hitting your service. This is why lazily allocated objects in the DI is a problem for example, we ideally want all the things to be initialized and steady before accepting the first request.

If the implementations of the collections change over time, then I think it means that an explicit array of keys and values should be allocated up front during construction in order to satisfy the needs of this API contract, rather than lazily allocating them. Lazy allocation would mean there's an extra conditional in the path, and it means there's a perf blip on first access.

@iSazonov
Copy link
Contributor

iSazonov commented Nov 4, 2022

The pattern of spending more time during startup to save runtime is pretty fundamental to how you want to write high-scale services.

There are applications where fast startup is important like PowerShell (it initializes some large dictionaries at startup).
If all the information is known at compile time I would expect to get for the new read-only collections something like static ReadOnlySpan<byte> s = new byte [] {...} (or fast switch) with almost zero startup time.

@tfenise
Copy link
Contributor

tfenise commented Nov 4, 2022

It's understood that it will take longer to create these collections than normal ones, and it will often take more memory too. That's OK, the tradeoff is that reads are faster.

The slower creation may only slow down the startup, but the memory cost persists throughout the life of the service and might be a bigger concern. Memory cost might also slow down the application due to CPU cache concerns. Besides, these collections may also be consumed by client applications, where startup performance is still important, like iSazonov has mentioned with PowerShell.

In addition, if one is willing to sacrifice memory for faster enumeration, it's always possible and easy to create a dedicated array for the items, but a frozen dictionary/set that consumes less memory or has better lookup performance but provides possibly slower enumeration is more difficult to invent.

I've seen many scenarios in our systems where dictionaries and sets are iterated in some way, and the ImmutableArray is the most efficient way to do this.

An ImmutableArray essentially provides the following optimized access:

  1. Conversion to a ReadOnlySpan.
  2. Fast indexable read.
  3. Fast enumeration.

If the goal is to provide fast enumeration, not fast indexable read or conversion to a ReadOnlySpan, then I think it would be better to return some custom FrozenDictionary.KeysCollection with custom FrozenDictionary.KeysCollection.Enumerator. This allows the implementation to be more flexible and still allows fast enumeration. After all, ImmutableArray achieves fast enumeration with a custom ImmutableArray.Enumerator.

A typical dictionary/set may not provide fast indexable read. If fast indexable read is really needed, it's still solvable with custom indexer FrozenDictionary.KeysCollection.Items.

What restricts the implementation the most is the ability to convert to a ReadOnlySpan. I don't think this is really important for a dictionary/set. Besides, for a dictionary, why ReadOnlySpan<TKey> and ReadOnlySpan<TValue> are more favorable, not ReadOnlySpan<KeyValuePair<TKey, TValue>>? ReadOnlySpan<KeyValuePair<TKey, TValue>> could mean one less range check when both the key and the value are needed.

@geeknoid
Copy link
Member Author

geeknoid commented Nov 4, 2022

The pattern of spending more time during startup to save runtime is pretty fundamental to how you want to write high-scale services.

There are applications where fast startup is important like PowerShell (it initializes some large dictionaries at startup). If all the information is known at compile time I would expect to get for the new read-only collections something like static ReadOnlySpan<byte> s = new byte [] {...} (or fast switch) with almost zero startup time.

I've mentioend somewhere that the idea of serializing the state of the frozen collections would be ideal. You could have a source generator that emits serialized state which is loaded into a frozen collection in no time at startup.

@iSazonov
Copy link
Contributor

iSazonov commented Nov 4, 2022

I've mentioend somewhere that the idea of serializing the state of the frozen collections would be ideal. You could have a source generator that emits serialized state which is loaded into a frozen collection in no time at startup.

We can use SG already today to build custom code like fast switch. With official frozen collections I'd expect to have fast startup out-of-box. Although SG could be the first step in this direction since it requires expanding frozen collections with a special constructor/builder.

By the way, is there any interest/need in implementing SG to do things like gperf but only for the C# world?

@ghost ghost removed the in-pr There is an active PR which will close this issue when it is merged label Nov 4, 2022
@ghost ghost locked as resolved and limited conversation to collaborators Dec 5, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
api-approved API was approved in API review, it can be implemented area-System.Collections
Projects
None yet
Development

Successfully merging a pull request may close this issue.