-
Notifications
You must be signed in to change notification settings - Fork 4.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Propose Adding a GetOrAdd(TKey key, Func<TKey, TValue> valueFactory) to Dictionary<TKey, TValue> #15059
Comments
Suggested at #14676 within the comment thread too. Possible implementation at https://github.com/hackcraft/coreclr/commit/1a982053ae39ee0e418d370fdee3497ecefd0e5a |
That's what I get for thinking of an idea, never getting around to suggesting it. And then remembering months later and forgetting to check again if it had been suggested. Thanks! |
Never mind, I do check for duplicates and I still end up asking something that's been asked before, or submitting a PR that does the same as another. |
I assumed this meant Dictionary.GetOrAdd was added, but I'm not finding it or an IDictionary extension method. If they have been skipped, can this issue be reopened? |
@jnm2 As far as I can tell, @SamuelEnglard closed this issue in favor of https://github.com/dotnet/corefx/issues/1942, which was about So yes, I think it makes sense to repoen this issue. |
Ugh, I can't believe this missed .NET Standard 2.0. @karelz no exceptions? |
Specifically: public static TValue GetOrAdd<TKey, TValue>(this IDictionary<TKey, TValue> dictionary, TKey key, TValue value)
{
if (dictionary == null) throw new ArgumentNullException(nameof(dictionary));
if (!dictionary.TryGetValue(key, out var r))
dictionary.Add(key, r = value);
return r;
}
public static TValue GetOrAdd<TKey, TValue>(this IDictionary<TKey, TValue> dictionary, TKey key, Func<TKey, TValue> valueFactory)
{
if (dictionary == null) throw new ArgumentNullException(nameof(dictionary));
if (valueFactory == null) throw new ArgumentNullException(nameof(valueFactory));
if (!dictionary.TryGetValue(key, out var r))
dictionary.Add(key, r = valueFactory.Invoke(key));
return r;
} |
This needs API review. I don't want to rush API reviews. That is just begging for troubles later - you miss things and then you live with your mistakes forever. It is convenience API, we lived without it for quite some time, we can IMO survive one more release. It is not the end of the world after all and not the last .NET Core release. It does not block scenarios. Probably not the answer you expected, but this is how I feel about it. We should IMO focus primarily on the really impactful things. Makes sense? |
It is indeed the answer I expected. 😄 |
(Don't worry about me, I'm not the type that goes around second-guessing the team's decisions. Fyi.) |
Thanks for understanding! |
@jnm2 I don't think it would be a good idea to add @karelz This is not necessarily a convenience API. For if (!dict.ContainsKey(key))
{
dict.Add(key, ...);
} in corefx. |
@jamesqo Yeah, okay. After the BCL has it for a while and we get default interface methods it might make more sense. |
Triage: if the main motivation for this addition is to minimize number of lookups, then presumably adding ref-return methods woulds be the preferable approach? c.f. #22121. Given the lack of recent activity in the issue, I'd be inclined to close this for now. |
@eiriktsarpalis Looking at what you linked, ref-returning methods methods aren't happening except maybe in the future on a CollectionsMarshal class (#27062) as an API that has not yet been proposed. Because of the inherent lack of safety in ref-returning methods due to the backing store changing, I think GetOrAdd is still interesting. I was looking for it just last week. |
@jnm2 just so I understand your requirements, would having the extension method you proposed above solve your issue? IOW is this about double lookups or avoiding creating values when not needed or both? |
@eiriktsarpalis It's about double lookups but also with the expectation that creating a value may be expensive and should be avoidable. Two overloads would be nice, one with |
@jnm2 can you clarify what the "state" would be here? With respect to ref-returns, this is what DictionarySlim does (the k-Neucleotide benchmark it was used for needs to increment values held in a map). As you point out, that makes it easy to corrupt the backing store. I wonder whether the "update the value" scenario is interesting. If it is, a possible middle ground might be for the cc @GrabYourPitchforks since we were talking about it. |
@danmosemsft I was thinking of TState in the same sense as ThreadPool.QueueUserWorkItem. It would be a way to avoid allocating a delegate per call when the key alone does not provide enough information to create the value. However, I don't often have a practical need to avoid allocations like these, so this could wait until someone does have the need. public static TValue GetOrAdd<TKey, TValue, TState>(
this IDictionary<TKey, TValue> dictionary,
TKey key,
TState state,
Func<TState, TValue> valueFactory); |
Carrying over from #36942, we should consider evaluating an |
@eiriktsarpalis This is marked as needs-work, but what is left to address for this? I see the final API as falling under one of two cases: public class Dictionary<TKey, TValue>
{
public TValue GetOrAdd([NotNull] TKey key, Func<TKey, TValue> valueFactory);
}
/* OR */
public class Dictionary<TKey, TValue>
{
public TValue GetOrAdd<TState>([NotNull] TKey key, TState state, Func<TKey, TState, TValue> valueFactory);
} Passing a TState parameter might be more useful for the majority of scenarios (is the key by itself sufficient to create the value?), but introducing a generic method on a generic type could introduce codegen explosion. @jkotas might be able to speak as to whether that's a deal-breaker. If it is a deal-breaker, we might be able to simplify by saying "state is always provided as object?; no generic state parameter is available." |
Yes, I am not trilled about adding more features to Dictionary. Every new feature in Dictionary makes all existing Dictionary uses more expensive given how generics work, especially in AOT scenarios. I do not see a significant usability advantage of this method over writing a simple straightforward code:
This straightforward code is also going to have better performance in number of cases. It avoids overhead of creating and passing around the lamda or state. I understand that there are situations where it is not the case (e.g. the hash code computation is very expensive or it is rare for TryGetValue to find the element). I think #27062 is better answer for cases that needs a method like this for performance micro-optimizations. |
public static class CollectionsMarshal
{
public static ref TValue GetValueOrAddDefault<TKey, TValue>(Dictionary<TKey, TValue> @this, [NotNull] TKey key, out bool exists);
} Then a "compute the hash code once" code sample would look like: ref TValue refToMyValue = CollectionsMarshal.GetValueOrAddDefault(dict, "myKey", out bool exists);
if (!exists)
{
// the dictionary just performed the equivalent of Add("myKey", default(TValue)) - let's overwrite it now
refToMyValue = CreateValue();
} As Jan said, this would avoid the generic explosion problem, and it moves the micro-perf API to its own type. The downside is that |
Keep in mind it's not really about simplifying from two lines of code to one, it's about changing this code: Foo GetFoo(int id)
{
if (!dictionary.TryGetValue(id, out var foo))
{
foo = CreateFoo(id);
dictionary.Add(id, foo);
}
return foo;
} to this code: Foo GetFoo(int id) => dictionary.GetOrAdd(id, CreateFoo); The latter seems much easier to understand to me. |
As mentioned, one of the problems with providing the factory is "is the key by itself sufficient to create the value, or does creation of the value rely on more information than just the key?" If the latter (which I believe is common), you really need to provide a state object. |
And that easily gets convoluted: if (!dictionary.TryGetValue(id, out var foo))
dictionary.Add(id, foo = CreateFoo(id));
Use(foo); var foo = dictionary.GetOrAdd(id, static t => t.self.CreateFoo(t.id), (self: this, id))
Use(foo); I'd personally prefer the first one. |
Wouldn't passing the state as a second parameter be way better, like ConcurrentDictionary? var dict = new ConcurrentDictionary<ulong, Client>();
// ...
var clientSocket = await serverSocket.AcceptAsync();
var client = dict.GetOrAdd(clientId, CreateClient, socket);
HandleClient(client);
// ...
static Client CreateClient(ulong clientId, Socket clientSocket)
{
return new Client(clientSocket);
} |
@FiniteReality We need to be careful about adding generic instance methods to generic types. There's a real risk that it could cause significant footprint or memory bloat, even for people who don't use the new feature. Generally we want new features to be pay-for-play and to incur no overhead if you don't use them. |
@GrabYourPitchforks Understandable, but doesn't that somewhat fly in the face of being able to supply a state object? From what I can tell, the API @alrz suggested would have had the same issues, but would have been even more awkward to interact with. |
Yes, it does fly in the face of being able to provide a state object, unless the parameter is typed as object? or some similar non-generic type. |
Oh joy! That's one of the things I hate most about using System.Threading.Timer 😅 |
I don't understand this claim. Could you elaborate? |
Agreed that a |
I'm closing this issue in favor of #27062 which offers a more capable solution. See that issue for more details. |
FWIW here's how one can author the originally proposed method as an extension method in .NET 6: public static class CollectionExtensions
{
public static TValue GetOrAdd<TKey, TValue>(this Dictionary<TKey, TValue> dictionary, TKey key, Func<TKey, TValue> valueFactory)
where TKey : notnull
{
if (dictionary == null)
{
throw new ArgumentNullException(nameof(dictionary));
}
if (valueFactory == null)
{
throw new ArgumentNullException(nameof(valueFactory));
}
ref TValue? value = ref CollectionsMarshal.GetValueRefOrAddDefault(dictionary, key, out bool exists);
if (!exists)
{
value = valueFactory(key);
}
return value!;
}
} |
On
ConcurrentDictionary<TKey, TValue>
there is the very useful functionTValue GetOrAdd(TKey key, Func<TKey, TValie> valueFactory)
. The pattern that the method represents is one that issued commonly on the "normal"Dictionary<TKey, TValue>
as well. I propose adding such a method for this common use case. Besides simplifying it will also make switching between the two dictionaries simpler.The text was updated successfully, but these errors were encountered: