[API Proposal]: Add readonly Capacity property to HashSet<T> and Dictionary<K,V> and TrimExcess to HashSet<T> #66426
Comments
Tagging subscribers to this area: @dotnet/area-system-collections
Issue Details
Background and motivation
The complexity of enumerating a … If a …
dict.Remove(key);
if (dict.Count < dict.Capacity >> 2) dict.TrimExcess();
Currently it's not possible to know the internal capacity of these collections, without keeping track of their …
API Proposal
public class HashSet<T>
{
    /// <summary>
    /// Gets the total number of elements the internal data structure can hold without resizing.
    /// </summary>
    public int Capacity { get; }
}
public class Dictionary<TKey, TValue>
{
    /// <summary>
    /// Gets the total number of elements the internal data structure can hold without resizing.
    /// </summary>
    public int Capacity { get; }
}
API Usage
As shown above.
Alternative Designs
A …
public void TrimExcess(double loadFactor);
...but its usage would not be completely obvious, and it could collide with the existing overload that has an …
Risks
None that I am aware of.
|
The usual concern with these very commonly instantiated generic types is that making them larger grows the generated code many times over in a typical app. Clearly the added code here would be minimal but we would need some evidence that this is a common enough need. |
@danmoseley my actual recent case of needing this API was with a |
@jkotas what would your concern level be about the bytes generated by a simple getter over a field? |
I am ok with it. It has to also check for |
In retrospect adding the … The scenario that I want to cover involves a …
set.Remove(entry);
if (set.Count < set.Capacity >> 2) set.TrimExcess(set.Count << 1);
An API that might be an even better solution to my problem, if it existed, could be a |
I'm seeing a couple of options here:
|
This issue has been marked |
@eiriktsarpalis having the |
Updating this one should suffice, thanks. |
@eiriktsarpalis done! |
I found the related issue #27618, where an additional usage scenario for this API was mentioned by @Grauenwolf in a comment: object pooling scenarios. |
We looked at some other types in the namespace that had the same pattern of EnsureCapacity(int)+TrimExcess(void) and squared it off to have both get_Capacity+TrimExcess(int).
namespace System.Collections.Generic
{
    public partial class HashSet<T>
    {
        /// <summary>
        /// Gets the total number of elements the internal data structure can hold
        /// without resizing.
        /// </summary>
        public int Capacity { get; }
        /// <summary>
        /// Sets the capacity of this set to hold up to a specified number of elements
        /// without any further expansion of its backing storage.
        /// </summary>
        public void TrimExcess(int capacity);
    }
    public partial class Dictionary<TKey, TValue>
    {
        /// <summary>
        /// Gets the total number of elements the internal data structure can hold
        /// without resizing.
        /// </summary>
        public int Capacity { get; }
    }
    public partial class Queue<T>
    {
        public int Capacity { get; }
        public void TrimExcess(int capacity);
    }
    public partial class Stack<T>
    {
        public int Capacity { get; }
        public void TrimExcess(int capacity);
    }
} |
What about |
Background and motivation
The complexity of enumerating a HashSet<T> or a Dictionary<TKey, TValue> is determined not by its Count, but by its internal capacity. There are scenarios where both of these structures are enumerated frequently, and during their lifetime they might grow very large and then shrink to a small size. In these scenarios it makes sense to call TrimExcess at a point when the current Count has become too small compared to the current capacity, in order to speed up the enumeration of the collection.
If a Capacity property were exposed, it could be used for example like this:
In the above example the capacity of the dictionary is reduced to double its size, when its size becomes a quarter of its capacity. So a dictionary with capacity 1000 will have its capacity reduced to 500 when the Count drops below 250.
Currently it's not possible to know the internal capacity of these collections without keeping track of their Count after every operation that affects their size, which is quite cumbersome, or without using reflection, which is neither efficient nor safe.
Also currently it's not possible to trim a HashSet<T> to a specific capacity, like it's possible with a Dictionary<TKey, TValue>. The HashSet<T>.TrimExcess method has no overload with a capacity parameter.
API Proposal
Alternative Designs
A TrimExcess overload with a loadFactor parameter would also do the job:
public void TrimExcess(double loadFactor);
...but its usage would not be completely obvious, and it could collide with the existing overload that has an int capacity parameter. Related issue: #23744
Risks
None that I am aware of.
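For illustration, the remove-then-trim pattern described in the motivation could be wrapped in a helper along these lines. This is only a sketch under stated assumptions: CapacityTrimming and RemoveAndTrim are hypothetical names, and because the Capacity property is only being proposed here, the sketch reads the capacity through Dictionary's existing EnsureCapacity(0) call, which returns the current capacity without growing the dictionary. If the proposed property were available, that call would simply become dict.Capacity.

```csharp
using System.Collections.Generic;

// Hypothetical helper for illustration; not part of the proposed API.
public static class CapacityTrimming
{
    // Removes a key, then shrinks the dictionary once it is less than a quarter full.
    public static bool RemoveAndTrim<TKey, TValue>(Dictionary<TKey, TValue> dict, TKey key)
        where TKey : notnull
    {
        bool removed = dict.Remove(key);

        // Assumption: EnsureCapacity(0) stands in for the proposed Capacity getter.
        // It never grows the dictionary and returns the current capacity.
        int capacity = dict.EnsureCapacity(0);

        if (dict.Count < capacity >> 2)
        {
            // Ask for room for twice the current Count; the dictionary may round this up.
            dict.TrimExcess(dict.Count << 1);
        }

        return removed;
    }
}
```

Calling CapacityTrimming.RemoveAndTrim(dict, key) in place of dict.Remove(key) should keep the capacity from staying far above roughly four times the Count as entries are removed, which is the enumeration cost the proposal is concerned with.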