-
Notifications
You must be signed in to change notification settings - Fork 88
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Performance of DeepCollectionEquality().equals(a, b)
is horribly slow
#263
Comments
Disclaimer: I do understand that semantics are a little bit different in that |
Might be something we can optimize, if we can find a safe algorithm. The issue with Not only do we not know that the key/value equality is default /// Equality on maps which assumes that both maps use `keyEquality` for key equality.
///
/// More efficient than [MapEquality] when doing `equals`, because [MapEquality.equals]
/// has to compare the key/value pairs of each map independently of the maps' own
/// equality behavior. This class assumes that compared maps use an equality which is
/// compatible with the `keyEquality`, and can therefore reliy on map lookups
/// for comparing keys from different maps.
class SameMapEquality<K, V> extends MapEquality<K, V> {
SameMapEquality(super.keyEquality, super.valueEquality);
bool equals(Map<K, V> map1, Map<K, V> map2) {
if (map1.length != map2.length) return false;
for (var key in map1.keys) {
var v2 = map2[key];
if (v2 == null && !map2.containsKey(key)) return false;
if (!_valueEquality.equals(map1[key], v2)) return false;
}
return true;
}
} We could make the implementation of For More reasonably, we could special case the default case (default values for both arguments), and optimize that. I think that's the most viable place to start, optimize the default case, if that's what everybody uses anyway. |
I created a (naive) benchmark repo for finding a more optimised version for a deep equality of json-like maps (where we assume the maps are of type <String, dynamic>, use the https://github.com/knaeckeKami/json_equals It uses benchmark_harness to compare the performance of different ways to compare two json-like Maps. It shows that DeepCollectionEquality() has O(n^2) runtime where n is the level of nesting in the map. Here the results of the benchmark: You can see that with a map that is nested 9 levels deep, DeepCollectionEquality is ~4000 times slower than a custom implementation, while with 1 level of nesting it's only ~10 times slower. |
That quadratic complexity should definitely be fixed, I don't know if we can just cache the hash code (probably not, because we still call We should perhaps also add a That seems like a common enough use-case to create an optimized variant for it. E.g., /// Deep equality on JSON-like object structures.
///
/// JSON-like objects can be:
/// * Lists of JSON-like objects.
/// * Maps with `String` keys and JSON-like objects as values.
/// * Other objects compared by normal `==`.
///
/// Lists are considered equal if they have the same length and the elements at the same indices are equal.
/// Maps are considered equal if they have the same string keys, and their values for each key are equal.
/// Such maps are assumed to be normal maps using `==` as key equality.
///
/// This equality treat any value which is not a `List<Object?>` or `Map<String, Object?>`
/// as a plain value to be compared using `==`.
/// Normal JSON structures only contain strings, numbers, booleans and `null` as such values.
class JsonEquality implements Equality<Object?> {
const JsonEquality();
bool equals(Object? v1, Object? v2) {
if (v1 == v2) return true;
if (v1 is List) {
if (v2 is! List) return false;
return const ListEquality<Object?>(JsonEquality()).equals(v1, v2);
}
if (v1 is Map<String, Object?>) {
if (v2 is! Map<String, Object?>) return false;
// Assume normal string equality as map key equality.
var length = v1.length;
if (length != v2.length) return false;
for (var entry in v1) {
if (!equals(entry.value, v2[entry.key])) return false;
}
}
return false;
}
int hash(Object? value) {
if (value is List) return const ListEquality(JsonEquality()).hash(value);
if (value is Map<String, Object?>) {
return const MapEquality(DefaultEquality(), JsonEquality()).hash(value);
}
return value.hashCode;
}
bool isValidKey(_) => true;
} |
Also, thinking about it more, Because, otherwise, are they really equal maps? Being "equal maps" is not just about having the same keys/value pairs, it could also mean that using one instead of the other is a valid substitution. If not, what is "being equal" actually worth? I'd be willing to change the definition of The biggest issue with that change is that it makes |
Makes perfect sense. That's what I was hoping for :) |
@lrhn Yeah, I think we should change things to assume that Alternatively, we could think if there is a way for us to add a fast path here, e.g. we could maybe add something like an interface: abstract class EqualityRule {
bool operator ==(Object other);
}
abstract class ContainerWithEqualityRule {
EqualityRule get equalityRule;
} have underlying implementation classes for
but maybe this is way too baroque. |
I'd be a little more worried about changing For maps, we can check that each key of one map is also a key of the other map, and that the values are equal. For sets, the equal-values clause disappears, and then it's much easier to have supposedly equal sets which are actually different (which again means different hash codes). Let's try and see what happens. |
Even now
Not quite sure I follow here. We already now check that the sets have equal number of elements. One should only consider two sets are equal if This idea that one wants to consider two sets/maps equal (based on |
It would be nice to figure out our path forward with this - I noticed that some internal code also uses |
@lrhn Would you be interested in looking at this? |
I'd love to get back to https://github.com/dart-lang/collection/pull/265/files, but right now I'mtoo busy adding class modifiers, and other 3.0 features, to platform libraries. |
@lrhn friendly ping |
Any updates on this? |
There are customers that use
DeepCollectionEquality().equals(a, b)
wherea
andb
are e.g.Map<String, dynamic>
(let's say json). It turns out this is very slow.See zino-hofmann/graphql-flutter#1196 for example use case suffering from this.
A very simple example that demonstrates the difference between a simple custom implementation and
DeepCollectionEquality().equals()
:In JIT:
In AOT:
So
package:collection
is a hopping 70x slower than a manual version.A big reason is that when comparing two maps for equality, the responsible MapEquality.equals is building a new
HashMap
(of equal size). Building it will require inserting_MapEntry(key, value)
as keys. On insertion of each element into theHashMap
it requires (recursively) calculating a hashcode (and possibly recursively doing equals calls):That's a lot of object allocations, repeatedly recursive hashing & equality checking (possibly on same values).
Other things to note:
MapEquality<K, V>
/ ... classes are generic - but in reality it's used withdynamic
.=> Using generics requires performing
as
-type checks on parameters with covariant types (such asMap<K, V>
)=> Especially considering that the
DeepCollectionEquality
class ignores the reified generics (it considers e.g.<String, dynamic>{}
and<Object, dynamic>
to be equal)HashMap
over{}
may be slower for many cases.MapEquality
/ ... objects instead of creating new ones for every equality operation/cc @lrhn @mraleph
The text was updated successfully, but these errors were encountered: