-
Notifications
You must be signed in to change notification settings - Fork 212
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Serialization optimizations #876
Conversation
Towards #41 - Avoid nested lists and serializing ints representing indexes (that match the index in the outer list already). - Avoid repeeating package name strings by storing them in a separate list and storing the index into that list. - For Maps where we don't care about index order switch to HashMap - Use `growable: false` where possible - Avoid `.toList()` where possible This reduces filesize for `asset_components_example` by ~4% and write time by ~9% and these optimizations also seem to help if we switch to msgpack.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
just some nits, generally LGTM
|
||
/// Deserializes an [AssetGraph] from a [Map]. | ||
class _AssetGraphDeserializer { | ||
final _idToAssetId = <int, AssetId>{}; | ||
final _idToAssetId = new HashMap<int, AssetId>(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: add a comment about why so this doesn't get reverted (below as well)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
for (var descriptor in _serializedGraph['serializedAssetIds']) { | ||
_idToAssetId[descriptor[0] as int] = | ||
new AssetId(descriptor[1] as String, descriptor[2] as String); | ||
var assetPaths = _serializedGraph['assetPaths'] as List; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: can we add a more specific type here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could make it List<dynamic>
but I don't think it buys much. This list is a mix of ints and Strings.
_idToAssetId[descriptor[0] as int] = | ||
new AssetId(descriptor[1] as String, descriptor[2] as String); | ||
var assetPaths = _serializedGraph['assetPaths'] as List; | ||
var assetIdCount = assetPaths.length / 2; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: round this or do something else to make it an int? (possibly with an assert as well?)
new AssetId(descriptor[1] as String, descriptor[2] as String); | ||
var assetPaths = _serializedGraph['assetPaths'] as List; | ||
var assetIdCount = assetPaths.length / 2; | ||
for (var i = 0; i < assetIdCount; i++) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit, in general I think it would be easier to follow if it was i += 2
in the for loop time instead of doing the 2*i
and assetPaths.length / 2
stuff, and then everything is always an int as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We'd have to do _idToAssetid[i ~/ 2] = ...
so there is still some strangeness - but I'm fine either way.
_assetIdToId[node.id] = pathId; | ||
pathId++; | ||
assetPaths.add(node.id.path); | ||
assetPaths.add(packages.indexOf(node.id.package)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
did you try doing an ordered set for packages as well? most likely it wouldn't help though as the number of packages isn't that large but we are doing this lookup a lot
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I didn't get a fine-grained enough measurement to be sure but for a small enough N I think lists tend to be fastest. LinkedHashSet
doesn't have an indexOf
operator.
My intuition is that for the number of packages we'll have this is going to be fastest - and this reads the closest to the intention of the code. If we want later to do more benchmarking we can try changing this. I did run the map version and any performance difference washed out in the noise.
@@ -238,11 +238,13 @@ class _WrappedAssetNode extends Object with ListMixin implements List { | |||
case _Field.Id: | |||
return serializer._assetIdToId[node.id]; | |||
case _Field.Outputs: | |||
return node.outputs.map((id) => serializer._assetIdToId[id]).toList(); | |||
return node.outputs |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the future eliminating these allocations would be good as well, but not for this pr (for primaryOutputs as well)
Towards #41
match the index in the outer list already).
list and storing the index into that list.
growable: false
where possible.toList()
where possibleThis reduces filesize for
asset_components_example
by ~4% and writetime by ~9% and these optimizations also seem to help if we switch to
msgpack.