Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Serialization optimizations #876

Merged
merged 2 commits into from
Jan 19, 2018
Merged

Serialization optimizations #876

merged 2 commits into from
Jan 19, 2018

Conversation

natebosch
Copy link
Member

Towards #41

  • Avoid nested lists and serializing ints representing indexes (that
    match the index in the outer list already).
  • Avoid repeeating package name strings by storing them in a separate
    list and storing the index into that list.
  • For Maps where we don't care about index order switch to HashMap
  • Use growable: false where possible
  • Avoid .toList() where possible

This reduces filesize for asset_components_example by ~4% and write
time by ~9% and these optimizations also seem to help if we switch to
msgpack.

Towards #41

- Avoid nested lists and serializing ints representing indexes (that
  match the index in the outer list already).
- Avoid repeeating package name strings by storing them in a separate
  list and storing the index into that list.
- For Maps where we don't care about index order switch to HashMap
- Use `growable: false` where possible
- Avoid `.toList()` where possible

This reduces filesize for `asset_components_example` by ~4% and write
time by ~9% and these optimizations also seem to help if we switch to
msgpack.
@natebosch natebosch requested a review from jakemac53 January 19, 2018 18:45
@googlebot googlebot added the cla: yes Google is happy with the PR contributors label Jan 19, 2018
Copy link
Contributor

@jakemac53 jakemac53 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just some nits, generally LGTM


/// Deserializes an [AssetGraph] from a [Map].
class _AssetGraphDeserializer {
final _idToAssetId = <int, AssetId>{};
final _idToAssetId = new HashMap<int, AssetId>();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: add a comment about why so this doesn't get reverted (below as well)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

for (var descriptor in _serializedGraph['serializedAssetIds']) {
_idToAssetId[descriptor[0] as int] =
new AssetId(descriptor[1] as String, descriptor[2] as String);
var assetPaths = _serializedGraph['assetPaths'] as List;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: can we add a more specific type here?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could make it List<dynamic> but I don't think it buys much. This list is a mix of ints and Strings.

_idToAssetId[descriptor[0] as int] =
new AssetId(descriptor[1] as String, descriptor[2] as String);
var assetPaths = _serializedGraph['assetPaths'] as List;
var assetIdCount = assetPaths.length / 2;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: round this or do something else to make it an int? (possibly with an assert as well?)

new AssetId(descriptor[1] as String, descriptor[2] as String);
var assetPaths = _serializedGraph['assetPaths'] as List;
var assetIdCount = assetPaths.length / 2;
for (var i = 0; i < assetIdCount; i++) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit, in general I think it would be easier to follow if it was i += 2 in the for loop time instead of doing the 2*i and assetPaths.length / 2 stuff, and then everything is always an int as well.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We'd have to do _idToAssetid[i ~/ 2] = ... so there is still some strangeness - but I'm fine either way.

_assetIdToId[node.id] = pathId;
pathId++;
assetPaths.add(node.id.path);
assetPaths.add(packages.indexOf(node.id.package));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

did you try doing an ordered set for packages as well? most likely it wouldn't help though as the number of packages isn't that large but we are doing this lookup a lot

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't get a fine-grained enough measurement to be sure but for a small enough N I think lists tend to be fastest. LinkedHashSet doesn't have an indexOf operator.

My intuition is that for the number of packages we'll have this is going to be fastest - and this reads the closest to the intention of the code. If we want later to do more benchmarking we can try changing this. I did run the map version and any performance difference washed out in the noise.

@@ -238,11 +238,13 @@ class _WrappedAssetNode extends Object with ListMixin implements List {
case _Field.Id:
return serializer._assetIdToId[node.id];
case _Field.Outputs:
return node.outputs.map((id) => serializer._assetIdToId[id]).toList();
return node.outputs
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the future eliminating these allocations would be good as well, but not for this pr (for primaryOutputs as well)

@natebosch natebosch merged commit bc5a539 into master Jan 19, 2018
@natebosch natebosch deleted the asset-id-lists branch January 19, 2018 20:54
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cla: yes Google is happy with the PR contributors
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants