WIP - share workers across Map instance with a global worker pool #2917

Closed
wants to merge 9 commits into from

anandthakker
Contributor

Refs #899

Before:
[GIF: without-pool]

After:
[GIF: with-pool]

Anand Thakker added 4 commits July 27, 2016 17:36
Instead of having a mock Dispatcher for node, pull out the relevant code
into a small wrapper, web_worker.js, which contains the node/browser
code split to a smaller scope.
This prepares for workers being shared across map instances.
@@ -24,7 +24,7 @@ function WorkerTile(params) {
}

WorkerTile.prototype.parse = function(data, layerFamilies, actor, rawTileData, callback) {

console.log('PARSING TILE', this.source, this.coord.z + '/' + this.coord.x + '/' + this.coord.y, this.uid)
Contributor Author

For demo purposes only ^

  } else {
- tile.workerID = this.dispatcher.send('load tile', params, done.bind(this));
+ this.dispatcher.send('load tile', params, done.bind(this), tile.uid);
Contributor Author

Note the change to Dispatcher (which now uses the new WorkerPool): so that multiple requests can be sent to the same worker, dispatcher.send() accepts a 'key' (tile.uid here) and guarantees that two requests sent with the same key go to the same worker.
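
A minimal sketch of how a key could pin requests to one worker, assuming a simple string hash taken modulo the pool size. `KeyedDispatcher`, `hashKey`, and `makeFakeActor` are hypothetical names for illustration only; the Dispatcher in this PR may route differently.

```javascript
// Hypothetical sketch: route requests with the same key to the same worker.
// Assumption: workers are reached through actor-like objects exposing send().

// Simple deterministic string hash (djb2 variant); any stable hash would do.
function hashKey(key) {
    var hash = 5381;
    var str = String(key);
    for (var i = 0; i < str.length; i++) {
        hash = ((hash * 33) ^ str.charCodeAt(i)) >>> 0;
    }
    return hash;
}

function KeyedDispatcher(actors) {
    this.actors = actors;
    this.next = 0; // round-robin cursor for requests sent without a key
}

// With a key, the target worker is a pure function of the key;
// without one, fall back to round-robin.
KeyedDispatcher.prototype.send = function(type, data, callback, key) {
    var index;
    if (key !== undefined) {
        index = hashKey(key) % this.actors.length;
    } else {
        index = this.next;
        this.next = (this.next + 1) % this.actors.length;
    }
    this.actors[index].send(type, data, callback);
    return index;
};

// Stand-in actors that only record what they were asked to do.
function makeFakeActor() {
    return {
        sent: [],
        send: function(type, data, callback) {
            this.sent.push(type);
            if (callback) callback(null, data);
        }
    };
}

var dispatcher = new KeyedDispatcher([makeFakeActor(), makeFakeActor(), makeFakeActor()]);
var first = dispatcher.send('load tile', {uid: 'abc'}, null, 'abc');
var second = dispatcher.send('reload tile', {uid: 'abc'}, null, 'abc');
```

Because the target is computed from the key alone, a second Dispatcher built over the same pool would pick the same worker for the same tile.uid without any coordination between the two.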


if (this.layers[key]) return;

var styleLayers = this.layers[key] = {};
Contributor Author

In this change, this.layers and this.layerFamilies gain a layer of indirection, becoming maps from a 'style key' to the style's set of layers and layer families. This allows a Worker to track multiple map instances' styles -- but by keying them off of a 'style key' rather than a map instance id, we allow for two map instances with the same style to share work.
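
A rough sketch of that indirection, built around the early return shown in the diff above. How the style key is derived is not stated in this thread; serializing the layer list with JSON.stringify is an assumption made purely for illustration, and `WorkerStyles` and `styleKeyFor` are hypothetical names.

```javascript
// Hypothetical sketch of keying worker-side layers by a 'style key'.
// Assumption: the key is the serialized layer list; the PR may derive it differently.
function styleKeyFor(layers) {
    return JSON.stringify(layers);
}

function WorkerStyles() {
    this.layers = {};     // style key -> {layer id -> layer}
    this.parseCount = 0;  // how often a layer list was actually processed
}

WorkerStyles.prototype.setLayers = function(key, layers) {
    // A second map with an identical style hits this early return.
    if (this.layers[key]) return;

    var styleLayers = this.layers[key] = {};
    for (var i = 0; i < layers.length; i++) {
        styleLayers[layers[i].id] = layers[i];
    }
    this.parseCount++;
};

var layersForMapA = [{id: 'water', type: 'fill'}, {id: 'roads', type: 'line'}];
var layersForMapB = JSON.parse(JSON.stringify(layersForMapA)); // separate but identical style

var styles = new WorkerStyles();
styles.setLayers(styleKeyFor(layersForMapA), layersForMapA);
styles.setLayers(styleKeyFor(layersForMapB), layersForMapB);
```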

@anandthakker
Contributor Author

@lucaswoj added some comments and notes in the diff -- I think this is ready for a first 👀 to validate / modify / reject the approach.

@mollymerp
Contributor

wow! thank you @anandthakker! at first glance -- and those gifs!! -- this is really impressive.

@anandthakker
Contributor Author

Thanks @mollymerp !

Let me know if y'all think this approach is viable (generally speaking), and if so, I'll take another pass to do some cleanup, benchmarking, and unit tests.

lucaswoj added this to the Ayacucho milestone Aug 2, 2016
// We use a 'worker key' that's tied to the current geojson data for
// this source, so that when we ask the worker for tiles, the request
// goes to the same worker that parsed/prepared the geojson.
var workerKey = options.url || options.data;
Contributor

It looks like we reference this workerKey against a map. Will this work if options.data is an object?

var object = {};

object[{foo: true}] = 'foo';
object[{bar: true}] = 'bar';

object[{foo: true}] // => 'bar'
object['[object Object]'] // => 'bar'

Contributor Author

I think the line 138 above ensures that this won't happen -- but now that I'm looking at this again, I do think it's not especially clear; I'll make it clearer.

Contributor

Ah! I forgot that we stringified it. LGTM 👍
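
For reference, a small sketch of why stringifying avoids the collision raised above. `workerKeyFor` is a hypothetical helper, and the assumption that inline data is serialized with JSON.stringify is inferred from this exchange rather than taken from the diff.

```javascript
// Hypothetical helper: build a string worker key from GeoJSON source options.
// Assumption: inline data is serialized with JSON.stringify before use as a key.
function workerKeyFor(options) {
    if (typeof options.url === 'string') return options.url;
    return typeof options.data === 'string' ? options.data : JSON.stringify(options.data);
}

var lookup = {};
lookup[workerKeyFor({data: {foo: true}})] = 'foo';
lookup[workerKeyFor({data: {bar: true}})] = 'bar';

// Distinct objects now produce distinct string keys instead of '[object Object]'.
var fooValue = lookup[workerKeyFor({data: {foo: true}})];
var keyCount = Object.keys(lookup).length;
```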

@lucaswoj
Contributor

lucaswoj commented Aug 4, 2016

Before I delve into a line-by-line review here, may I ask a few questions about this PR?

  • How is the global worker pool architected?
  • Could we get a similar effect by deduplicating SourceCaches across maps in the main thread? (I recognize this may require some major changes to SourceCache; see Refactor "SourceCache" #2291)
  • If a user calls GeoJSONSource#setData, will all maps change?
  • If two maps use the same vector tile, how do you avoid double-transferring the StructArrays?
  • How does this handle the case where two maps use the same vector tile but have different styles (which require different StructArray layouts)?

lucaswoj removed this from the Ayacucho milestone Aug 4, 2016
parentListeners.splice(0, parentListeners.length);
workerListeners.splice(0, workerListeners.length);
};
this.id = util.uniqueId();
Contributor Author

bug: this id needs to be unique across map instances, but util.uniqueId() isn't.

@anandthakker
Contributor Author

> If two maps use the same vector tile, how do you avoid double-transferring the StructArrays?

@lucaswoj SHOOT, yes, I completely missed this, and I think it might be a dealbreaker for this approach.

(Here, in trying to handle the race condition of one instance calling load tile while a load is already in flight, I mistakenly dropped the transferables from the callback args, so I never tripped the double-transfer)
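
To illustrate the underlying constraint: once an ArrayBuffer has been transferred, the sender's copy is detached, so the same buffer cannot be handed to a second recipient. The sketch below uses the global structuredClone() as a stand-in for postMessage(message, transferList), which assumes a runtime that provides it (modern browsers, Node 17+). The variable names are illustrative and are not taken from the PR.

```javascript
// Sketch: a transferred ArrayBuffer is detached in the sender.
// Assumption: global structuredClone() is available and stands in for
// postMessage(message, transferList) between a worker and the main thread.
var vertexData = new Float32Array([0, 1, 2, 3]).buffer;
var sizeBefore = vertexData.byteLength; // 4 floats * 4 bytes

// "Send" the buffer to the first map.
var forMapA = structuredClone(vertexData, {transfer: [vertexData]});
var sizeAfter = vertexData.byteLength; // the original is now detached

// Sending the same buffer again is expected to fail, since a detached
// buffer cannot be transferred a second time.
var secondTransferFailed = false;
try {
    structuredClone(vertexData, {transfer: [vertexData]});
} catch (err) {
    secondTransferFailed = true;
}

// One possible workaround: copy the data for the second recipient,
// at the cost of extra memory and copy time.
var copyForMapB = forMapA.slice(0);
```

Copying per recipient gives up much of the benefit of transferables, which is part of why sharing results on the main thread may be a better fit.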

Maybe handling this will be best done by scrapping this approach and starting with something like this suggestion:

> Could we get a similar effect by deduplicating SourceCaches across maps in the main thread? (I recognize this may require some major changes to SourceCache #2291)

Closing this, but I'll dump a summary of the approach I started here, in case any parts of it might be useful for salvage or discussion:

WorkerPool

Each Map instance has a single Dispatcher instance. Workers are provided to the dispatcher via WorkerPool, which is a per-script (and thus almost always per-page) singleton.

Like before, dispatchers access workers by way of Actors, which are what the WorkerPool actually provides to it. I.e.:

                      +-------+    +--------+    +-------+
                   +--+ ACTOR +----+ WORKER +----+ ACTOR +--+
                   |  +-------+    +--------+    +-------+  |
                   |                                        |
                   |  +-------+    +--------+    +-------+  |
+------------+     +--+ ACTOR +----+ WORKER +----+ ACTOR +--+     +------------+
|            |     |  +-------+    +--------+    +-------+  |     |            |
| DISPATCHER +-----+                                        +-----+ DISPATCHER |
|            |     |  +-------+    +--------+    +-------+  |     |            |
+------------+     +--+ ACTOR +----+ WORKER +----+ ACTOR +--+     +------------+
(main thread)      |  +-------+    +--------+    +-------+  |     (main thread)
                   |                                        |
                   |  +-------+    +--------+    +-------+  |
                   +--+ ACTOR +----+ WORKER +----+ ACTOR +--+
                      +-------+    +--------+    +-------+

(This is a logical diagram -- in reality, each ACTOR above is really two actors--one on the main thread and one on the worker thread.)
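
The arrangement in the diagram could be sketched roughly as follows. This is an illustration under stated assumptions, not the PR's code: `getWorkerPool`, `acquireActors`, the pool size of 4, and the injected `createWorker` factory are all hypothetical, and plain objects stand in for real Web Workers so the sketch runs anywhere.

```javascript
// Hypothetical sketch of a per-script WorkerPool singleton shared by dispatchers.
// Assumption: createWorker is injected so no real Web Worker is needed here.
function WorkerPool(workerCount, createWorker) {
    this.workers = [];
    for (var i = 0; i < workerCount; i++) {
        this.workers.push(createWorker(i));
    }
}

// Each dispatcher gets its own actor per worker; the workers themselves are shared.
WorkerPool.prototype.acquireActors = function(dispatcherId) {
    return this.workers.map(function(worker) {
        return {parentId: dispatcherId, worker: worker};
    });
};

var sharedPool = null;
function getWorkerPool() {
    if (!sharedPool) {
        sharedPool = new WorkerPool(4, function(index) {
            return {index: index}; // stand-in for `new Worker(url)`
        });
    }
    return sharedPool;
}

function Dispatcher(id) {
    this.id = id;
    this.actors = getWorkerPool().acquireActors(id);
}

var dispatcherA = new Dispatcher('map-a');
var dispatcherB = new Dispatcher('map-b');
```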

Actor and Worker

Actor remains mostly the same; the only change here is that it accepts a parentId identifying the Dispatcher (and thus Map) it belongs to, and that it passes this parentId along as part of the messages it sends. (The parentId for the worker-side Actor is just null.)
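
A minimal sketch of an Actor that stamps its messages this way. The message field names and the `fakeWorker` stand-in are assumptions for illustration; only the idea of passing parentId along with each message comes from the description above.

```javascript
// Hypothetical sketch of an Actor that tags outgoing messages with a parentId.
// Assumption: `target` only needs postMessage(); field names are illustrative.
function Actor(target, parentId) {
    this.target = target;
    this.parentId = parentId === undefined ? null : parentId; // null on the worker side
    this.callbacks = {};
    this.callbackId = 0;
}

Actor.prototype.send = function(type, data, callback) {
    var id = null;
    if (callback) {
        id = String(this.callbackId++);
        this.callbacks[id] = callback; // looked up again when the reply arrives
    }
    this.target.postMessage({
        parentId: this.parentId,
        type: type,
        id: id,
        data: data
    });
};

// Stand-in for a Worker: records messages instead of crossing threads.
var posted = [];
var fakeWorker = {postMessage: function(message) { posted.push(message); }};

var actorForMapA = new Actor(fakeWorker, 'map-a');
var actorForMapB = new Actor(fakeWorker, 'map-b');
var workerSideActor = new Actor(fakeWorker);

actorForMapA.send('load tile', {uid: 1});
actorForMapB.send('set style', {layers: []});
```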

Changes to Worker are more involved:

  • All target tasks ('load tile', 'set style', etc.) now take a mapId
    parameter. (This is the parentId of the calling Actor; the nomenclature
    here stinks, I know.)
  • set style and update style now take care of keeping track of the style
    layers (and layer families) for multiple Maps. The most straightforward
    way to do this would have been simply to maintain an object hash from
    mapId to the layers and layerFamilies data for that map. However,
    this would fail to improve the redundant VT parsing that was the original
    goal, because VT parsing depends on this style information.
  • Therefore, there's an extra layer of indirection:
    • (a) The worker tracks a mapping from mapId to a "style key"; a style key
      corresponds uniquely to a given map style state, but may be shared in
      common among multiple map instances.
    • (b) layers and layerFamilies are tracked by style key.
  • VectorTileWorkerSource now tracks loading/loaded tiles with [(style key)+(source id)][uid] rather than just [source id][uid].
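
The bookkeeping described in this list could look roughly like the sketch below. `WorkerState`, its method names, and the '/' separator used to join style key and source id are hypothetical; the sketch only shows the shape of the lookups, not actual tile parsing.

```javascript
// Hypothetical sketch of the worker-side bookkeeping described above.
function WorkerState() {
    this.styleKeys = {};      // mapId -> style key
    this.layersByStyle = {};  // style key -> layers
    this.loaded = {};         // (style key + source id) -> uid -> tile
}

WorkerState.prototype.setStyle = function(mapId, styleKey, layers) {
    this.styleKeys[mapId] = styleKey;
    if (!this.layersByStyle[styleKey]) this.layersByStyle[styleKey] = layers;
};

// Returns true if the tile had to be parsed, false if an existing entry was reused.
WorkerState.prototype.loadTile = function(mapId, sourceId, uid) {
    var styleKey = this.styleKeys[mapId];
    var bucket = styleKey + '/' + sourceId;
    var tiles = this.loaded[bucket] || (this.loaded[bucket] = {});
    if (tiles[uid]) return false;
    tiles[uid] = {uid: uid, layers: this.layersByStyle[styleKey]};
    return true;
};

var state = new WorkerState();
state.setStyle('map-a', 'style-1', ['water']);
state.setStyle('map-b', 'style-1', ['water']); // same style as map-a
state.setStyle('map-c', 'style-2', ['roads']); // different style

var parsedForA = state.loadTile('map-a', 'streets', 'tile-7');
var parsedForB = state.loadTile('map-b', 'streets', 'tile-7');
var parsedForC = state.loadTile('map-c', 'streets', 'tile-7');
```

Here map-b reuses the tile parsed for map-a because both resolve to the same style key, while map-c, with a different style, gets its own entry.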

Smaller Changes

  • 2cc58b4
    this just moves the point of bifurcation between node and browser to a
    smaller web_worker.js wrapper, so that dispatcher.js can be a single
    file.
  • https://github.com/mapbox/mapbox-gl-js/pull/2917/files#r72672765 - instead
    of returning a workerID and then using that to retarget the same worker
    that loaded a tile, we use a scheme where the tile's coordinates determine
    its uid, and its uid determines which worker does the work. This lets us
    ensure that two map instances attempt to use the same worker to load a
    given tile.
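
As a rough illustration of the coordinate-derived uid, assuming a string uid format and a simple modulo mapping (both hypothetical, as are the names `tileUid` and `workerIndexFor`):

```javascript
// Hypothetical sketch: derive a tile uid from its coordinates so that every
// map instance computes the same uid, and hence the same worker, for a tile.
// Assumption: the uid format and the modulo mapping are illustrative only.
function tileUid(sourceId, z, x, y) {
    return sourceId + ':' + z + '/' + x + '/' + y;
}

function workerIndexFor(uid, workerCount) {
    var sum = 0;
    for (var i = 0; i < uid.length; i++) {
        sum = (sum + uid.charCodeAt(i)) % workerCount;
    }
    return sum;
}

var uidFromMapA = tileUid('streets', 12, 1171, 1566);
var uidFromMapB = tileUid('streets', 12, 1171, 1566);
var workerForA = workerIndexFor(uidFromMapA, 4);
var workerForB = workerIndexFor(uidFromMapB, 4);
```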

Other Questions

> If a user calls GeoJSONSource#setData, will all maps change?

Right now, I think that's right: GeoJSONSource is still TBD.

> How does this handle the case where two maps use the same vector tile but have different styles (which require different StructArray layouts)?

This, I think, is handled by the "style key" system above.

@mollymerp
Contributor

Aw shucks 😢. Thanks for your work on this nonetheless!

jfirebaugh deleted the global-worker-pool branch February 3, 2017 18:02