Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Avoid clipping overhead when geometry is fully within clipping box #126

Closed
3 tasks
springmeyer opened this issue Jun 25, 2015 · 9 comments
Closed
3 tasks
Milestone

Comments

@springmeyer
Copy link
Contributor

This is an obvious optimization, but was not possible before the new mapnik::geometry structure. Now it should be easy to check the bbox of a geometry (or even a single part of a multigeometry) and skip running it through clipping when not needed.

This should help performance significantly when rendering dense data at low zoom levels (when the majority of the features are not getting clipped even when run through the clipping pathway).

Steps toward this:

  • write a benchmark harness
  • add some sample data that represents cases that would benefit from this optimization and sample data that would not (like clipping large polygons that span the whole earth no matter the zoom level)
  • optimize appropriately for each geometry type
@flippmoke
Copy link
Member

I am worried about the added cost of checking a bounding box prior to clipping being an added time for situations where data does need to be clipped. Currently the extent is not calculated in the 'Integer Tile Domain' so we would have to iterate through all points to calculate the envelope. This would be additional processing that would slow down tiles that are not entirely enveloped by the bounding box of the clip.

We calculate the original data's extent here:
https://github.com/mapbox/mapnik-vector-tile/blob/master/src/vector_tile_processor.ipp#L698

We could perhaps use this extent to limit what geometry is reprojected into the 'Integer Tile Domain' in the case of multipolygons and multi-lines, and geometry collections, however, this runs the risk of throwing out points that might be introduced into the clipping domain based on simplification. Granted this is a small risk, but it might be possible to have one tile missing the segment of a line that was simplified in other geometries. Not certain if this is more then just a concern I can dream up.

@flippmoke
Copy link
Member

Notes after talking with @springmeyer on this on a call:

  • We've got some good testdata we are gathering to build benchmarks.
  • A test specifically of the clipping function should probably be made in C++ in order to full determine where what time is being spent in that process. Benchmarks of the clipping function would give us a better idea of where we could improve performance.
  • If an envelope calculation is done, it may reduce the need for clipping, but this calculation could take as much time as some of the clipping steps, such as the area calculation, could these loops be combined or the processing be optimized
  • Determine properly what steps can be skipped with out clipping, we know that the cleanpolygon method does remove some data from Make CleanPolygon call configurable/disable-able  #131, does this mean this should still be done even if the polygon is completely with in the clipping boundary. There is also the issue of reprojection of data into the integer domain can cause winding order issues among other problems. The union at the end of processing could be important as well for removing bad or invalid data, is this something that should be skipped? These all should probably be answered by tests prior to implementing any changes.

For these reasons and the time required to answer these that this has been removed from the 0.8.5 milestone and is moved to the 0.8.6 milestone.

@flippmoke
Copy link
Member

After working on #135 and thinking more the biggest concerns I have about this change are the following:

  • Currently reprojection into the vector tile domain is known to cause reversing of winding order. This is not due the coordinate projection of the map, only due to rounding. This would still have to be done and checked even if no clipping was required.
  • Area threshold is a parameter currently exposed and it would have to be respected and therefore this calculation would still have to be performed even if the geometry was completely within the bounding box.
  • Currently clean polygon is run, this helps remove invalid geometries from being passed into vector tiles (such as self intersections). The same is true for the union operation which currently helps prevent holes in geometries from making the geometry invalid.

Due to these requirements I am not certain it is a time saver to skip clipping on polygon / polygon parts completely with in the bounds. Currently there is no bounding box calculation in the integer space of the vector tile coordinates therefore, this might not be a worthwild extra step.

@springmeyer springmeyer modified the milestones: v0.8.7, v0.8.6, v0.9.2, v0.9.1 Jul 30, 2015
@springmeyer
Copy link
Contributor Author

Due to these requirements I am not certain it is a time saver to skip clipping on polygon / polygon parts completely with in the bounds.

I'm much less doubtful. I think there is still an obvious optimization here. Code challenges aside my logic is this:

  • Clipping is the most expensive operation of all of the work done during encoding (more expensive than reprojection, rounding to int, simplification, etc)
  • Running complex polygons through the clipper that are fully within the clipping box means that you are going to get basically the same geometry out as you put in (code challenges around area calcs and cleaning aside)
  • Therefore clipping polygons that don't need clipped is extremely wasteful.
  • Therefore not requiring this work is an obvious optimization opportunity.

@springmeyer springmeyer modified the milestones: v0.9.3, v0.9.2, v0.9.4 Aug 11, 2015
@springmeyer
Copy link
Contributor Author

@flippmoke let's revisit this ticket. From a performance standpoint we can gain from this. And from a testing standpoint this has increasing value. So, as we are now close to having tests in place for all the clipper options that can be considered (and called) independently of actual clipping (CleanGeometry, StrictlySimple, and unioning) I think we should revisit moving on this. In my profiling I've consistently seen CleanGeometry and StrictlySimple taking no more than 2-4 % of total time, while clipping often takes > 30-40% of time. When that is happening on geometries fully within the bounds of the tile that is enormously wasteful of CPU. So my thinking is that:

  • If mp union is unneeded we can likely skip clipping overhead completely in these cases and have a code path that just implements CleanGeometry and StrictlySimple on their own. Ideally on mapnik::geometry but even the worse case of having to call AddPath will still not be as slow as wasteful clipping.
  • If the user does absolutely need mp union then we'll need to call AddPath and invoke clipper.Execute with union rather than intersection. The question then will be if this is just as slow as clipping with intersection or if it is significantly less overhead. The answer will help use determine the value of this option and its best default - refs Test polygon decoding and encoding options #165 (comment)

@flippmoke
Copy link
Member

@springmeyer problem here is that strictly_simple and its algorithm are a part of the clipping algorithm.

void SimplifyPolygon(const Path &in_poly, Paths &out_polys, PolyFillType fillType)
{
  Clipper c;
  c.StrictlySimple(true);
  c.AddPath(in_poly, ptSubject, true);
  c.Execute(ctUnion, out_polys, fillType, fillType);
}

@springmeyer
Copy link
Contributor Author

@flippmoke my understanding from talking to you earlier is that you said this is not a blocker because StrictlySimple like CleanGeometry can be called on paths instead of run as a part of Execute as per: http://www.angusj.com/delphi/clipper/documentation/Docs/Units/ClipperLib/Functions/SimplifyPolygons.htm

@springmeyer springmeyer modified the milestones: v1.5.0, v0.9.4 Sep 10, 2017
@springmeyer
Copy link
Contributor Author

Now that clipping is done separately (via mapbox::geometry::wagyu::quick_clip::quick_lr_clip) from geometry cleaning (via mapbox::geometry::wagyu::wagyu::execute(mapbox::geometry::wagyu::clip_type_union) I again think it is worthwhile to revisit this.

We should:

  • Check the bbox of geometries before clipping
  • If they fall completely within the tile extent when we can avoid the overhead of clipping and the copy made of the geometry.
  • And then the unclipped geometries can be added directly into the polygon cleaning/unioning of wagyu::execute

I predict that this could help performance with low zoom tiles with lots of dense data. It of course will not likely help with higher zoom tiles that only interest data that always crosses tile extents.

refs

auto new_ring = mapbox::geometry::wagyu::quick_clip::quick_lr_clip(ring, b);
if (new_ring.empty()) {
if (process_all_rings_) {
continue;
}
return;
}
clipper.add_ring(new_ring);

/cc @artemp

@flippmoke
Copy link
Member

If this is faster, we should be putting this code directly into wagyu and not here. I have doubts about the ability for it to increase performance on the aggregate but created mapbox/wagyu#85 to investigate this at some point.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants