Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Optimize QuantizedMeshTerrainData.interpolateHeight #7508

Merged
merged 6 commits into from
Jan 25, 2019

Conversation

shehzan10
Copy link
Member

QuantizedMeshTerrainData.interpolateHeight iterates through all triangles in the tile to find the triangle which contains the point.

To do this, the function was computing barycentric coordinates for each and every triangle, which is a lot of computation when done for the entire mesh.

This optimization introduces a check for point in triangle using ray casting algorithm that has early exit conditions, thus the amount of floating-point operations is reduced significantly.

I've seen speed ups of upto 5x when using interpolateHeight. This will also speed up sampleTerrain, sampleTerrainMostDetailed and other clamp to ground functions.

@cesium-concierge
Copy link

cesium-concierge commented Jan 24, 2019

Thanks for the pull request @shehzan10!

  • ✔️ Signed CLA found.
  • CHANGES.md was not updated.
    • If this change updates the public API in any way, please add a bullet point to CHANGES.md.

Reviewers, don't forget to make sure that:

  • Cesium Viewer works.
  • Works in 2D/CV.
  • Works (or fails gracefully) in IE11.

@shehzan10 shehzan10 force-pushed the optimize-interpolateHeight branch from 067113c to 7b4b727 Compare January 24, 2019 16:26
@shehzan10
Copy link
Member Author

@likangning93 Do you want to do the initial review?

@shehzan10
Copy link
Member Author

shehzan10 commented Jan 24, 2019

Just as a record, I've tried further "enhancing" the early exit conditions, but these have all been slower than the committed code by about 10% (probably because of the extra assignment?).

An example is:

function isPointInsideUVTriangle(u, v, u0, v0, u1, v1, u2, v2) {
    var inside = false;
    var intersect = false;

    var checkV01 = ((v0 > v) !== (v1 > v));
    var checkV12 = ((v1 > v) !== (v2 > v));
    var checkV20 = ((v2 > v) !== (v0 > v));

    if (checkV01) {
        intersect = (u < (u1 - u0) * (v - v0) / (v1 - v0) + u0);
        if (intersect) {
            inside = !inside;
        }
    }

    if (checkV12) {
        intersect = (u < (u2 - u1) * (v - v1) / (v2 - v1) + u1);
        if (intersect) {
             inside = !inside;
        }
    }

    if (checkV20) {
        intersect = (u < (u0 - u2) * (v - v2) / (v0 - v2) + u2);
        if (intersect) {
            inside = !inside;
        }
    }

    return inside;
}

@likangning93 likangning93 self-requested a review January 24, 2019 18:05
@likangning93
Copy link
Contributor

likangning93 commented Jan 24, 2019

I think I get it! For me, the intuition was as follows:

The formulas that look like this:

u < (u1 - u0) * (v - v0) / (v1 - v0) + u0

can be derived from:

(u - u0) / (u1 - u0) < (v - v0) / (v1 - v0)

This looks kind of like how you'd compute a parametric description of the 0 -> 1 line.
Taking the ((v0 > v) !== (v1 > v)) term to guarantee that v0 < v < v1, we can draw something like:

_______________________________________________1   _
  Area where expression is true.              /    |
  Note that (u - u0) / dU === (v - v0) / dV  /     dV
  along the diagonal line                   /      |
-------------------------------------------0      ---
                                           |-dU-|

Since we assume consistent winding order, the 1 -> 2 and 2 -> 0 lines should look similar, just rotated counterclockwise so the areas where the expression is true overlap to form the triangle.

@likangning93
Copy link
Contributor

The one question I have is whether or not this is faster in minified Cesium. Most references I've seen, like http://blackpawn.com/texts/pointinpoly/, claim that the barycentric method is quite fast, so I'm wondering if maybe our implementation seems slower because of the extra overhead from all the defined checks and whatnot in Intersections2D.computeBarycentricCoordinates.

@likangning93
Copy link
Contributor

I also haven't convinced myself yet that we don't run the risk of algorithm differences causing points to pass this method but then produce NaNs and infinities when computing barycentric coordinates. It feels safer to use a single method if that's viable.

@mramato
Copy link
Contributor

mramato commented Jan 24, 2019

The one question I have is whether or not this is faster in minified Cesium. Most references I've seen, like http://blackpawn.com/texts/pointinpoly/, claim that the barycentric method is quite fast, so I'm wondering if maybe our implementation seems slower because of the extra overhead from all the defined checks and whatnot in Intersections2D.computeBarycentricCoordinates.

This is an excellent point, @shehzan10 please profile against minified Cesium as well. if defined is overhead in this case, we might want to re-org some things to avoid is (since Node.js applications do not use the minified for example).

@mramato
Copy link
Contributor

mramato commented Jan 24, 2019

To add to @likangning93's question, I think it's time we figure out how to allow for the minified Cesium.js in node apps because I think it will benefit performance quite a bit.

@shehzan10
Copy link
Member Author

shehzan10 commented Jan 24, 2019

The one question I have is whether or not this is faster in minified Cesium. Most references I've seen, like http://blackpawn.com/texts/pointinpoly, claim that the barycentric method is quite fast, so I'm wondering if maybe our implementation seems slower because of the extra overhead from all the defined checks and whatnot in Intersections2D.computeBarycentricCoordinates.

Instead of running with minified Cesium (since I'm testing with a node script), I commented out the checks in computeBarycentricCoordinates. This did speed up the code by ~2x, but not close enough to the ~5x we're getting with point in triangle check. Here's my benchmark:

master: 10250ms
commented out barycentric coordinates checks: 4650ms
commented out barycentric coordinates checks with this optimization: 1920ms
only this optimization (barycentric coordinates checks are done): 1950ms

While there is a significant improvement in minified version, the gains from this optimization is obvious.

Edit: Just wanted to talk a bit more about the benchmark. This benchmark was done on CWT near Grand Canyon Village (tile 14/6176/11474). I constructed a grid of 256x256 points within the tile and the sampled those for heights. So it covers best, average and worst case points.

@shehzan10
Copy link
Member Author

I also haven't convinced myself yet that we don't run the risk of algorithm differences causing points to pass this method but then produce NaNs and infinities when computing barycentric coordinates. It feels safer to use a single method if that's viable.

This is an established algorithm for checking if a point is in a polygon, simplified for triangles. If it helps, here is another library doing a more general version of this algorithm https://github.com/substack/point-in-polygon.

@shehzan10
Copy link
Member Author

Just found the correct algorithm link: https://wrf.ecse.rpi.edu//Research/Short_Notes/pnpoly.html

@mramato
Copy link
Contributor

mramato commented Jan 24, 2019

While there is a significant improvement in minified version, the gains from this optimization is obvious.

I'm going to open a separate PR to enable use of minified version so that Node apps can take advantage of the performance improvement it brings.

@@ -74,7 +74,7 @@ defineSuite([
});

it('works for a dodgy point right near the edge of a tile', function() {
var positions = [new Cartographic(0.33179290856829535, 0.7363107781851078)];
var positions = [new Cartographic(0.33179290856829535, 0.7363107781851076)];
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The slight change in this spec is kind of scary. Does the spec fail without the change? Do we know why the previous implementation needed this spec to pass as-is?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Previous implementation was added here: 22ff240.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this spec fail if the value is reverted? The linked commit seems to suggest that it was important for this spec to pass with the given value.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like the clamp behavior is preserved in the current changes too, which may suggest that there are some positions that were inside the triangle as-per the barycentric method but not according to the raycasting method.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with @likangning93, changing this spec here is the wrong thing to do. It was put here specifically because we were seeing corner cases fail.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here's a thought: can we make the raycast check more permissive than the barycentric method but still use the barycentric method for points that pass the raycast check?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just ran a small test and it looks like the triangle isn't checking for points on the triangle. I'm going to look into how we can include this.

Copy link
Member Author

@shehzan10 shehzan10 Jan 24, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@likangning93 I reverted the spec change and checked for points on the triangle. With that change the benchmark changed from 1920ms to 2500ms (ouch). Still 4x faster.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here's a thought: can we make the raycast check more permissive than the barycentric method but still use the barycentric method for points that pass the raycast check?

I don't think it would hurt, because technically it will only be done for one triangle. But then again, it is redundant. The check is stringent enough to not allow points outside the triangle from getting int.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think it's that redundant, it's similar to using a BVH.
The problem is that the raycast-based check has different tolerances than the barycentric check, but the barycentric check has been calibrated for corner cases that we'd need to re-evaluate.
Making the raycast check slightly more permissive would eliminate the urgency around those corner cases because they'd still be handled by the barycentric check.

@@ -93,7 +93,7 @@ defineSuite([
});

it('works for a dodgy point right near the edge of a tile', function() {
var positions = [new Cartographic(0.33179290856829535, 0.7363107781851078)];
var positions = [new Cartographic(0.33179290856829535, 0.7363107781851076)];
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as above.

@shehzan10
Copy link
Member Author

We may have to give up on this altogether. The algorithm is quite solid for points inside the triangle, but does not account for points on the triangle edges. The check I added in 6355a84 is not fool-proof and can result in double positives turning into negatives.

To add an on-edge check for the 3 sides of the triangle for each case that fails to be inside the triangle is too expensive and ends up being as slow as the original method.

@shehzan10
Copy link
Member Author

shehzan10 commented Jan 24, 2019

We may have to give up on this altogether. The algorithm is quite solid for points inside the triangle, but does not account for points on the triangle edges. The check I added in 6355a84 is not fool-proof and can result in double positives turning into negatives.

The one thing that might make this work is if we add an epsilon to the triangle vertices, making it very slightly larger, and then checking with barycentric coordinates like @likangning93 suggested. This will make computeBarycentricCoordinates run more than the one time, but also guarantee that on-edge points are correctly found.

Edit: nevermind, I may have spoken too soon. "Adding an epsilon" depends on a direction which will get complicated again.

@shehzan10
Copy link
Member Author

The fix I pushed with the scaling is running in 7000ms. Commenting out the computeBarycentricCoordinates checks doesn't improve the performance since it's being hit minimal number of times.

So probably the best thing to do is to close this PR and enable use of minified version, which is faster than the scaling fix. :-(

@likangning93
Copy link
Contributor

@shehzan10 thanks for investigating. One more thought, can we get similar benefits with a simpler but even more permissive culling check? Perhaps a bounding box check instead of a raycasting? It seems like that could still "cull" a majority of triangles in a typical query from having barycentric coordinates computed.

@mramato
Copy link
Contributor

mramato commented Jan 25, 2019

@shehzan10 can you confirm that #7514 really does provide you a speed-up here?

@shehzan10
Copy link
Member Author

@shehzan10 can you confirm that #7514 really does provide you a speed-up here?

Yes, it does.

NODE_ENV=development node ./getHeightBench.js                                                                                                                                                                                     
Using development mode // I added a comment in index.js to confirm
World Terrain Created
test: 10119.665ms
done

node ./getHeightBench.js
Using production mode // I added a comment in index.js to confirm
World Terrain Created
test: 4803.156ms
done

@shehzan10
Copy link
Member Author

shehzan10 commented Jan 25, 2019

@shehzan10 thanks for investigating. One more thought, can we get similar benefits with a simpler but even more permissive culling check? Perhaps a bounding box check instead of a raycasting? It seems like that could still "cull" a majority of triangles in a typical query from having barycentric coordinates computed.

I tested this out and definitely faster in development mode, with a smaller advantage in production.

Instead of using the full BoundingRectangle, I used a simpler version that is more optimized for this use case with min/max. Using the BoundingRectangle made the "optimization" 4x slower in development mode.

Edit: This is also faster than using "slightly larger triangle" with the raycast test.

@likangning93
Copy link
Contributor

likangning93 commented Jan 25, 2019

Thanks @shehzan10! Code changes look good, and from offline discussion, bounding box check looks to still be ~2x better than master in production, which is still a nice win even if it isn't quite as fast as raycast.

@likangning93 likangning93 merged commit 906adcc into CesiumGS:master Jan 25, 2019
@shehzan10
Copy link
Member Author

Thanks for helping with the triage and ideas. Couldn't get it to be as fast as I'd initially hoped, but still better than it was before.

@shehzan10 shehzan10 deleted the optimize-interpolateHeight branch January 25, 2019 16:17
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants