faster getpoint #369

rafaqz · 2023-02-22T21:09:04Z

getpoint is currently pretty slow, to the point that rasterizing a shapefile loaded with ArchGDAL.jl is 20x slower than one loaded with Shapefile.jl, mostly because getting points allocates every time.

This PR greatly (but not completely) reduces the gap by:

Not returning GDAL points. They are pretty inefficient when we have thousands of points. A Tuple is better.
Preallocating Refs for getpoint! in the iterator for GeoInterface.getpoint(linestring), and reusing them for each point.
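
For illustration only, here is a minimal sketch of the two ideas above in plain Julia. It is not the code from this PR: `fill_point!` and `points` are hypothetical names, and `fill_point!` merely stands in for a C-style accessor that writes into caller-provided out-parameters. The sketch assumes 2D coordinates stored flat as `[x1, y1, x2, y2, ...]`.

```julia
# Hypothetical stand-in for a C-style accessor that writes the i-th point
# into caller-provided out-parameters (NOT the ArchGDAL API).
function fill_point!(x::Ref{Float64}, y::Ref{Float64}, coords::Vector{Float64}, i::Int)
    x[] = coords[2i - 1]
    y[] = coords[2i]
    return nothing
end

# Collect all points, allocating the two Refs once and reusing them for
# every point, and returning each point as a Tuple.
function points(coords::Vector{Float64})
    n = length(coords) ÷ 2
    x = Ref{Float64}(0.0)   # allocated once, outside the loop
    y = Ref{Float64}(0.0)
    return map(1:n) do i
        fill_point!(x, y, coords, i)
        (x[], y[])          # plain Tuple of Float64s, no wrapper object per point
    end
end

pts = points([0.0, 1.0, 2.0, 3.0, 4.0, 5.0])
```

The point of the pattern is that the per-point work no longer creates a fresh Ref or a wrapper object on each iteration; only the output vector of Tuples is allocated.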

This is implemented for line string but that means polygon will also work. It's probably not worth the effort of passing the Ref through from polygons, but I haven't checked. There may be more objects that need this treatment.

Edit: I assumed this would have a test already, but it seems not.

evetion · 2023-02-24T20:20:06Z

I was slightly hesitant about changing the return type here, but I think this still fits as it qualifies as a point.

We could better document that we are essentially changing GeoInterface.coordinates to return a nested Vector of Tuples instead of a nested Vector of Vectors. That has an impact on some construction methods as well.

rafaqz · 2023-02-24T21:54:25Z

Yeah, I tried to optimise the GDAL point first, but the overhead of allocating them all individually is just really bad.

Once all the methods here accept any point it won't make much difference either.

And yes, I've noticed there are still a lot of nested vectors around. It would be a good performance gain to swap them all to tuples.

rafaqz · 2023-02-24T22:49:33Z

Another thought I had while writing this is that it would be faster to use getx, gety etc., because they don't need allocations at all.

But that assumes traditional point order, so I went with getpoint!.

yeesian

I'd err on the side of assuming this is a breaking/major change.

There may be more objects that need this treatment. [...] Once all the methods here accept any point it won't make much difference either. [...] And yes I've noticed there's still a lot of nested vector around. It would be a good performance gain to swap them all to tuples.

Can we do it all within a single PR? I don't think it'd be a good idea to have inconsistencies in return types across geometry types.

src/geointerface.jl

rafaqz · 2023-02-25T16:21:00Z

I guess it's a breaking change? But calling GeoInterface methods returns a lot of types of points already, and it doesn't promise what kind of point comes from what parent object.

I also doubt it will actually break much.

We should actually put this on getgeom and I guess it's just multipoint missing? I'm not sure that will be faster because it's accessed differently.

yeesian

LGTM because of #369 (comment) --

I was slightly hesitant about changing the return type here, but I think this still fits as it qualifies as a point.

Will get a second opinion from @evetion nevertheless, just in case there's any other context from GeoInterface I might be missing.

evetion · 2023-03-31T11:39:39Z

Do we have a round-trip test? I.e. ArchGDAL.createpoint(GeoInterface.getpoint(...))? Similarly for GeoInterface.coordinates, if this PR changes that.

rafaqz · 2023-04-23T15:16:44Z

It doesn't touch coordinates. But yes, a round trip on points would be good. I think that should just work already.

Edit: a tuple is just a normal input for createpoint, and that's already tested.

Edit2: I added round-trip tests for LinearRing and LineString.

visr

Nice speedup, looks good to me!

yeesian self-requested a review February 23, 2023 23:37

yeesian requested changes Feb 25, 2023

yeesian previously approved these changes Mar 30, 2023

yeesian requested a review from evetion March 30, 2023 23:44

rafaqz mentioned this pull request Apr 27, 2023

Load a GeoPackage dataset JuliaEarth/GeoTables.jl#13

Closed

rafaqz mentioned this pull request May 5, 2023

Zonal function - TaskFailedException rafaqz/Rasters.jl#438

Closed

rafaqz added 3 commits May 6, 2023 00:49

faster getpoint

0229f66

bugfix

b82574c

fix comment

2c033f3

rafaqz force-pushed the faster_getpoint branch from b81774a to 2c033f3 Compare May 5, 2023 22:55

getgeom rather than getpoint

8e721b0

rafaqz dismissed yeesian’s stale review via 8e721b0 May 5, 2023 23:28

rafaqz added 3 commits May 6, 2023 01:29

bugfix

dd593a9

fix multipoint

4961e40

round trips

9bda012

rafaqz force-pushed the faster_getpoint branch from 379d481 to 9bda012 Compare May 6, 2023 00:56

visr approved these changes May 6, 2023

yeesian approved these changes May 6, 2023

yeesian merged commit 3997df4 into yeesian:master May 6, 2023

henrik-wolf mentioned this pull request Mar 19, 2024

GeoInterface.distance breaks with new return type of GeoInterface.getgeom #419

Open
