Spatial Join Performance

This code compares four ways of performing a spatial join between a population dataset and the county polygons of the USA:

  1. A naive double loop.
  2. Using a cheap county envelope pre-check.
  3. Checking a state's envelope first, then checking that state's counties.
  4. Using an RTree of the county polygons.
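
To make the difference between (1) and (2) concrete, here is a minimal sketch using only the standard library. It is not the code in this repo: the names `Point`, `Envelope`, `shell_contains`, `naive_join`, and `envelope_join` are hypothetical, and the point-in-polygon test is a plain ray-casting check that, as an assumption, looks only at a polygon's outer shell.

```rust
/// A point as (x, y), e.g. (longitude, latitude). Hypothetical type for this sketch.
type Point = (f64, f64);

/// Axis-aligned bounding box (envelope) of a polygon shell.
#[derive(Debug, Clone, Copy)]
struct Envelope {
    min_x: f64,
    min_y: f64,
    max_x: f64,
    max_y: f64,
}

impl Envelope {
    /// Smallest envelope covering every vertex of `shell`.
    fn of(shell: &[Point]) -> Envelope {
        let mut env = Envelope {
            min_x: f64::INFINITY,
            min_y: f64::INFINITY,
            max_x: f64::NEG_INFINITY,
            max_y: f64::NEG_INFINITY,
        };
        for &(x, y) in shell {
            env.min_x = env.min_x.min(x);
            env.min_y = env.min_y.min(y);
            env.max_x = env.max_x.max(x);
            env.max_y = env.max_y.max(y);
        }
        env
    }

    /// Cheap check: four comparisons, no matter how many vertices the polygon has.
    fn contains(&self, (x, y): Point) -> bool {
        x >= self.min_x && x <= self.max_x && y >= self.min_y && y <= self.max_y
    }
}

/// Ray-casting point-in-polygon test against the outer shell only.
/// Cost grows with the number of vertices in `shell`.
fn shell_contains(shell: &[Point], (px, py): Point) -> bool {
    if shell.is_empty() {
        return false;
    }
    let mut inside = false;
    let mut j = shell.len() - 1;
    for i in 0..shell.len() {
        let (xi, yi) = shell[i];
        let (xj, yj) = shell[j];
        // Toggle when a horizontal ray from the point crosses edge (j, i).
        if (yi > py) != (yj > py) && px < (xj - xi) * (py - yi) / (yj - yi) + xi {
            inside = !inside;
        }
        j = i;
    }
    inside
}

/// Method 1: run the full test for every (point, polygon) pair.
fn naive_join(points: &[Point], shells: &[Vec<Point>]) -> Vec<Option<usize>> {
    points
        .iter()
        .map(|&p| shells.iter().position(|s| shell_contains(s, p)))
        .collect()
}

/// Method 2: run the full test only when the point falls inside the envelope.
fn envelope_join(points: &[Point], shells: &[Vec<Point>]) -> Vec<Option<usize>> {
    let envelopes: Vec<Envelope> = shells.iter().map(|s| Envelope::of(s)).collect();
    points
        .iter()
        .map(|&p| {
            shells
                .iter()
                .zip(&envelopes)
                .position(|(s, e)| e.contains(p) && shell_contains(s, p))
        })
        .collect()
}
```

Both functions return, for each point, the index of the first polygon that contains it. The envelope version still visits every polygon, but most pairs are rejected after four comparisons instead of a walk over every vertex.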

On my machine, (1) took 652.1 s, (2) took 13.8 s, (3) took 3.4 s, and (4) took 1.3 s.
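
As a rough reading of those timings, the sketch below divides the brute-force time by each of the others. The helper `speedups` is a hypothetical name for illustration and is not part of this repo.

```rust
/// Speedup of each timing relative to the first (baseline) entry.
/// Returns an empty vector when no timings are given.
fn speedups(seconds: &[f64]) -> Vec<f64> {
    match seconds.first() {
        Some(&baseline) => seconds.iter().map(|&s| baseline / s).collect(),
        None => Vec::new(),
    }
}
```

Calling `speedups(&[652.1, 13.8, 3.4, 1.3])` gives factors of roughly 1, 47, 192, and 502, so each refinement after the envelope pre-check gains a further factor of about 3 to 4.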

These results are not deeply rigorous, nor are the algorithms particularly optimized. Additionally, only the outer shells of polygons are used; holes are ignored completely. The measurements are only intended to give order-of-magnitude results.

To run the performance measurements

  1. Install Rust.
  2. Clone this repo.
  3. In the root directory, run cargo build --release && target/release/presto_spatial_join_blog

The brute-force calculation can take over 10 minutes, so watch a video from Lessons from the Screenplay while you wait.

Acknowledgements

Population centers come from Facebook's Population Density Maps.

County and State geojson files come from Eric Celeste, who sourced the data from the US Census Bureau.