Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

GridGeoSampler: don't change stride of last patch #1329

Merged
merged 2 commits into from
May 14, 2023
Merged

Conversation

adamjstewart
Copy link
Collaborator

Partially reverts #630. See #1245 for motivation.

@remtav

@adamjstewart adamjstewart added this to the 0.4.2 milestone May 12, 2023
@github-actions github-actions bot added samplers Samplers for indexing datasets testing Continuous integration testing labels May 12, 2023
@isaaccorley
Copy link
Collaborator

isaaccorley commented May 12, 2023

Are we sure this works? I thought that to read data including those outside the bounds of an image we have to pass boundless=True like

dest = src.read(
   indexes=band_indexes,
   out_shape=out_shape,
   window=from_bounds(*bounds, src.transform),
   boundless=True
)

@adamjstewart
Copy link
Collaborator Author

adamjstewart commented May 13, 2023

Good catch! Without boundless=True, rasterio still gives us an image with the correct shape because we pass in out_shape, but it does so by upsampling the image to match the dimensions. This is why the tests weren't failing. We want it to give nodata pixels instead. Let me fix this.

@github-actions github-actions bot added the datasets Geospatial or benchmark datasets label May 13, 2023
@@ -487,6 +487,7 @@ def _merge_files(
indexes=band_indexes,
out_shape=out_shape,
window=from_bounds(*bounds, src.transform),
boundless=True,
)
else:
dest, _ = rasterio.merge.merge(
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

merge has no boundless parameter, but doesn't need one. I confirmed that it properly samples outside the bounds of the file and uses nodata pixels for padding.

@adamjstewart adamjstewart merged commit 4b9a6d7 into main May 14, 2023
@adamjstewart adamjstewart deleted the fixes/grid-bounds branch May 14, 2023 15:59
@adamjstewart
Copy link
Collaborator Author

adamjstewart commented May 16, 2023

This introduces a serious bug. Changing the CRS of a dataset results in the following error when loading data:

Traceback (most recent call last):
  File "/Users/Adam/torchgeo/test.py", line 11, in <module>
    ds[ds.bounds]
  File "/Users/Adam/torchgeo/torchgeo/datasets/geo.py", line 899, in __getitem__
    samples = [ds[query] for ds in self.datasets]
  File "/Users/Adam/torchgeo/torchgeo/datasets/geo.py", line 899, in <listcomp>
    samples = [ds[query] for ds in self.datasets]
  File "/Users/Adam/torchgeo/torchgeo/datasets/geo.py", line 443, in __getitem__
    data = self._merge_files(filepaths, query, self.band_indexes)
  File "/Users/Adam/torchgeo/torchgeo/datasets/geo.py", line 486, in _merge_files
    dest = src.read(
  File "rasterio/_warp.pyx", line 1233, in rasterio._warp.WarpedVRTReaderBase.read
ValueError: WarpedVRT does not permit boundless reads

The fix is to use rasterio.merge.merge for all I/O instead.

@adamjstewart adamjstewart mentioned this pull request May 17, 2023
2 tasks
@adamjstewart adamjstewart modified the milestones: 0.4.2, 0.5.0 Sep 28, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
datasets Geospatial or benchmark datasets samplers Samplers for indexing datasets testing Continuous integration testing
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants