Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ENH: add zonal_stats #52

Merged
merged 88 commits into from
Dec 14, 2023
Merged
Show file tree
Hide file tree
Changes from 87 commits
Commits
Show all changes
88 commits
Select commit Hold shift + click to select a range
2a2e504
example
masawdah Jul 20, 2023
e1e0501
assign the pixels ids based on their polygons then group them using t…
masawdah Jul 20, 2023
4a3d144
fix importing xagg and run an example
masawdah Jul 21, 2023
e3a2443
xagg using openEO data , it needs to implement parallel processing to…
masawdah Jul 21, 2023
d48a019
clean up
masawdah Sep 22, 2023
4ee3653
spatial_aggregation
masawdah Sep 25, 2023
5c008a5
clean up
masawdah Sep 25, 2023
9e5e462
rioxarray
masawdah Sep 25, 2023
35121fe
packages
masawdah Sep 25, 2023
15c04ca
deps
masawdah Sep 25, 2023
5c2a97e
deps
masawdah Sep 25, 2023
f732a4a
deps
masawdah Sep 25, 2023
6fab486
fix the notes
masawdah Sep 27, 2023
fb1e4a1
zonal stats in pytest
masawdah Sep 27, 2023
6f2dbab
handle long lines
masawdah Sep 27, 2023
ce18648
handle long lines
masawdah Sep 27, 2023
4197063
fix pre-commit
masawdah Sep 27, 2023
b291d9f
updated files
masawdah Sep 27, 2023
64a976f
support lat&lon names
masawdah Sep 28, 2023
ae9ece0
fix the dimension names on the fly
masawdah Sep 28, 2023
5b808e0
testpy
masawdah Sep 28, 2023
ce594ba
dask support just axis as integer
masawdah Sep 28, 2023
bc2abeb
Dimension Mapping
masawdah Sep 29, 2023
440b698
updated files
masawdah Sep 29, 2023
0a3d081
[pre-commit.ci] pre-commit autoupdate (#53)
pre-commit-ci[bot] Oct 3, 2023
718c414
MAINT: test on Python 3.12, update actions, use ruff format (#54)
martinfleis Oct 27, 2023
2ab71cf
Update xvec/accessor.py
masawdah Oct 28, 2023
7590f58
Update xvec/accessor.py
masawdah Oct 28, 2023
9ee0df3
Update xvec/accessor.py
masawdah Oct 28, 2023
5add89e
Update xvec/accessor.py
masawdah Oct 28, 2023
4230333
clean uo
masawdah Oct 28, 2023
f485243
clean accessor
masawdah Oct 28, 2023
cafa811
remove dask option
masawdah Oct 28, 2023
29d522e
clean
masawdah Oct 28, 2023
8bb8a1f
internal funcs
masawdah Oct 28, 2023
5d37d9d
pytest
masawdah Oct 28, 2023
81343dc
clean up
masawdah Oct 28, 2023
5f125a3
clean up
masawdah Oct 28, 2023
5122dbd
example
masawdah Jul 20, 2023
de711b5
assign the pixels ids based on their polygons then group them using t…
masawdah Jul 20, 2023
e214e96
fix importing xagg and run an example
masawdah Jul 21, 2023
3a0b15b
xagg using openEO data , it needs to implement parallel processing to…
masawdah Jul 21, 2023
2bc11b6
clean up
masawdah Sep 22, 2023
5edcc20
spatial_aggregation
masawdah Sep 25, 2023
cfcf9da
clean up
masawdah Sep 25, 2023
593607c
rioxarray
masawdah Sep 25, 2023
d6c4046
packages
masawdah Sep 25, 2023
100f92d
deps
masawdah Sep 25, 2023
7e2ee33
deps
masawdah Sep 25, 2023
4541184
deps
masawdah Sep 25, 2023
1f4bdb5
fix the notes
masawdah Sep 27, 2023
d566fca
zonal stats in pytest
masawdah Sep 27, 2023
b786ec8
handle long lines
masawdah Sep 27, 2023
f7703fe
handle long lines
masawdah Sep 27, 2023
8e6c2d5
fix pre-commit
masawdah Sep 27, 2023
84e327e
updated files
masawdah Sep 27, 2023
df466fa
support lat&lon names
masawdah Sep 28, 2023
53ef66a
fix the dimension names on the fly
masawdah Sep 28, 2023
2a33ce5
testpy
masawdah Sep 28, 2023
d1fb637
dask support just axis as integer
masawdah Sep 28, 2023
ed94bf9
Dimension Mapping
masawdah Sep 29, 2023
75f71df
updated files
masawdah Sep 29, 2023
6a531ac
clean uo
masawdah Oct 28, 2023
de4d36c
clean accessor
masawdah Oct 28, 2023
4c91689
remove dask option
masawdah Oct 28, 2023
a73220d
clean
masawdah Oct 28, 2023
040073d
internal funcs
masawdah Oct 28, 2023
848236d
pytest
masawdah Oct 28, 2023
1ec1110
clean up
masawdah Oct 28, 2023
d573e31
Update xvec/accessor.py
masawdah Oct 28, 2023
ba20f48
accessor
masawdah Oct 28, 2023
7b48a6c
Merge branch 'spatial_aggregation' of https://github.com/masawdah/xve…
masawdah Oct 28, 2023
7b5dd1f
dimension-agnostic
masawdah Nov 3, 2023
f63a662
clean up
masawdah Nov 3, 2023
5b4a89a
update pytest
masawdah Nov 3, 2023
20c88bd
refactor
martinfleis Dec 13, 2023
7060895
refactor structure
martinfleis Dec 13, 2023
bc42888
geodatasets
martinfleis Dec 13, 2023
c4e8946
pyogrio for IO
martinfleis Dec 13, 2023
2a3bbd1
api
martinfleis Dec 13, 2023
2e40779
include vectorized rasterize-based method
martinfleis Dec 13, 2023
707f7e9
rasterio link
martinfleis Dec 13, 2023
aa7cb1a
fix docstring
martinfleis Dec 13, 2023
cd00342
fmt
martinfleis Dec 13, 2023
cc51123
fix for DataArray
martinfleis Dec 13, 2023
ff94fd4
testing
martinfleis Dec 14, 2023
4a3d77d
stat -> stats
martinfleis Dec 14, 2023
c230edb
fix keyword
martinfleis Dec 14, 2023
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
6 changes: 3 additions & 3 deletions .github/workflows/release.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -11,10 +11,10 @@ jobs:
runs-on: ubuntu-18.04

steps:
- uses: actions/checkout@master
- uses: actions/checkout@v4

- name: Set up Python
uses: actions/setup-python@v2
uses: actions/setup-python@v4
with:
python-version: "3.x"

Expand All @@ -24,7 +24,7 @@ jobs:
python -m build

- name: Publish distribution to PyPI
uses: pypa/gh-action-pypi-publish@master
uses: pypa/gh-action-pypi-publish@v1
with:
user: __token__
password: ${{ secrets.pypi_password }}
Expand Down
9 changes: 5 additions & 4 deletions .github/workflows/tests.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -17,12 +17,14 @@ jobs:
strategy:
matrix:
os: [macos-latest, ubuntu-latest, windows-latest]
environment-file: [ci/latest.yaml]
environment-file: [ci/312.yaml]
include:
- environment-file: ci/39.yaml
os: ubuntu-latest
- environment-file: ci/310.yaml
os: ubuntu-latest
- environment-file: ci/311.yaml
os: ubuntu-latest
- environment-file: ci/dev.yaml
os: ubuntu-latest
defaults:
Expand All @@ -31,14 +33,13 @@ jobs:

steps:
- name: checkout repo
uses: actions/checkout@v2
uses: actions/checkout@v4

- name: setup micromamba
uses: mamba-org/provision-with-micromamba@main
uses: mamba-org/setup-micromamba@v1
with:
environment-file: ${{ matrix.environment-file }}
micromamba-version: "latest"
channel-priority: "flexible"

- name: Install xvec
run: pip install .
Expand Down
2 changes: 2 additions & 0 deletions .gitignore
Original file line number Diff line number Diff line change
Expand Up @@ -144,3 +144,5 @@ doc/source/generated
.ruff_cache
doc/source/cube.joblib.compressed
doc/source/cube.pickle

cache/
8 changes: 2 additions & 6 deletions .pre-commit-config.yaml
Original file line number Diff line number Diff line change
@@ -1,14 +1,10 @@
files: 'xvec\/'
repos:
- repo: https://github.com/psf/black
rev: 23.7.0
hooks:
- id: black
language_version: python3
- repo: https://github.com/astral-sh/ruff-pre-commit
rev: "v0.0.280"
rev: v0.1.2
hooks:
- id: ruff
- id: ruff-format

ci:
autofix_prs: false
Expand Down
9 changes: 8 additions & 1 deletion ci/310.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -4,12 +4,19 @@ channels:
dependencies:
- python=3.10
# required
- shapely=2
- shapely >=2
- xarray
- rioxarray
- joblib
- rasterio
- tqdm
- pyproj
# testing
- pytest
- pytest-cov
- pytest-xdist
- pytest-reportlog
- geopandas-base
- geodatasets
- pyogrio

11 changes: 9 additions & 2 deletions ci/latest.yaml → ci/311.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -2,14 +2,21 @@ name: xvec
channels:
- conda-forge
dependencies:
- python
- python=3.11
# required
- shapely=2
- shapely >=2
- xarray
- rioxarray
- joblib
- rasterio
- tqdm
- pyproj
# testing
- pytest
- pytest-cov
- pytest-xdist
- pytest-reportlog
- geopandas-base
- geodatasets
- pyogrio

22 changes: 22 additions & 0 deletions ci/312.yaml
Original file line number Diff line number Diff line change
@@ -0,0 +1,22 @@
name: xvec
channels:
- conda-forge
dependencies:
- python=3.12
# required
- shapely >=2
- xarray
- rioxarray
- joblib
- rasterio
- tqdm
- pyproj
# testing
- pytest
- pytest-cov
- pytest-xdist
- pytest-reportlog
- geopandas-base
- geodatasets
- pyogrio

8 changes: 7 additions & 1 deletion ci/39.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -4,12 +4,18 @@ channels:
dependencies:
- python=3.9
# required
- shapely=2
- shapely >=2
- xarray
- rioxarray
- joblib
- rasterio
- tqdm
- pyproj
# testing
- pytest
- pytest-cov
- pytest-xdist
- pytest-reportlog
- geopandas-base
- geodatasets
- pyogrio
6 changes: 6 additions & 0 deletions ci/dev.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -7,12 +7,18 @@ dependencies:
- cython
- geos
- proj
- rioxarray
- joblib
- rasterio
- tqdm
# testing
- pytest
- pytest-cov
- pytest-xdist
- pytest-reportlog
- geopandas-base
- geodatasets
- pyogrio
- pip
- pip:
- git+https://github.com/shapely/shapely.git@main
Expand Down
4 changes: 3 additions & 1 deletion doc/source/api.rst
Original file line number Diff line number Diff line change
Expand Up @@ -55,6 +55,7 @@ Methods
Dataset.xvec.to_geodataframe
Dataset.xvec.to_geopandas
Dataset.xvec.extract_points
Dataset.xvec.zonal_stats


DataArray.xvec
Expand Down Expand Up @@ -89,4 +90,5 @@ Methods
DataArray.xvec.query
DataArray.xvec.to_geodataframe
DataArray.xvec.to_geopandas
DataArray.xvec.extract_points
DataArray.xvec.extract_points
DataArray.xvec.zonal_stats
2 changes: 2 additions & 0 deletions doc/source/conf.py
Original file line number Diff line number Diff line change
Expand Up @@ -44,6 +44,8 @@
"xarray": ("https://docs.xarray.dev/en/latest/", None),
"geopandas": ("https://geopandas.org/en/latest", None),
"pandas": ("https://pandas.pydata.org/docs", None),
"rasterio": ("https://rasterio.readthedocs.io/en/latest/", None),

}

# -- Options for HTML output -------------------------------------------------
Expand Down
1 change: 1 addition & 0 deletions pyproject.toml
Original file line number Diff line number Diff line change
Expand Up @@ -56,3 +56,4 @@ line-length = 88
select = ["E", "F", "W", "I", "UP", "B", "A", "C4", "Q"]
exclude = ["doc"]
target-version = "py39"
ignore = ['E501', 'Q000', 'Q001', 'Q002', 'Q003', 'W191']
118 changes: 116 additions & 2 deletions xvec/accessor.py
Original file line number Diff line number Diff line change
Expand Up @@ -11,6 +11,7 @@
from pyproj import CRS, Transformer

from .index import GeometryIndex
from .zonal import _zonal_stats_iterative, _zonal_stats_rasterize


@xr.register_dataarray_accessor("xvec")
Expand Down Expand Up @@ -918,6 +919,119 @@ def to_geodataframe(
)
return df

def zonal_stats(
self,
polygons: Sequence[shapely.Geometry],
x_coords: Hashable,
y_coords: Hashable,
stats: str = "mean",
name: Hashable = "geometry",
index: bool = None,
method: str = "rasterize",
all_touched: bool = False,
n_jobs: int = -1,
**kwargs,
):
"""Extract the values from a dataset indexed by a set of geometries

The CRS of the raster and that of polygons need to be equal.
Xvec does not verify their equality.

Parameters
----------
polygons : Sequence[shapely.Geometry]
An arrray-like (1-D) of shapely geometries, like a numpy array or
:class:`geopandas.GeoSeries`.
x_coords : Hashable
name of the coordinates containing ``x`` coordinates (i.e. the first value
in the coordinate pair encoding the vertex of the polygon)
y_coords : Hashable
name of the coordinates containing ``y`` coordinates (i.e. the second value
in the coordinate pair encoding the vertex of the polygon)
stats : string
Spatial aggregation statistic method, by default "mean". It supports the
following statistcs: ['mean', 'median', 'min', 'max', 'sum']
name : Hashable, optional
Name of the dimension that will hold the ``polygons``, by default "geometry"
index : bool, optional
If `polygons` is a GeoSeries, ``index=True`` will attach its index as another
coordinate to the geometry dimension in the resulting object. If
``index=None``, the index will be stored if the `polygons.index` is a named
or non-default index. If ``index=False``, it will never be stored. This is
useful as an attribute link between the resulting array and the GeoPandas
object from which the polygons are sourced.
method : str, optional
The method of data extraction. The default is ``"rasterize"``, which uses
:func:`rasterio.features.rasterize` and is faster, but can lead to loss
of information in case of small polygons. Other option is ``"iterate"``, which
iterates over polygons and uses :func:`rasterio.features.geometry_mask`.
all_touched : bool, optional
If True, all pixels touched by geometries will be considered. If False, only
pixels whose center is within the polygon or that are selected by
Bresenham’s line algorithm will be considered.
n_jobs : int, optional
Number of parallel threads to use. It is recommended to set this to the
number of physical cores of the CPU. ``-1`` uses all available cores. Applies
only if ``method="iterate"``.
**kwargs : optional
Keyword arguments to be passed to the aggregation function
(e.g., ``Dataset.mean(**kwargs)``).

Returns
-------
Dataset
A subset of the original object with N-1 dimensions indexed by
the the GeometryIndex.

"""
# TODO: allow multiple stats at the same time (concat along a new axis),
# TODO: possibly as a list of tuples to include names?
# TODO: allow callable in stat (via .reduce())
if method == "rasterize":
result = _zonal_stats_rasterize(
self,
polygons=polygons,
x_coords=x_coords,
y_coords=y_coords,
stats=stats,
name=name,
all_touched=all_touched,
**kwargs,
)
elif method == "iterate":
result = _zonal_stats_iterative(
self,
polygons=polygons,
x_coords=x_coords,
y_coords=y_coords,
stats=stats,
name=name,
all_touched=all_touched,
n_jobs=n_jobs,
**kwargs,
)
else:
raise ValueError(
f"method '{method}' is not supported. Allowed options are 'rasterize' "
"and 'iterate'."
)

# save the index as a data variable
if isinstance(polygons, pd.Series):
if index is None:
if polygons.index.name is not None or not polygons.index.equals(
pd.RangeIndex(0, len(polygons))
):
index = True
if index:
index_name = polygons.index.name if polygons.index.name else "index"
result = result.assign_coords({index_name: (name, polygons.index)})

# standardize the shape - each method comes with a different one
return result.transpose(
name, *tuple(d for d in self._obj.dims if d not in [x_coords, y_coords])
)

def extract_points(
self,
points: Sequence[shapely.Geometry],
Expand Down Expand Up @@ -957,15 +1071,15 @@ def extract_points(
crs : Any, optional
Cordinate reference system of shapely geometries. If ``points`` have a
``.crs`` attribute (e.g. ``geopandas.GeoSeries`` or a ``DataArray`` with
``"crs"`` in ``.attrs`), ``crs`` will be automatically inferred. For more
``"crs"`` in ``.attrs``), ``crs`` will be automatically inferred. For more
generic objects (numpy array, list), CRS shall be specified manually.
index : bool, optional
If `points` is a GeoSeries, ``index=True`` will attach its index as another
coordinate to the geometry dimension in the resulting object. If
``index=None``, the index will be stored if the `points.index` is a named
or non-default index. If ``index=False``, it will never be stored. This is
useful as an attribute link between the resulting array and the GeoPandas
object from which the points are sourced from.
object from which the points are sourced.

Returns
-------
Expand Down
3 changes: 2 additions & 1 deletion xvec/index.py
Original file line number Diff line number Diff line change
Expand Up @@ -35,7 +35,8 @@ def _get_common_crs(crs_set: set[CRS | None]):
warnings.warn( # noqa: B028
"CRS not set for some of the concatenation inputs. "
f"Setting output's CRS as {names[0]} "
"(the single non-null CRS provided)."
"(the single non-null CRS provided).",
stacklevel=3,
)
return crs_not_none[0]

Expand Down
2 changes: 1 addition & 1 deletion xvec/tests/test_accessor.py
Original file line number Diff line number Diff line change
Expand Up @@ -7,7 +7,7 @@
from geopandas.testing import assert_geodataframe_equal
from pandas.testing import assert_frame_equal

import xvec # noqa
import xvec # noqa: F401
from xvec import GeometryIndex


Expand Down
Loading
Loading