-
Notifications
You must be signed in to change notification settings - Fork 72
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add option to pass coordinates to the grid method #326
Conversation
Start drafting the addition of an optional coordinates argument to the grid method of the BaseGridder. It would allow use to predict on preexisting grids and obtain a xr.Dataset as output. Add a new example that passes the coordinates of a preexisting grid to a Chain with a Spline inside. Write a new meshgrid_from_1d function that creates a meshgrid out of 1d horizontal arrays, but also checks if they have a single dimensions and supports for passing extra coordinates as 2d-arrays.
… allow-coords-in-grid
Rename variable to avoid overriding the grid_coordinates function
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Left a few suggestions but nothing major. Really good work, as always 🙂
coordinates = (easting, northing) | ||
|
||
# Use the .grid() method to predict on the grid coordinates | ||
grid = gridder.grid(coordinates) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Using coordinates
for the grid coordinates is going to be confusing. Probably best to use coordinates
for data coordinates and grid_coordinates
for grid coordinates throughout the code and examples.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agreed at first, but then I found that we already use grid_coordinates
for the function that creates the coords of a regular grid. Should we use a different name for the coordinates argument? Maybe grid_coords
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, you're right. grid_coords
is reasonable I guess
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm having second thoughts on this. Is really an issue of having coordinates
argument on methods like predict
and in grid
while the requirements are slightly different? I think having a common coordinates
arguments for methods and functions in Verde is a feature and not an issue:
- I can autocomplete on every call by just typing
coor<TAB>
(withgrid_coords
it wouldn't work, I would need to delete and start typinggri
). - On some places it doesn't mind if the passed coordinates are scattered or regular gridded (forward modelling in Harmonica for instance). So those cases would be an exception where
coordinates
should be used, although grid coordinates could also be passed. - On the specific
BaseGridder
we would have two methods that do almost the same but for different kind of coordinates:predict
for scattered coordiantes,grid
for grid coordinates. I think that difference is sufficient for differentiating the distribution of the coords.
One last argument, if we use grid_coords
, an simple example of the grid
method would look like:
grid = gridder.grid(grid_coords=...)
I can see myself struggling to read that line out loud when teaching.
What do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You're absolutely right. Typing spline.grid(grid_coords...)
is awkward. So here is a proposal:
- Function arguments are
coordinates
unless you need to specify 2 sets (likedistance_mask
). - For examples:
coordinates
means data coordinates andgrid_coords
means grid coordinates since there will usually be the 2 sets and we'll need to distinguish between them. That way the call could bespline.grid(coordinates=grid_coords...
orspline.grid(coordinates=vd.grid_coordinates...)
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Now that I write this, I'm not entire happy with requiring giving coordinates all the time. But I don't know what else to do. I would want to write some examples using this syntax to see how it feels but my feeling is that it will be a bit of a blow to general usability.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree with the two points above. Just a reminder for the future: don't underestimate the old design decisions, there might be a reason why we took that road even if we don't remember it.
Feel free to try it out on examples, there's no rush for this PR. If we are going to change the default behavior of the base class for Verde gridders, it's good to put a lot of thinking into it.
On the note about passing coordinates every time I think we are already doing it, but hiding it through the spacing
, shape
, region
, projection
arguments. So the grid
method currently does two things: creates a set of grid coordinates and predicts on them. I would like to split it in two:
- Leave the coordinate generation to the user. This way they can corroborate that the grid is the right one before they have to wait until the prediction is carried out. Or they can just use a predefined grid and skip this step.
- Use the
grid
method to predict on the given coordinates. It should also project the coords before running the prediction if anyprojection
is passed. This simplifies the usage of thegrid
method, makes it easier to generalize to gridders like EQL witheasting
,northing
andheight
(they can be part of thecoordinates
directly) and makes it easier to understand what theprojection
callback will be used for (I always struggle to think if the projection is applied before the prediction or after it and which coordinates should I use forregion
andspacing
).
Finally, I think this new grid
method will be doing a single task, and doing it well 😁 .
Co-authored-by: Leonardo Uieda <leouieda@gmail.com>
We will change every gridding example so they predefine grid coordinates and pass them as the `coordinates` argument.
@leouieda I've added a |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just some small comments on the docstring. Otherwise, this looks good to me 👍🏽
verde/base/base_classes.py
Outdated
should be a 2d dimensional grid with a shape of ``(northing.shape, | ||
easting.shape)``. The arrays could either be Numpy arrays or | ||
:class:`xarray.DataArray`s. If ``coordinates`` are passed, no | ||
``region``, ``shape`` nor ``spacing`` is needed. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would remove the part about the extra coordinates since in Verde itself they are ignored. Other libraries may implement this differently as well so it’s best to leave unspecified.
The arrays could be all sorts of things (pandas Series, cupy, dask, etc) so saying “array” is best without needing to specify types.
“If coordinates
are passes, region
, shape
, and spacing
are ignored.”
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let me know if you like it better now.
Co-authored-by: Leonardo Uieda <leouieda@gmail.com>
Allow the
grid
method ofBaseGridder
to take the coordinates of thepreexisting grid either as 2d arrays (meshgrid) or as 1d arrays. Raise
a
FutureWarning
if thespacing
,shape
orregion
arguments are passed,since they will be removed in Verde
v2.0.0
. Write a newmeshgrid_from_1d
function that creates a meshgrid out of 1d horizontal arrays, but also checks
if they have a single dimensions and supports for passing extra coordinates as
2d-arrays.
This PR is an evolution of #285, so #285 should be closed if this PR is merged.
Reminders:
make format
andmake check
to make sure the code follows the style guide.doc/api/index.rst
and the base__init__.py
file for the package.AUTHORS.md
file (if you haven't already) in case you'd like to be listed as an author on the Zenodo archive of the next release.