Add numpy interface #210
Oh and I also updated the versions and removed an unused dependency in the pre-commit config, and
Superb! 👍
    @property
    def __array_interface__(self) -> dict[str, Any]:
        if self._data is None:
            self.load()
Should we really load it? Maybe it's not wanted by the user (some cropping or loading only a single band might be needed instead). I would rather raise an error.
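For comparison, here is a minimal sketch of the stricter behaviour suggested above. It uses a hypothetical `StrictRaster` class (not the project's `Raster`), and the `load()` body is a placeholder assumption. The property refuses to load implicitly and raises instead. The check reads the property directly; how NumPy itself surfaces an exception raised inside `__array_interface__` may depend on the NumPy version and is not covered here.

```python
from typing import Any, Optional

import numpy as np


class StrictRaster:
    """Hypothetical stand-in for a raster class that never loads implicitly."""

    def __init__(self) -> None:
        self._data: Optional[np.ndarray] = None

    def load(self) -> None:
        # Placeholder: a real raster would read (possibly cropped) data from disk.
        self._data = np.ones((2, 2), dtype="float32")

    @property
    def __array_interface__(self) -> dict[str, Any]:
        # Refuse to load behind the user's back.
        if self._data is None:
            raise ValueError("Raster data is not loaded; call load() first.")
        return self._data.__array_interface__


raster = StrictRaster()

# Reading the property before loading raises.
try:
    raster.__array_interface__
    raised = False
except ValueError:
    raised = True

# After an explicit load, numpy functions accept the object.
raster.load()
total = float(np.sum(raster))
```

With this variant the user decides when, and how much, data is read before any numpy function touches it.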
Fantastic! It will be really useful.
I'm gonna disagree with you there @adehecq ;) I think that applying a numpy function to a Raster is a pretty clear indication that you expect the data to be loaded. I think that this 'lazy' approach, where we load the data in for the user automatically, is pretty sensible.
I also disagree :-) The reason why this implementation was brought up was that, in xdem functions one could provide the raster instead of a numpy array. In these cases, it might not be so obvious that at some point all the data will be loaded into memory, to for example just calculate a mean. A typical use case would be: I know, it seems quite specific, but
Alright, I think I can see the issue, although it remains a bit difficult for me to imagine! But easy either way on this.
This is amazing indeed, well done 👍
For the self.load() issue brought up by Amaury, I think I agree more with @atedstone and would encourage a lazy approach. It's the responsibility of the user to make sure they do not run any calculation on the raster if they do not want the data to be loaded.
As was discussed in GlacioHack/xdem#145, allowing numpy functions directly on a raster object would be very nice to have (it would just point to self._data).
This turned out to be quite a simple fix! numpy has the `__array_interface__`, which is used for all of their functionalities. So I just added a property that first loads the data if needed and then provides the `__array_interface__` of `self._data`. As you can see in the tests, it seems to work quite well!
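To illustrate the idea, here is a minimal sketch of a lazily loading `__array_interface__` property. `MiniRaster`, its `load_count` counter and the synthetic data created in `load()` are assumptions made for the example, not the project's actual `Raster` implementation.

```python
from typing import Any, Optional

import numpy as np


class MiniRaster:
    """Hypothetical stand-in for a raster class with lazy loading."""

    def __init__(self, shape: tuple[int, int]) -> None:
        self._shape = shape
        self._data: Optional[np.ndarray] = None
        self.load_count = 0  # only here to show how often load() runs

    def load(self) -> None:
        # Placeholder: a real raster would read the data from disk.
        n_values = self._shape[0] * self._shape[1]
        self._data = np.arange(n_values, dtype="float32").reshape(self._shape)
        self.load_count += 1

    @property
    def __array_interface__(self) -> dict[str, Any]:
        # Load on first use, then hand numpy the interface of the loaded array.
        if self._data is None:
            self.load()
        return self._data.__array_interface__


raster = MiniRaster((2, 3))

# numpy functions accept the object directly; the data is loaded on demand.
mean_value = float(np.mean(raster))
as_array = np.asarray(raster)
```

In this sketch `load()` runs once, on the first numpy call, and later calls reuse the data already in memory.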