Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Alignment with tolerance2 #4489

Open
wants to merge 8 commits into
base: main
Choose a base branch
from

Conversation

ghislainp
Copy link
Contributor

@ghislainp ghislainp commented Oct 5, 2020

  • Closes tolerance for alignment #2217
  • Tests added
  • Passes isort . && black . && mypy . && flake8
  • User visible changes (including notable bug fixes) are documented in whats-new.rst
  • New functions/methods are listed in api.rst

Reading #2217, I've implemented fast algorithms for union and intersection of arrays with numerical tolerance.
This works fine in the "normal" case, when each array has all its values different (outside the tolerance) and the first and second arrays have some values in common within the tolerance. Conversely, the behavior is not well defined when one array has some values that are within the tolerance (or are equal) of each other. In this case, the behavior of union and intersection is not well defined anyway. The "bad" cases could be checked to raise an Exception (duplicate values within the tolerance), but this is not implemented yet.

I've also implemented a function to test index equality within the tolerance.

At last, the logic of xarray.align has been changed to deal with the tolerance. It was not possible to avoid some changes in align.

I'd appreciate some tests and code review, because there are certainly some rough corners I've not though about.

@dcherian
Copy link
Contributor

Thanks @ghislainp . This is a substantial change, so it may take some time for one of us to look through it. Thank you for your patience.

@benbovy
Copy link
Member

benbovy commented May 4, 2021

Sorry for the very long wait @ghislainp.

This looks like a great case for the ongoing flexible indexes refactor (see design notes, #4979). This refactor consists of abstracting away index operations like label-based selection and alignment and delegate them to (built-in or 3rd-party) xarray.Index subclasses.

After the refactoring, I think that what you propose here would nicely fit in PandasIndex (see #5102) if that's a common case, or perhaps better in its own xarray.Index subclass.

Would you mind waiting a bit more for the new flexible index system to be operational before adding this feature? This will make the refactoring easier. Also, we still need to figure out how best to expose index-specific options (like tolerance here, but it could be various options depending on the index) in parts of the xarray API like align, so it might be best if we can avoid breaking changes and/or depreciation cycles as much as possible until then.

@ghislainp
Copy link
Contributor Author

yes, no problem, take your time. This is still a wanted feature in any case, I've recently needed it again. I've found a solution by converting the real index to integer after adequate scaling, but an integrated solution would be much better.

@djhoese
Copy link
Contributor

djhoese commented Dec 14, 2023

I'm wondering what the future of this PR is now that the flexible indexes refactor should be done? I've run into issues with my own workflows/libraries that would benefit from more flexible tolerances when comparing coordinates and found this when researching. I'm not sure this PR would fix the issues I'm having but am still curious.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

tolerance for alignment
4 participants