-
-
Notifications
You must be signed in to change notification settings - Fork 18.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix to _get_nearest_indexer for pydata/xarray#3751 #32905
Fix to _get_nearest_indexer for pydata/xarray#3751 #32905
Conversation
The test I added fails in CI, because of another issue, which I proposed a downstream fix for in pydata/xarray#3874 (comment). |
Looks like |
It occurred to me we could write a more direct test that circumvents the |
Looks like test failures are happening because we are calling _filter_indexer_tolerance inconsistently (in master, unrelated to this PR). To fix this, I think we need to go back to your way of doing it: pass target._values to |
Cool, that seemed to do it; looks like things are all green now. Let me know if you have any further comments. |
ci/deps/azure-36-locale.yaml
Outdated
@@ -15,6 +15,7 @@ dependencies: | |||
|
|||
# pandas dependencies | |||
- beautifulsoup4 | |||
- cftime # Needed for downstream xarray.CFTimeIndex test |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
pls move these near xarray
ci/deps/azure-macos-36.yaml
Outdated
@@ -14,6 +14,7 @@ dependencies: | |||
# pandas dependencies | |||
- beautifulsoup4 | |||
- bottleneck | |||
- cftime # Needed for downstream xarray.CFTimeIndex test |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
these should not be in every CI yaml
rather u skip the test if if it’s not installed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I understand; I was just following @jbrockmendel's recommendation. Which CI yaml should I put this in?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
remove from all
except the environment.yaml
I think this may be ready now. Does this require a what's new entry? If so, I notice this was tagged for a 1.0.4 release. Should I create a new what's new file for that? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
impl and tests look good. can you add a whatsnew in 1.1 (I am not sure we are actually doing a 1.0.4). put in bug fixes indexing. you can reference this PR number is fine (if you want to link to the xarray issue that's ok too).
Great, thanks @jreback. I added a what's new entry. Let me know if you'd like me to tweak it at all. I appreciate your and @jbrockmendel's help with this. |
doc/source/whatsnew/v1.1.0.rst
Outdated
@@ -310,6 +310,7 @@ Indexing | |||
- Bug in :meth:`DataFrame.iloc.__setitem__` on a :class:`DataFrame` with duplicate columns incorrectly setting values for all matching columns (:issue:`15686`, :issue:`22036`) | |||
- Bug in :meth:`DataFrame.loc:` and :meth:`Series.loc` with a :class:`DatetimeIndex`, :class:`TimedeltaIndex`, or :class:`PeriodIndex` incorrectly allowing lookups of non-matching datetime-like dtypes (:issue:`32650`) | |||
- Bug in :meth:`Series.__getitem__` indexing with non-standard scalars, e.g. ``np.dtype`` (:issue:`32684`) | |||
- :meth:`Index._get_nearest_indexer` now internally uses NumPy arrays or ExtensionArrays to compute left and right distances. This preserves the fix to :issue:`26683` while maintaining compatibility with downstream :class:`Index` subclasses, e.g. xarray's CFTimeIndex (:issue:`32905`). See also `pydata/xarray#3751 <https://github.com/pydata/xarray/issues/3751>`_. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
don't reference an internal routine, rather only user visible things. You can just mention xarray.
lgtm ping on green. |
Many thanks @jreback; looks like things are green. |
thanks @spencerkclark very nice! keep em coming! |
…ydata/xarray#3751) (#34025) Co-authored-by: Spencer Clark <spencerkclark@gmail.com>
@jbrockmendel thanks for encouraging me to submit a PR. I'm totally open to any suggested changes; this is just a sketch of something that would work for our needs and hopefully not break anything in pandas.
To those not familiar, some changes introduced in #31511 broke some functionality of an Index subclass we define in xarray. More details/discussion can be found in pydata/xarray#3751 and pydata/xarray#3764, but the gist of the issue is that for better or for worse, we have been relying on left and right distances in
_get_nearest_indexer
being determined using NumPy arrays rather than Index objects (as they were prior to #31511).One way to fix this is to override
_get_nearest_indexer
within our Index subclass, which we've temporarily done in pydata/xarray#3764, but this potentially fragile. It would be great if we could find a solution here.black pandas
git diff upstream/master -u -- "*.py" | flake8 --diff