Support ExtensionArray types in where #24077
Labels
ExtensionArray
Extending pandas with custom dtypes or arrays.
TomAugspurger added the ExtensionArray label Dec 3, 2018
This will also avoid converting to objects for categoricals.
TomAugspurger added a commit to TomAugspurger/pandas that referenced this issue Dec 5, 2018
We need some way to do `.where` on EA objects for DatetimeArray. Adding it to the interface is, I think, the easiest way. Initially I started to write a version on ExtensionBlock, but it proved unwieldy to write a version that performed well for all types. It *may* be possible to do using `_ndarray_values`, but we'd need a few more things around that (missing values, converting an arbitrary array to the "same" `_ndarray_values`, error handling, re-constructing). It seemed easier to push this down to the array. The implementation on ExtensionArray is readable, but likely slow since it'll involve a conversion to object-dtype. Closes pandas-dev#24077
TomAugspurger added a commit to TomAugspurger/pandas that referenced this issue Dec 5, 2018
commit 56470c3
Author: Tom Augspurger <tom.w.augspurger@gmail.com>
Date: Wed Dec 5 11:39:48 2018 -0600

    Fixups:

    * Ensure data generated OK.
    * Remove erroneous comments about alignment. That was user error.

commit c4604df
Author: Tom Augspurger <tom.w.augspurger@gmail.com>
Date: Mon Dec 3 14:23:25 2018 -0600

    API: Added ExtensionArray.where

    We need some way to do `.where` on EA objects for DatetimeArray. Adding it to the interface is, I think, the easiest way. Initially I started to write a version on ExtensionBlock, but it proved unwieldy to write a version that performed well for all types. It *may* be possible to do using `_ndarray_values`, but we'd need a few more things around that (missing values, converting an arbitrary array to the "same" `_ndarray_values`, error handling, re-constructing). It seemed easier to push this down to the array. The implementation on ExtensionArray is readable, but likely slow since it'll involve a conversion to object-dtype.

    Closes pandas-dev#24077
TomAugspurger added a commit to TomAugspurger/pandas that referenced this issue Dec 7, 2018
commit 9e0d87d
Author: Tom Augspurger <tom.w.augspurger@gmail.com>
Date: Fri Dec 7 07:18:58 2018 -0600

    update docs, cleanup

commit 1271d3d
Merge: 033ac9c f74fc59
Author: Tom Augspurger <tom.w.augspurger@gmail.com>
Date: Fri Dec 7 07:12:49 2018 -0600

    Merge remote-tracking branch 'upstream/master' into ea-where

commit 033ac9c
Author: Tom Augspurger <tom.w.augspurger@gmail.com>
Date: Fri Dec 7 06:30:18 2018 -0600

    Setitem-based where

commit e9665b8
Merge: 5e14414 03134cb
Author: Tom Augspurger <tom.w.augspurger@gmail.com>
Date: Thu Dec 6 21:38:42 2018 -0600

    Merge remote-tracking branch 'upstream/master' into ea-where

commit 5e14414
Author: Tom Augspurger <tom.w.augspurger@gmail.com>
Date: Thu Dec 6 09:18:54 2018 -0600

    where versionadded

commit d90f384
Author: Tom Augspurger <tom.w.augspurger@gmail.com>
Date: Thu Dec 6 09:17:43 2018 -0600

    deprecation note for categorical

commit 4715ef6
Merge: edff47e b78aa8d
Author: Tom Augspurger <tom.w.augspurger@gmail.com>
Date: Thu Dec 6 08:15:26 2018 -0600

    Merge remote-tracking branch 'upstream/master' into ea-where

commit edff47e
Author: Tom Augspurger <tom.w.augspurger@gmail.com>
Date: Thu Dec 6 08:15:21 2018 -0600

    32-bit compat

commit badb5be
Author: Tom Augspurger <tom.w.augspurger@gmail.com>
Date: Thu Dec 6 06:21:44 2018 -0600

    compat, revert

commit 911a2da
Author: Tom Augspurger <tom.w.augspurger@gmail.com>
Date: Wed Dec 5 15:55:24 2018 -0600

    debug 32-bit issue

commit a69dbb3
Author: Tom Augspurger <tom.w.augspurger@gmail.com>
Date: Wed Dec 5 15:49:17 2018 -0600

    warn for categorical

commit 6f79282
Author: Tom Augspurger <tom.w.augspurger@gmail.com>
Date: Wed Dec 5 12:45:54 2018 -0600

    32-bit compat

commit 56470c3
Author: Tom Augspurger <tom.w.augspurger@gmail.com>
Date: Wed Dec 5 11:39:48 2018 -0600

    Fixups:

    * Ensure data generated OK.
    * Remove erroneous comments about alignment. That was user error.

commit c4604df
Author: Tom Augspurger <tom.w.augspurger@gmail.com>
Date: Mon Dec 3 14:23:25 2018 -0600

    API: Added ExtensionArray.where

    We need some way to do `.where` on EA objects for DatetimeArray. Adding it to the interface is, I think, the easiest way. Initially I started to write a version on ExtensionBlock, but it proved unwieldy to write a version that performed well for all types. It *may* be possible to do using `_ndarray_values`, but we'd need a few more things around that (missing values, converting an arbitrary array to the "same" `_ndarray_values`, error handling, re-constructing). It seemed easier to push this down to the array. The implementation on ExtensionArray is readable, but likely slow since it'll involve a conversion to object-dtype.

    Closes pandas-dev#24077
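The "Setitem-based where" commit in the log above gives only a subject line, so what follows is a guess at the general idea and not the code from that commit: copy the array, then assign replacement values at the positions where the condition is False. The helper name `setitem_where` and the sample data are hypothetical; only public pandas calls are used.

```python
import numpy as np
import pandas as pd


def setitem_where(arr, cond, other):
    """Hypothetical helper: keep ``arr`` where ``cond`` is True, else use ``other``.

    Relies only on ``copy`` and ``__setitem__``, so the result keeps the
    dtype of ``arr`` instead of falling back to object dtype.
    """
    cond = np.asarray(cond, dtype=bool)
    result = arr.copy()  # leave the caller's array untouched
    if pd.api.types.is_scalar(other):
        result[~cond] = other
    else:
        # array-like replacement: take the values at the masked positions
        result[~cond] = other[~cond]
    return result


# Sample data (assumption): a monthly PeriodArray with one value masked out
periods = pd.period_range("2018-01", periods=3, freq="M").array
keep = np.array([True, False, True])
masked = setitem_where(periods, keep, pd.NaT)
```

Whether `other` is accepted is left to the array's own `__setitem__`, which should reject values that do not fit the dtype.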
jreback pushed a commit that referenced this issue Dec 10, 2018
Pingviinituutti pushed a commit to Pingviinituutti/pandas that referenced this issue Feb 28, 2019
This is blocking DatetimeArray. It's also a slight regression from 0.24, since things like `.where` on a DataFrame with period objects would work (via object dtype).
I think the easiest place for this is by defining `ExtensionBlock.where`, and restricting it to cases where the dtype of `self` and `other` match (so that the result dtype is the same).
We can do this pretty easily for our EAs by performing the `.where` on `_ndarray_values`. But `_ndarray_values` isn't part of the EA interface yet. I'm not sure if we'll have time to properly design and implement a generic `.where` for any ExtensionArray since there are a couple of subtleties.
Here's a start
There are a couple of TODOs there, plus tests, and I'm sure plenty of edge cases.
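The snippet that followed "Here's a start" is not preserved on this page. As a rough illustration of the approach described (do the `where` on the backing ndarray, and only when both dtypes match), here is a minimal NumPy-only sketch. It is not the original code: the helper name `where_same_dtype` is hypothetical, and plain `datetime64[ns]` arrays stand in for an ExtensionArray's `_ndarray_values`.

```python
import numpy as np


def where_same_dtype(values, cond, other):
    """Hypothetical helper: ``where`` on the integer backing data.

    Assumes an 8-byte dtype such as datetime64[ns], so the data can be
    viewed as int64 and viewed back without copying or converting.
    """
    values = np.asarray(values)
    other = np.asarray(other)
    if values.dtype != other.dtype:
        # restrict to matching dtypes so the result dtype is unchanged
        raise TypeError(f"dtype mismatch: {values.dtype} vs {other.dtype}")
    raw = np.where(cond, values.view("i8"), other.view("i8"))
    return raw.view(values.dtype)  # reconstruct with the original dtype


# Sample data (assumption): NaT is kept because its integer sentinel is copied as-is
left = np.array(["2018-12-03", "2018-12-05", "NaT"], dtype="M8[ns]")
right = np.array(["2000-01-01", "2000-01-02", "2000-01-03"], dtype="M8[ns]")
mask = np.array([True, False, True])
result = where_same_dtype(left, mask, right)
```

The pieces the issue calls out as hard, such as converting an arbitrary `other` to the same backing representation and handling missing values for every array type, are deliberately left out of this sketch.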