ENH: 2D compat for DTA tz_localize, to_period #37950
pandas/core/arrays/datetimelike.py
Outdated
@@ -80,6 +81,26 @@
DatetimeLikeArrayT = TypeVar("DatetimeLikeArrayT", bound="DatetimeLikeArrayMixin")


def ravel_compat(meth):
this could actually live in a more general location, maybe somewhere in pandas/core/array_algos/?
maybe core.arraylike? For now I'm just excited about getting __array_function__ working on NDArrayBackedExtensionArray.
sure
oh, wait, the implementation here is specific to NDArrayBackedExtensionArray, will move it there
moved + green
btw, who should I bug about resample/BinGrouper?
me or @mroeschke
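For readers following the thread, here is a minimal sketch of the kind of decorator under discussion: it lets a method written for 1D data accept 2D input by flattening, calling the 1D method, and restoring the shape. This is not the code from the PR. `ToyArray`, its `data` attribute, and `shift_days` are hypothetical names made up for illustration, and the sketch assumes a plain NumPy-backed container rather than NDArrayBackedExtensionArray.

```python
from functools import wraps

import numpy as np


def ravel_compat(meth):
    """Illustrative sketch: make a 1D-only method usable on 2D data."""

    @wraps(meth)
    def wrapper(self, *args, **kwargs):
        if self.data.ndim == 1:
            # Already 1D: call the method unchanged.
            return meth(self, *args, **kwargs)
        # Flatten and reshape with the same order so elements keep their positions.
        order = "F" if self.data.flags.f_contiguous else "C"
        flat = type(self)(self.data.ravel(order))
        result = meth(flat, *args, **kwargs)
        return type(self)(result.data.reshape(self.data.shape, order=order))

    return wrapper


class ToyArray:
    """Hypothetical stand-in for an ndarray-backed array; not a pandas class."""

    def __init__(self, data):
        self.data = np.asarray(data)

    @ravel_compat
    def shift_days(self, n):
        # Written with 1D data in mind.
        assert self.data.ndim == 1
        return ToyArray(self.data + np.timedelta64(n, "D"))


days = np.array(
    ["2020-01-01", "2020-01-02", "2020-01-03", "2020-01-04"],
    dtype="datetime64[D]",
)
arr = ToyArray(days.reshape(2, 2))
out = arr.shift_days(1)  # 2D in, 2D out
```

Because the wrapper only touches `self.data`, a helper like this is tied to the backing-array layout, which is consistent with the point above that the implementation is specific to one base class.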
Can you explain the rationale?
pandas/core/arrays/datetimelike.py
Outdated
@@ -681,12 +681,13 @@ def value_counts(self, dropna=False):

         cls = type(self)

-        result = value_counts(values, sort=False, dropna=dropna)
+        result = value_counts(values.ravel("K"), sort=False, dropna=dropna)
What's the reason for the ravel here? (value_counts is always done on a 1D array AFAIK)
I agree this one cannot be ravel_compat (instead assert values.ndim == 1 here)
updated+green
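The two options discussed in this thread can be contrasted with a small sketch. It uses NumPy only; `count_values_1d` is a hypothetical helper standing in for a 1D-only counting routine, not the pandas `value_counts` function. The first option flattens 2D input with `ravel("K")` before counting; the second, which the reviewers preferred, asserts that the input is already 1D.

```python
import numpy as np


def count_values_1d(values):
    """Hypothetical 1D-only counter; rejects anything that is not 1D."""
    assert values.ndim == 1, "expected a 1D array"
    uniques, counts = np.unique(values, return_counts=True)
    return uniques, counts


values_2d = np.array(
    ["2020-01-01", "2020-01-01", "2020-01-02", "2020-01-03"],
    dtype="datetime64[D]",
).reshape(2, 2)

# Option 1: flatten in memory order, then count.
flat = values_2d.ravel("K")
uniques, counts = count_values_1d(flat)

# Option 2: pass the 2D array directly and let the assertion reject it.
try:
    count_values_1d(values_2d)
    rejected_2d = False
except AssertionError:
    rejected_2d = True
```

The assertion makes the 1D expectation explicit at the call site instead of silently accepting input of any shape.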
Two short-term motivations. First, a number of places in DatetimeLikeBlock would be simpler if we just stored the EA as self.values, which requires more robust 2D support. Second, I've got a branch implementing
Can you give a bit more details about which places that would be?
How is
Also, IMO we should only add this for the functions where we actually start using it. None of the changes here are actually used anywhere yet, are they?
Most of these have
I know you have been adding a lot of those comments, but having those comments doesn't mean there is consensus on further expanding the 2D capability of the datetimelike arrays. We could also make the DatetimeBlock a 1D non-consolidatable block, and that would also solve the datetime-specific TODO(EA2D) comments.
We're going to do essentially that in ArrayManager, which we're implementing so we can try it out and see what the performance/complexity impact is. Fleshing out the 2D support for datetimelike arrays is part of making it possible to also try out the EA2D approach.
I agree here. The ArrayManager perf implications will determine our ultimate direction: if we get good performance, then we should certainly move everything over, as it makes things way simpler code-wise. If not, we have to rethink and potentially support 2D EAs.
thanks @jbrockmendel
black pandas
git diff upstream/master -u -- "*.py" | flake8 --diff