Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix (Series|DataFrame).interpolate for datetime dtypes #19291

Closed
wants to merge 6 commits into from

Conversation

jbrockmendel
Copy link
Member

mask = np.atleast_2d(mask)
data = data.astype(np.float64)
data[mask] = np.nan

Copy link
Contributor

@jreback jreback Jan 18, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ths is a hodgepodge. use the hierarchy to make this code readable. IOW put the datetime stuff in the datetimeblock.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is in the DatetimeBlock. I'm confused.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

then why would you need is_datetime.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this needs to be be way less if/then stuff.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

then why would you need is_datetime.

It's specifically is_datetimetz; the tzaware case gets slightly different handling. If you're suggesting a separate implementation for DatetimeTZBlock vs DatetimeBlock I guess that's OK, but pretty sure there'll be complaints about code duplication that way.

@codecov
Copy link

codecov bot commented Jan 19, 2018

Codecov Report

Merging #19291 into master will increase coverage by 0.03%.
The diff coverage is 100%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master   #19291      +/-   ##
==========================================
+ Coverage   91.53%   91.57%   +0.03%     
==========================================
  Files         150      148       -2     
  Lines       48738    48913     +175     
==========================================
+ Hits        44611    44790     +179     
+ Misses       4127     4123       -4
Flag Coverage Δ
#multiple 89.94% <100%> (+0.04%) ⬆️
#single 41.66% <5.88%> (-0.07%) ⬇️
Impacted Files Coverage Δ
pandas/core/internals.py 94.54% <100%> (ø) ⬆️
pandas/core/generic.py 95.91% <100%> (ø) ⬆️
pandas/core/categorical.py 95.78% <0%> (-4.22%) ⬇️
pandas/core/ops.py 92.17% <0%> (-0.94%) ⬇️
pandas/io/pytables.py 92.45% <0%> (ø) ⬆️
pandas/core/reshape/reshape.py 100% <0%> (ø) ⬆️
pandas/compat/pickle_compat.py 75.6% <0%> (ø) ⬆️
pandas/core/arrays/__init__.py
pandas/core/arrays/categorical.py
... and 1 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update f0cd23c...0115ef1. Read the comment docs.

Copy link
Contributor

@jreback jreback left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you basically tripled the code here. pls don't do that. refactor into smaller more logical units.

dtype_counts = _maybe_transposed_self._data.get_dtype_counts()
if ('object' in dtype_counts and
dtype_counts.get('object') == len(_maybe_transposed_self.T)):
# Checking for 'object' lets us avoid sometimes-fragile tranpose
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

huh? where are you testing this

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a workaround until #19198 is fixed:

dti = pd.date_range('2016-01-01', periods=3, tz='US/Pacific').insert(1, pd.NaT)
ser = pd.Series(dti)
df = ser.to_frame()

>>> df.interpolate()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "pandas/core/generic.py", line 5155, in interpolate
    'object') == len(_maybe_transposed_self.T):
  File "pandas/core/frame.py", line 1941, in transpose
    return super(DataFrame, self).transpose(1, 0, **kwargs)
  File "pandas/core/generic.py", line 616, in transpose
    new_values = self.values.transpose(axes_numbers)
  File "pandas/core/base.py", line 701, in transpose
    nv.validate_transpose(args, kwargs)
  File "pandas/compat/numpy/function.py", line 54, in __call__
    self.defaults)
  File "pandas/util/_validators.py", line 218, in validate_args_and_kwargs
    validate_kwargs(fname, kwargs, compat_args)
  File "pandas/util/_validators.py", line 157, in validate_kwargs
    _check_for_default_values(fname, kwds, compat_args)
  File "pandas/util/_validators.py", line 69, in _check_for_default_values
    format(fname=fname, arg=key)))
ValueError: the 'axes' parameter is not supported in the pandas implementation of transpose()

(ser.interpolate doesn't raise, just forgets to interpolate)

@jbrockmendel
Copy link
Member Author

you basically tripled the code here. pls don't do that. refactor into smaller more logical units.

You complained about the existence of if/else conditions in the previous version when DatetimeBlock handled both the aware and unaware cases. Are you suggesting going back to that or something else entirely?

@jreback
Copy link
Contributor

jreback commented Jan 19, 2018

i am suggesting a refactor completely of interpolate actually

is very hacky now - this is just making it worse

to be honest i would prob not touch this now

many many other fish to fry

@jbrockmendel
Copy link
Member Author

OK. I'll put a pin in this until I can come up with something more elegant. Mind keeping it open for now so it doesn't fall off my radar screen?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Interpolating of datetime64 values ignored (at best)
2 participants