Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ENH: support NaT values into datetime series for interpolation (#11701) #21851

Closed
wants to merge 3 commits into from

Conversation

atulagrwl
Copy link

  • closes Interpolate NaT #11701
  • tests added / passed
  • passes git diff upstream/master -u -- "*.py" | flake8 --diff
  • whatsnew entry

@jbrockmendel
Copy link
Member

Glad this is getting fixed, good work. Want to handle the time delta case while you’re at it?

@atulaggarwal
Copy link

atulaggarwal commented Jul 11, 2018

@jbrockmendel - Are you referring to #19199 ?

@jbrockmendel
Copy link
Member

#19199 is also about datetime64. I'm not sure if there is a separate issue for timedelta64[ns], but the behavior is the same.

@atulaggarwal
Copy link

Okay. I will take a look at both #19199 and timedelta64[ns]

@@ -1317,3 +1317,23 @@ def test_series_interpolate_intraday(self):
result = ts.reindex(new_index).interpolate(method='time')

tm.assert_numpy_array_equal(result.values, exp.values)

def test_series_interpolate_nat(self):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

use tz_native_fixture and paramerize

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in latest commit.

result = result.interpolate()
tm.assert_series_equal(result, expected)

def test_series_interpolate_nat_inplace(self):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you can parameterize over inplace as well

@@ -6112,6 +6112,14 @@ def interpolate(self, method='linear', axis=0, limit=None, inplace=False,
raise NotImplementedError("Interpolation with NaNs in the index "
"has not been implemented. Try filling "
"those NaNs before interpolating.")
is_datetime = False
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this needs to be pushed down to the block in pandas/core/internals, should go with def _interpolate I think

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have to explore more about it. Will do that in a day or two.

@jreback jreback added Enhancement Datetime Datetime data dtype Missing-data np.nan, pd.NaT, pd.NA, dropna, isnull, interpolate labels Jul 12, 2018
@jbrockmendel
Copy link
Member

@atulaggarwal needs rebase. We can try to push this through as is, or aim for the cleaner long-term solution of implementing interpolate on DatetimeArray and TimedeltaArray.

# Conflicts:
#	doc/source/whatsnew/v0.24.0.txt
@codecov
Copy link

codecov bot commented Aug 21, 2018

Codecov Report

Merging #21851 into master will decrease coverage by 0.13%.
The diff coverage is 100%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master   #21851      +/-   ##
==========================================
- Coverage   92.05%   91.91%   -0.14%     
==========================================
  Files         169      164       -5     
  Lines       50733    49998     -735     
==========================================
- Hits        46702    45958     -744     
- Misses       4031     4040       +9
Flag Coverage Δ
#multiple 90.3% <100%> (-0.16%) ⬇️
#single 42.15% <0%> (-0.09%) ⬇️
Impacted Files Coverage Δ
pandas/core/generic.py 96.47% <100%> (+0.03%) ⬆️
pandas/util/_doctools.py 0% <0%> (-12.88%) ⬇️
pandas/core/reshape/util.py 90.32% <0%> (-9.68%) ⬇️
pandas/util/_decorators.py 82.25% <0%> (-9.09%) ⬇️
pandas/core/common.py 92.19% <0%> (-5.15%) ⬇️
pandas/tseries/frequencies.py 95.9% <0%> (-1.18%) ⬇️
pandas/core/indexes/interval.py 93.18% <0%> (-0.98%) ⬇️
pandas/core/indexes/datetimelike.py 96.94% <0%> (-0.83%) ⬇️
pandas/plotting/_converter.py 63.2% <0%> (-0.69%) ⬇️
pandas/core/arrays/period.py 91.35% <0%> (-0.64%) ⬇️
... and 71 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 4f11d1a...76f050a. Read the comment docs.

@atulaggarwal
Copy link

@jbrockmendel - Updated PR with latest master. I would suggest merging it as is. I will work on working on other items too (but lets not block on them) as current PR solves one of the problem. What do you suggest?

@@ -182,6 +182,8 @@ Other Enhancements
- :func:`to_timedelta` now supports iso-formated timedelta strings (:issue:`21877`)
- :class:`Series` and :class:`DataFrame` now support :class:`Iterable` in constructor (:issue:`2193`)
- :class:`DatetimeIndex` gained :attr:`DatetimeIndex.timetz` attribute. Returns local time with timezone information. (:issue:`21358`)
- Implement interpolating ``NaT`` values in ``datetime`` series (:issue:`11701`)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The wording here is awkward. It doesn't need to be a complete sentence per se, but take a look at the other entries for examples.

@jbrockmendel
Copy link
Member

I would suggest merging it as is.

As jreback mentioned, the implementation is not in the right place. The viable options are in core.internals.blocks.DatetimeLikeBlockMixin and core.arrays.datetimelike.DatetimeLikeArrayMixin

@jbrockmendel
Copy link
Member

I think this might give incorrect answers for values near the int64 bounds because casting to float will be lossy. Can you add a test for that?

@jreback
Copy link
Contributor

jreback commented Nov 23, 2018

closing as stale. if you'd like to continue, pls ping.

@jreback jreback closed this Nov 23, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Datetime Datetime data dtype Enhancement Missing-data np.nan, pd.NaT, pd.NA, dropna, isnull, interpolate
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Interpolate NaT
4 participants