pandas.DatetimeIndex.data and pandas.TimedeltaIndex.data fail with "ValueError: cannot include dtype 'M' in a buffer" #20788

kusanagee · 2018-04-22T19:05:13Z

Code Sample, a copy-pastable example if possible

import pandas as pd

tdi = pd.timedelta_range(0, periods=10, freq="10s")
dti = pd.date_range(0, periods=10, freq="1d")
tdi.data  # raises ValueError
dti.data  # raises ValueError

Problem description

pd.TimedeltaIndex and pd.DatetimeIndex are intended for constructing time series, and support operations such as tshift and resample. But accessing tdi.data or dti.data raises "ValueError: cannot include dtype 'M' in a buffer." This is inconvenient because it makes it difficult to reach the underlying numpy array for operations that require that representation, for instance plotting spectrograms in librosa (librosa.display.specshow).

Given that a pandas.Index object is immutable, it probably suffices to construct a time series index by first creating an underlying numpy array variable, and then keeping that numpy array variable and passing it instead of the index.data wherever one might want to do the latter. But that's clunky and introduces the potential for errors.

This problem seems to arise upstream, from numpy.datetime64's lack of support for buffers, though I can't say I really understand it. Nevertheless I'm raising the issue here in pandas, to highlight an incompatibility between pandas' typical grammar (i.e., the Index.data attribute) and the implementation of pd.DatetimeIndex and pd.TimedeltaIndex. This issue is related to NumPy issue #7270.
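For reference, the upstream behaviour can be reproduced without pandas. The following is a minimal sketch, assuming a NumPy version that still does not export datetime64 data through the buffer protocol; the helper name `exports_buffer` is hypothetical and used only for illustration.

```python
import numpy as np


def exports_buffer(arr):
    """Return True if memoryview(arr) succeeds, False if NumPy raises ValueError."""
    try:
        memoryview(arr)
    except ValueError:
        return False
    return True


stamps = np.array(["2018-04-22", "2018-04-23"], dtype="datetime64[ns]")

# datetime64 data has no buffer-protocol format code, so this is refused.
print(exports_buffer(stamps))

# Reinterpreting the same memory as int64 gives an array that can be exported.
print(exports_buffer(stamps.view("int64")))
```

The int64 view shares memory with the original array, so no copy is made.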

Expected Output

A memory view, like
<memory at 0x1c27393288>

Output of `pd.show_versions()`

INSTALLED VERSIONS
------------------
commit: None
python: 3.6.4.final.0
python-bits: 64
OS: Darwin
OS-release: 16.7.0
machine: x86_64
processor: i386
byteorder: little
LC_ALL: None
LANG: None
LOCALE: en_US.UTF-8
pandas: 0.22.0
pytest: None
pip: 9.0.1
setuptools: 38.4.0
Cython: None
numpy: 1.14.1
scipy: 1.0.1
pyarrow: None
xarray: None
IPython: 6.2.1
sphinx: 1.7.1
patsy: 0.5.0
dateutil: 2.6.1
pytz: 2018.3
blosc: None
bottleneck: None
tables: None
numexpr: None
feather: None
matplotlib: 2.1.2
openpyxl: None

jreback · 2018-04-23T10:34:26Z

.data is not a public property, and is fully deprecated in #20721

you should simply use .values if you really want the numpy array.
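A minimal sketch of that suggestion, assuming a reasonably recent pandas and NumPy. The variable names are illustrative, and the integer view is one possible way to obtain a buffer-compatible array, not something prescribed in this thread.

```python
import numpy as np
import pandas as pd

tdi = pd.timedelta_range(start="0s", periods=10, freq="10s")
dti = pd.date_range(start="1970-01-01", periods=10, freq="D")

# .values returns the underlying numpy arrays (timedelta64 / datetime64).
td_values = tdi.values
dt_values = dti.values
print(td_values.dtype, dt_values.dtype)

# If a raw buffer is really needed, view the data as int64 first;
# the integers count time units since the epoch (or since zero for timedeltas).
dt_ints = dt_values.view("int64")
buf = memoryview(dt_ints)
print(buf)
```

For most numeric work, such as passing time coordinates to a plotting function, the arrays returned by `.values` can be used directly without going through a memoryview.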

jreback closed this as completed Apr 23, 2018

jreback added the Usage Question and Compat (pandas objects compatibility with NumPy or Python functions) labels Apr 23, 2018

jreback added this to the No action milestone Apr 23, 2018
