pandas.DatetimeIndex.data and pandas.TimedeltaIndex.data fail with "ValueError: cannot include dtype 'M' in a buffer" #20788

Closed
kusanagee opened this issue Apr 22, 2018 · 1 comment
Labels
Compat pandas objects compatability with Numpy or Python functions Usage Question

Comments

@kusanagee

Code Sample, a copy-pastable example if possible

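import pandas as pd

tdi = pd.timedelta_range(0, periods=10, freq="10s")
dti = pd.date_range(0, periods=10, freq="1d")
tdi.data  # raises ValueError (reported on pandas 0.22.0)
dti.data  # raises ValueError: cannot include dtype 'M' in a buffer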

Problem description

pd.TimedeltaIndex and pd.DatetimeIndex are intended for the construction of time series, with support including tshift and resample. But attempting to access tdi.data or dti.data results in "ValueError: cannot include dtype 'M' in a buffer." This is inconvenient because it makes it difficult to access the underlying numpy array, for use in operations requiring that representation, for instance for plotting spectrograms in librosa (librosa.display.specshow).

Given that a pandas.Index object is immutable, it probably suffices to construct a time series index by first creating an underlying numpy array variable, and then keeping that numpy array variable and passing it instead of the index.data wherever one might want to do the latter. But that's clunky and introduces the potential for errors.

This problem seems to arise upstream from numpy.datetime64's lack of support for buffers, though I can't say I really understand. Nevertheless I'm raising the issue here in pandas, to highlight an incompatibility between pandas' typical grammar (i.e., the Index.data attribute) and implementation of pd.DatetimeIndex and pd.TimedeltaIndex. This issue is related to Numpy issue #7270.

Expected Output

A memory view, like
<memory at 0x1c27393288>

Output of pd.show_versions()

INSTALLED VERSIONS
------------------
commit: None
python: 3.6.4.final.0
python-bits: 64
OS: Darwin
OS-release: 16.7.0
machine: x86_64
processor: i386
byteorder: little
LC_ALL: None
LANG: None
LOCALE: en_US.UTF-8

pandas: 0.22.0
pytest: None
pip: 9.0.1
setuptools: 38.4.0
Cython: None
numpy: 1.14.1
scipy: 1.0.1
pyarrow: None
xarray: None
IPython: 6.2.1
sphinx: 1.7.1
patsy: 0.5.0
dateutil: 2.6.1
pytz: 2018.3
blosc: None
bottleneck: None
tables: None
numexpr: None
feather: None
matplotlib: 2.1.2
openpyxl: None

@jreback
Contributor

jreback commented Apr 23, 2018

.data is not a public property, and it is fully deprecated in #20721.

You should simply use .values if you really want the NumPy array.
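
A minimal sketch of the .values suggestion above, for readers who want the underlying array or a memory view like the one in "Expected Output". The variable names (td_values, dt_values, td_buffer, dt_buffer) are illustrative and not from the issue. The int64 reinterpretation via .view("i8") is a workaround assumed here for obtaining a buffer, not something the maintainers proposed; it relies on NumPy storing datetime64 and timedelta64 values as 64-bit integers.

```python
import pandas as pd

# Same kind of indexes as in the report, built with explicit string arguments.
tdi = pd.timedelta_range("0s", periods=10, freq="10s")
dti = pd.date_range("1970-01-01", periods=10, freq="D")

# .values returns the underlying NumPy arrays (timedelta64 / datetime64).
td_values = tdi.values
dt_values = dti.values

# Assumed workaround: reinterpret the 8-byte time values as int64 without
# copying, which yields a dtype that the buffer protocol can export.
td_buffer = memoryview(td_values.view("i8"))
dt_buffer = memoryview(dt_values.view("i8"))

print(td_values.dtype, dt_values.dtype)
print(td_buffer, dt_buffer)
```

The integers exposed through these buffers are counts in the array's own time unit (for example nanoseconds for a [ns] dtype), so any consumer of the raw buffer has to know that unit. For most uses, including passing coordinates to plotting libraries, the arrays returned by .values are likely enough on their own.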

jreback closed this as completed Apr 23, 2018
jreback added Usage Question Compat pandas objects compatability with Numpy or Python functions labels Apr 23, 2018
jreback added this to the No action milestone Apr 23, 2018