DataArray.interp() : poor performance #2223

Closed
e-roux opened this issue Jun 9, 2018 · 6 comments · Fixed by #4069

e-roux (Contributor) commented Jun 9, 2018

Code Sample

Hello,

I performed a quick comparison of the newly introduced method `interp()` with a (draft) adapter for the SDF (Scientific Data Format) library: https://gist.github.com/gwin-zegal/b955c3ef63f5ad51eec6329dd2e620be#file-array_sdf_interp-py

Code for a micro-comparison on a 2-D array, run in IPython (include the above gist first):

import numpy as np
import xarray as xr

# 30 x 4 array of sorted random values, indexed by 'tension' and 'resistance'
arr = xr.DataArray(np.sort(np.sort(np.random.RandomState(123).rand(30, 4), axis=0), axis=1),
                   coords=[('tension', np.arange(10, 40)),
                           ('resistance', np.linspace(100, 500, 4))])

res = {'xarray': [], 'c_sdf': []}
x = np.logspace(1, 4, num=10, dtype=np.int16)  # 10 sizes from 10 to 10_000
for size in x:
    # random target points spanning each coordinate's range
    new_tension = arr.tension[0].data + np.random.random_sample(size=size) * (arr.tension[-1].data - arr.tension[0].data)
    new_resistance = arr.resistance[0].data + np.random.random_sample(size=size) * (arr.resistance[-1].data - arr.resistance[0].data)

    interp_xr = %timeit -qo arr.interp({'tension': new_tension, 'resistance': new_resistance})
    res['xarray'].append(interp_xr)

    # C-SDF interpolation via the gist's adapter
    interp_c_sdf = %timeit -qo arr(new_tension, new_resistance)
    res['c_sdf'].append(interp_c_sdf)

Problem description

The time spent in `arr.interp()` grows explosively: over 2 min (xarray's internal interp) on my old machine, versus 9 ms (C-SDF wrapper), for 10,000 interpolation points.

The C-SDF code is itself slow (it copies the array and its algorithms are not particularly optimized), yet the xarray implementation is still not usable for everyday work on my machine!

{'xarray': [<TimeitResult : 6.46 ms ± 223 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)>,
<TimeitResult : 6.46 ms ± 88.5 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)>,
<TimeitResult : 6.99 ms ± 141 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)>,
<TimeitResult : 8.91 ms ± 52 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)>,
<TimeitResult : 19.8 ms ± 183 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)>,
<TimeitResult : 112 ms ± 638 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)>,
<TimeitResult : 584 ms ± 19.2 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)>,
<TimeitResult : 2.63 s ± 43.4 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)>,
<TimeitResult : 15.5 s ± 147 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)>,
<TimeitResult : 2min 23s ± 18.7 s per loop (mean ± std. dev. of 7 runs, 1 loop each)>],
'c_sdf': [<TimeitResult : 1.08 ms ± 7.96 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)>,
<TimeitResult : 1.09 ms ± 7.91 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)>,
<TimeitResult : 1.1 ms ± 12.9 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)>,
<TimeitResult : 1.13 ms ± 9.95 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)>,
<TimeitResult : 1.19 ms ± 7.76 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)>,
<TimeitResult : 1.32 ms ± 9.47 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)>,
<TimeitResult : 1.59 ms ± 10.5 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)>,
<TimeitResult : 2.19 ms ± 31.8 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)>,
<TimeitResult : 3.51 ms ± 34.9 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)>,
<TimeitResult : 9.27 ms ± 307 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)>]}

Is this a performance issue specific to my machine, or can others confirm it?

Output of xr.show_versions()

INSTALLED VERSIONS

commit: None
python: 3.6.5.final.0
python-bits: 64
OS: Darwin
OS-release: 15.6.0
machine: x86_64
processor: i386
byteorder: little
LC_ALL: None
LANG: fr_FR.UTF-8
LOCALE: fr_FR.UTF-8

xarray: 0.10.7
pandas: 0.23.0
numpy: 1.14.3
scipy: 1.1.0
netCDF4: None
h5netcdf: None
h5py: 2.8.0
Nio: None
zarr: None
bottleneck: 1.2.1
cyordereddict: None
dask: 0.17.5
distributed: 1.21.8
matplotlib: 2.2.2
cartopy: 0.16.0
seaborn: 0.8.1
setuptools: 39.2.0
pip: 10.0.1
conda: 4.5.4
pytest: 3.4.1
IPython: 6.4.0
sphinx: 1.7.5

fujiisoup (Member) commented

@gwin-zegal, thank you for using our new feature and reporting the issue.
I confirmed the poor performance of interp.

I will look into it later to see whether the problem is in our code or upstream (scipy.interpolate).

A possible workaround for your code is to change

arr.interp({'tension': new_tension, 'resistance': new_resistance})

to

arr.interp({'tension': new_tension}).interp({'resistance': new_resistance})

but it does not solve all the problems.
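
As a quick sanity check on the workaround, here is a minimal sketch (assuming the `arr`, `new_tension`, and `new_resistance` from the benchmark above; keep the sizes small, since the joint call is the slow path):

import numpy as np

joint = arr.interp({'tension': new_tension, 'resistance': new_resistance})
sequential = (arr.interp({'tension': new_tension})
                 .interp({'resistance': new_resistance}))

# With the default 'linear' method both paths give the same values,
# up to floating-point error.
np.testing.assert_allclose(joint.values, sequential.values)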

e-roux (Contributor, Author) commented Jun 10, 2018

Thanks for your comment.

I had a deeper look at DataArray.interp: it computes a new DataArray with the new coords given by the dict. I first thought I would get a 1D array, which is not the case (though that is often the behavior I want). So I was effectively comparing 10,000 interpolations with sdf against 100,000,000 with xarray! That explains the gap.
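
A minimal sketch of the shape behavior that tripped me up (assuming the `arr` from the benchmark above; sizes kept small):

import numpy as np

t = np.linspace(10, 39, 5)
r = np.linspace(100, 500, 7)

# interp() combines the new coordinates orthogonally: the result covers
# every (tension, resistance) pair, so shape (5, 7) here -- and
# 10_000 x 10_000 = 100_000_000 values in the benchmark above.
print(arr.interp(tension=t, resistance=r).shape)   # (5, 7)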

I updated my comparison to use the same behavior with scipy and sdf, in a Jupyter notebook on mybinder.

It seems everything's fine with xarray.

Extrapolation (linear, to start with) would be a good feature too; I put an example at the end of the notebook showing sdf's interpolation/extrapolation possibilities (it works for nd-arrays of up to 32 dimensions, as with numpy).
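
For comparison, scipy already offers linear extrapolation in 1-D via interp1d's fill_value='extrapolate'; a minimal sketch (plain scipy, independent of xarray):

import numpy as np
from scipy.interpolate import interp1d

x = np.arange(10.0, 40.0)
y = 2.0 * x + 1.0

f = interp1d(x, y, kind='linear', fill_value='extrapolate')

# Points outside [10, 39] are extrapolated linearly from the edge segments.
print(f(np.array([5.0, 45.0])))   # [11. 91.]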

fujiisoup (Member) commented Jun 10, 2018

Thanks for your deeper analysis.

It seems everything's fine with xarray.

Happy to hear that.

I first thought I would get a 1D array, which is not the case (though that is often the behavior I want).

Our interp works orthogonally by default, so passing two arrays of size 10,000 results in the interpolation of 100,000,000 values.
To get a 1D array, you can pass two DataArrays that share the same dimension:

new_tension = xr.DataArray(new_tension, dims='new_dim')
new_resistance = xr.DataArray(new_resistance, dims='new_dim')
arr.interp(tension=new_tension, resistance=new_resistance)

which gives a 1D array along the new dimension new_dim.
See here for the details.
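
A self-contained sketch of this pointwise pattern (small sizes, assuming the same `arr` as above):

import numpy as np
import xarray as xr

# Wrap the target points in DataArrays sharing one dimension, so interp
# evaluates point pairs instead of the full outer product.
pts_t = xr.DataArray(np.linspace(10, 39, 1000), dims='new_dim')
pts_r = xr.DataArray(np.linspace(100, 500, 1000), dims='new_dim')

out = arr.interp(tension=pts_t, resistance=pts_r)
print(out.shape)   # (1000,) -- one value per (tension, resistance) pair
print(out.dims)    # ('new_dim',)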

fujiisoup (Member) commented

I want to keep this issue open, as the performance can still be improved for this case.

In the above example,

arr.interp(tension=new_tension, resistance=new_resistance)

and

arr.interp(tension=new_tension).interp(resistance=new_resistance)

give the same result (for the 'linear' and 'nearest' methods), but the latter runs much faster.
This difference looks similar to the difference between our orthogonal indexing and vectorized indexing.
We may need an orthogonal interpolation path, which would significantly improve performance in cases like this.
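
A rough wall-clock comparison of the two paths (a sketch, assuming the benchmark's `arr`, `new_tension`, and `new_resistance`; plain time.perf_counter stands in for %timeit):

import time

def bench(label, fn):
    # Single-run wall-clock timing; crude, but the gap is large enough to show.
    t0 = time.perf_counter()
    fn()
    print(f'{label}: {time.perf_counter() - t0:.2f} s')

bench('joint (vectorized-style)',
      lambda: arr.interp(tension=new_tension, resistance=new_resistance))
bench('sequential (orthogonal-style)',
      lambda: arr.interp(tension=new_tension).interp(resistance=new_resistance))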

e-roux (Contributor, Author) commented Jun 10, 2018

OK, thank you for the information. I had first worked from the API alone; this is clearly documented in the link you provided.

I noticed that the attrs get lost after orthogonal interpolation (see the first/second plots of arr on mybinder); I might open a new issue for that.

stale bot commented May 10, 2020

In order to maintain a list of currently relevant issues, we mark issues as stale after a period of inactivity.

If this issue remains relevant, please comment here or remove the stale label; otherwise it will be marked as closed automatically.
