Speed up netCDF4, h5netcdf backends #9067

Merged 2 commits on Jun 4, 2024

dcherian (Contributor) commented Jun 4, 2024

Accessing `.shape` on a netCDF4 variable takes ~20-40 ms. This adds up for files with large numbers of variables, e.g. #9058.

We already request the shape when creating `NetCDF4ArrayWrapper`, so we can reuse that instead of querying the variable again.
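
As a rough illustration of the idea (not the actual diff in this PR), the sketch below reads `.shape` once when a wrapper is constructed and derives everything else from the stored tuple. `SlowVariable` and `CachedShapeWrapper` are hypothetical stand-ins invented for this example; they are not xarray or netCDF4 classes, and the sleep only simulates an expensive lookup.

```python
import math
import time


class SlowVariable:
    """Hypothetical stand-in for a backend variable with a costly ``.shape``."""

    def __init__(self, shape, delay=0.0):
        self._shape = tuple(shape)
        self._delay = delay
        self.shape_lookups = 0  # how many times .shape reached the "backend"

    @property
    def shape(self):
        self.shape_lookups += 1
        time.sleep(self._delay)  # simulate the per-access cost
        return self._shape


class CachedShapeWrapper:
    """Hypothetical wrapper: query ``.shape`` once, then reuse the result."""

    def __init__(self, variable):
        self._variable = variable
        self.shape = variable.shape  # the only backend lookup

    @property
    def ndim(self):
        # Derived from the cached tuple, not from the underlying variable.
        return len(self.shape)

    @property
    def size(self):
        return math.prod(self.shape)


var = SlowVariable((4, 3, 2), delay=0.001)
wrapped = CachedShapeWrapper(var)
summary = (wrapped.shape, wrapped.ndim, wrapped.size)
```

After this runs, `var.shape_lookups` is 1 even though the shape was used three times. With thousands of variables and a lookup cost in the tens of milliseconds, avoiding the repeated lookups is where the saving would come from.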

Illviljan added the run-benchmark (Run the ASV benchmark workflow) label on Jun 4, 2024
dcherian (Contributor, Author) commented Jun 4, 2024

Turns out the same approach works for h5netcdf too.

dcherian changed the title from "Speed up netCDF4 backend." to "Speed up netCDF4, h5netcdf backends" on Jun 4, 2024
dcherian (Contributor, Author) commented Jun 4, 2024

@Illviljan Benchmarks didn't change, but I don't think we run I/O benchmarks in CI anyway.

dcherian enabled auto-merge (squash) on June 4, 2024, 16:44
dcherian merged commit 447e5a3 into pydata:main on Jun 4, 2024
27 of 28 checks passed
Illviljan (Contributor) commented Jun 4, 2024

Files with many variables are my favorites! We have a few benchmarks related to this that don't look like they are skipped:

class IOReadSingleFile(IOSingleNetCDF):

This one has too few variables, I guess? That could change.

class IOReadCustomEngine:

This one uses a custom (NumPy array) backend, so it probably won't hit the backend-specific code, I guess.

dcherian deleted the speedup-netcdf4 branch on June 4, 2024, 18:23
andersy005 pushed a commit that referenced this pull request Jun 14, 2024
* Speed up netCDF4 backend.

* Update h5netcdf backend too
Labels: run-benchmark (Run the ASV benchmark workflow), topic-performance