HDF error when unlimited dimension and a variable share the same name #975

Closed
oceandatalab opened this issue Oct 11, 2019 · 7 comments

@oceandatalab

oceandatalab commented Oct 11, 2019

It seems there is an issue when a variable shares the same name as an unlimited dimension.

Here is a simple test script which creates:

  • an unlimited dimension called "time"
  • a fixed-size dimension called "time_subsample"
  • a variable whose name is passed to the script with the dimension "time_subsample"
  • a variable named "dummy" with the dimension "time"

# test_netcdf_bug.py
import netCDF4
import sys

time_var_name = sys.argv[1]

nc_file = netCDF4.Dataset('test.nc', mode='w', format='NETCDF4', clobber=True)
nc_file.createDimension('time', None)  # unlimited
nc_file.createDimension('time_subsample', 50)

time_var = nc_file.createVariable(time_var_name, 'f8', ('time_subsample',))
time_var[:] = [i for i in range(0, 50)]

dummy_var = nc_file.createVariable('dummy', 'int', ('time', ))
dummy_var[:] = [i for i in range(0, 500)]

nc_file.close()

Calling the script with any name other than "time" works fine:

python test_netcdf_bug.py anything
python test_netcdf_bug.py time_subsample
python test_netcdf_bug.py odl

But if you create a variable named "time", you get an HDF error:

python test_netcdf_bug.py time
Traceback (most recent call last):
  File "test_netcdf_bug.py", line 16, in <module>
    nc_file.close()
  File "netCDF4/_netCDF4.pyx", line 2485, in netCDF4._netCDF4.Dataset.close
  File "netCDF4/_netCDF4.pyx", line 2449, in netCDF4._netCDF4.Dataset._close
  File "netCDF4/_netCDF4.pyx", line 1887, in netCDF4._netCDF4._ensure_nc_success
RuntimeError: NetCDF: HDF error

Creating another unlimited dimension and assigning it to the time variable generates no error:

# test2_netcdf_bug.py
import netCDF4

nc_file = netCDF4.Dataset('test.nc', mode='w', format='NETCDF4', clobber=True)
nc_file.createDimension('time', None)  # unlimited
nc_file.createDimension('nolimit', None)  # unlimited
nc_file.createDimension('time_subsample', 50)

time_var = nc_file.createVariable('time', 'f8', ('nolimit',))
time_var[:] = [i for i in range(0, 50)]

dummy_var = nc_file.createVariable('dummy', 'int', ('time', ))
dummy_var[:] = [i for i in range(0, 500)]

nc_file.close()

python test2_netcdf_bug.py

So it seems impossible to have an unlimited dimension and a variable with the same name if the variable does not have an unlimited dimension too.

I also tried to replicate the issue with the C++ bindings but it managed to create the NetCDF file just fine.

Tested on Archlinux with the following package/library versions:

>>> import sys
>>> import netCDF4
>>> sys.version
'3.7.4 (default, Jul 16 2019, 07:12:58) \n[GCC 9.1.0]'
>>> netCDF4.__version__
'1.5.2'
>>> netCDF4.__hdf5libversion__
'1.10.2'
>>> netCDF4.__netcdf4libversion__
'4.6.3'

@jswhit
Collaborator

jswhit commented Oct 11, 2019

Confirmed. I don't yet understand why it doesn't happen with an equivalent C or C++ program.

@jswhit
Collaborator

jswhit commented Oct 11, 2019

At the risk of stating the obvious, it's a bad idea to give a variable the same name as a dimension unless it is that dimension's coordinate variable (i.e. it describes the dimension's values). Although it's not good practice, I don't think it should cause the library to crash.
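
A minimal sketch of how one might screen a file layout for the naming pattern described above, before writing it. The helper name `non_coordinate_name_clashes` and its plain-dict inputs are hypothetical and not part of netCDF4; the check is also deliberately loose (it only asks whether the variable is defined on the dimension it is named after, not whether it is strictly one-dimensional), so it flags risky names rather than predicting the HDF error exactly.

```python
def non_coordinate_name_clashes(dimension_names, variable_dims):
    """Return variables that reuse a dimension's name without being
    defined on that dimension (so they cannot be its coordinate variable).

    Hypothetical helper: the inputs are plain Python containers,
    not netCDF4 objects.
    """
    dims = set(dimension_names)
    return sorted(
        name
        for name, var_dims in variable_dims.items()
        if name in dims and name not in var_dims
    )


# Layout written by test_netcdf_bug.py when it is called with "time".
dimension_names = ["time", "time_subsample"]
variable_dims = {"time": ("time_subsample",), "dummy": ("time",)}

clashes = non_coordinate_name_clashes(dimension_names, variable_dims)
print(clashes)
```

With an open `netCDF4.Dataset` called `ds`, the two inputs could presumably be built from `ds.dimensions.keys()` and `{name: var.dimensions for name, var in ds.variables.items()}`.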

@jswhit
Collaborator

jswhit commented Oct 14, 2019

Here's a C program that triggers the error for me. Let me know if this reproduces your error and I'll create a netcdf-c issue.

#include <netcdf.h>
#include <stdio.h>
int main() {
   int dataset_id, timesubset_id, time_id, timevar_id, dummyvar_id, ierr;
   size_t start[1] = {0};
   size_t count[1] = {1};
   double data[1] = {2};
   nc_create("test.nc", NC_CLOBBER | NC_NETCDF4, &dataset_id);
   nc_def_dim(dataset_id, "time", NC_UNLIMITED, &time_id);
   nc_def_dim(dataset_id, "time_subset", 50, &timesubset_id);
   // this works
   //nc_def_var(dataset_id, "time", NC_DOUBLE, 1, &time_id, &timevar_id);
   //nc_def_var(dataset_id, "dummy", NC_DOUBLE, 1, &timesubset_id, &dummyvar_id);
   // this produces ierr=-101 (HDF5 error) on close
   // note: variable is called 'time', same as unlimited dimension, but
   // is defined with a different (fixed) dimension.
   nc_def_var(dataset_id, "time", NC_DOUBLE, 1, &timesubset_id, &timevar_id);
   nc_def_var(dataset_id, "dummy", NC_DOUBLE, 1, &time_id, &dummyvar_id);
   ierr=nc_put_vara(dataset_id, timevar_id, start, count, data);
   printf ( "ierr from nc_put_vara=%d\n", ierr);
   ierr=nc_put_vara(dataset_id, dummyvar_id, start, count, data);
   printf ( "ierr from nc_put_vara=%d\n", ierr);
   ierr=nc_close(dataset_id);
   printf ( "ierr from nc_close=%d\n", ierr);
}

@oceandatalab
Author

I confirm failure at the end of the program, when nc_close is called:

gcc -lnetcdf -o test_netcdf_jswhit test_netcdf_jswhit.c
./test_netcdf_jswhit
ierr from nc_put_vara=0
ierr from nc_put_vara=0
ierr from nc_close=-101
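
For readers who do not have the numeric codes memorised, here is a small sketch that labels the return values printed above. It assumes the two constants as defined in `netcdf.h` (`NC_NOERR` is 0 and `NC_EHDFERR` is -101, the "HDF error" also reported by the Python traceback); the function name `decode_ierr_lines` and the lookup table are illustrative only, and in C the library's own `nc_strerror` would be the usual way to get a message.

```python
import re

# Assumption: values taken from netcdf.h; only the two codes that
# appear in this thread are listed.
NC_ERROR_NAMES = {0: "NC_NOERR", -101: "NC_EHDFERR"}


def decode_ierr_lines(output):
    """Parse 'ierr from <call>=<code>' lines and attach a symbolic name."""
    results = []
    for match in re.finditer(r"ierr from (\w+)=(-?\d+)", output):
        call, code = match.group(1), int(match.group(2))
        results.append((call, code, NC_ERROR_NAMES.get(code, "unknown")))
    return results


# Output of the C reproducer, copied from the run above.
sample = """ierr from nc_put_vara=0
ierr from nc_put_vara=0
ierr from nc_close=-101
"""

decoded = decode_ierr_lines(sample)
for call, code, name in decoded:
    print(f"{call}: {code} ({name})")
```

Both writes succeed and only `nc_close` fails, which matches the Python traceback, where the error is raised from `Dataset.close`.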

@edhartnett

I have just submitted a PR with a fix to netcdf-c. See Unidata/netcdf-c#1528

@oceandatalab I have long admired your sentinel data viewers on the web. Beautiful! ;-)

Keep on netCDFing!

@oceandatalab
Author

oceandatalab commented Nov 18, 2019

Thanks a lot @edhartnett ! We will try to apply your fix as soon as possible on our dev machines.
NetCDF is at the core of our latest tool so we'll keep using it for a long time :)

EDIT: tested the fix with C and C++ snippets, then with the Python code that triggered the error in the first place, everything works as expected, thanks again!

@oceandatalab
Author

Closing the issue since it was not caused by a bug in netcdf4-python and because the problem will be solved once the next version of netcdf-c is out.

Thanks again for your help @jswhit and @edhartnett !
