Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Quantization not working with NC_FORMAT_NETCDF4_CLASSIC #2441

Closed
czender opened this issue Jul 1, 2022 · 5 comments · Fixed by #2445
Closed

Quantization not working with NC_FORMAT_NETCDF4_CLASSIC #2441

czender opened this issue Jul 1, 2022 · 5 comments · Fixed by #2445

Comments

@czender
Copy link
Contributor

czender commented Jul 1, 2022

Though the glass is not empty! Quantization succeeds with NETCDF4 and Zarr.
Program nc_test4/tst_quantize.c tests quantization on a NETCDF4-format file.
That test succeeds. There is no problem with that one format. Unfortunately,
NC_FORMAT_NETCDF4 appears to be the only POSIX format that succeeds with the test.
tst_quantize fails with NC_FORMAT_NETCDF4_CLASSIC and with NC_FORMAT_CLASSIC.
I have not tried with 64BIT or 64BIT_DATA, though I suspect those too fail.
The failures can be seen by changing tst_quantize.c to use a different mode flag,
e.g., NC_NETCDF4|NC_CLASSIC_MODEL|NC_CLOBBER, to nc_create() near line 110.

NCO invocations of nc_def_var_quantize() succeed or fail in the same way
that tst_quantize.c succeeds or fails depending on the file format.
Calling nc_def_var_quantize() on a netCDF file with a format other than
NC_FORMAT_NETCDF4 leads to a state that causes nc_enddef() to fail.
In tst_quantize.c this is near line 162.
The nc_enddef() return code is -38 = NC_ENOTINDEFINE = "NetCDF: Operation
not allowed in data mode". Perhaps nc_def_var_quantize() leaves certain
file formats in data mode, rather than in define mode, which causes
nc_enddef() to fail? I looked and failed to spot any obvious bugs.

These problems occur both in 4.9.0 and in today's main trunk.
I have confirmed this behavior on Linux x86_64 with GCC and on MacOS Mac silicon with Clang.

@edwardhartnett
Copy link
Contributor

Quantization is not expected to work on classic formats (at this time). The problem is that classic formats use a totally different code base than netCDF-4, and I personally am not at all familiar with that code. (Glenn Davis wrote it, Russ maintained it, and I think @DennisHeimbigner has the greatest familiarity with it now, but it's code that is pretty rarely changed.

And quantization with classic formats is much less useful, because there is no compression anyway. So it does not speed up I/O nor save disk space. Am I missing something here?

So right now we have quantization built into a netCDF-4 function which is the one that does all the data conversions. (For example if you write ints to an NC_FLOAT var, the ints have to be converted into floats.) Zarr piggy-backs on this because it uses the same function for data conversion.

However, it should be working for NC_FORMAT_NETCDF4_CLASSIC. That's just a dumb bug probably, and my fault for not testing that. I will take a look at why that is happening.

So I would change this issue title to "Quantization not working with NC_FORMAT_NETCDF4_CLASSIC", since classic formats were never intended to be working with quantization.

@czender czender changed the title 4.9.X quantization fails with multiple netCDF formats Quantization not working with NC_FORMAT_NETCDF4_CLASSIC Jul 1, 2022
@czender
Copy link
Contributor Author

czender commented Jul 1, 2022

Done. I agree there is no real world use-case for quantization in CLASSIC format. NC_FORMAT_NETCDF4_CLASSIC is very commonly used for compression (CMIP6 requires it), so we were a bit careless in not testing quantization with that format.

@edwardhartnett
Copy link
Contributor

image

@edwardhartnett
Copy link
Contributor

edwardhartnett commented Jul 2, 2022

OK, the problem is in hdf5attr.c:

        /* If this is a new att, require define mode. */
        if (!(h5->flags & NC_INDEF))
        {
            if (h5->cmode & NC_CLASSIC_MODEL)
                return NC_ENOTINDEFINE;
            if ((retval = NC4_redef(ncid)))
                BAIL(retval);
        }

This happens when we are writing the quantization attribute. I'm pondering how best to fix this...

OK, the real problem is here:

int
nc4_enddef_netcdf4_file(NC_FILE_INFO_T *h5)
{
    assert(h5);
    LOG((3, "%s", __func__));

    /* If we're not in define mode, return an error. */
    if (!(h5->flags & NC_INDEF))
        return NC_ENOTINDEFINE;

    /* Turn define mode off. */
    h5->flags ^= NC_INDEF;

    /* Redef mode needs to be tracked separately for nc_abort. */
    h5->redef = NC_FALSE;

    return sync_netcdf4_file(h5);
}

The variable defines, and thus the attempt to write the quantize attribute, is happening in the final statement, in the call to sync_netcdf4_file(), and by this time the flag has already been changed to be out of define mode.

Instead, the sync should happen before the flag is changed.

@edwardhartnett
Copy link
Contributor

As I tried various things and reviewed this code, I saw that I had faced a similar problem some time ago when optimizing the storage of dimids to speed opening of files (and avoid the dreaded performance problems with dimension scale searches).

The answer is not to attempt to use netCDF functions to write the attributes, but to use HDF5 directly. This avoids all the complicated checks for enddef that are done in order to exactly mimic the netcdf-3 enddef error handling.

I have a PR up now, running through CI. We'll see...

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants