[BUG] H5D_chunk_iter_op_t exposes size (nbytes) as uint32_t rather than hsize_t #2056

Comments
Hi @fortnern, @derobins, and @gheber. If you have a second, I would like to hear your opinion on this issue before I try to backport my proposed fix in #2074 to 1.12 and integrate this into my pull request series. Basically, I think the type for the `size` argument of `H5D_chunk_iter_op_t` should be `hsize_t`. Currently, we have the following in H5Dpublic.h:

```c
typedef int (*H5D_chunk_iter_op_t)(
    const hsize_t *offset,
    uint32_t filter_mask,
    haddr_t addr,
    uint32_t size,
    void *op_data
);

H5_DLL herr_t H5Dget_chunk_info(
    hid_t dset_id,
    hid_t fspace_id,
    hsize_t chk_idx,
    hsize_t *offset,
    unsigned *filter_mask,
    haddr_t *addr,
    hsize_t *size
);
```

I think we should change the definition of `H5D_chunk_iter_op_t` to:

```c
typedef int (*H5D_chunk_iter_op_t)(
    const hsize_t *offset,
    unsigned filter_mask,
    haddr_t addr,
    hsize_t size,
    void *op_data
);
```

I find this particularly problematic since I do not think we should expose an API where the chunk size can be silently truncated. Please let me know if you agree. If so, I will then finish my current pull request series on this matter across 1.13, 1.12, and 1.10.
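For illustration, here is a minimal sketch of a callback written against the current `uint32_t` signature. The `hsize_t` and `haddr_t` typedefs below are simplified stand-ins so the sketch compiles without the HDF5 headers; the real definitions live in H5public.h, and `sum_chunk_sizes` is a hypothetical example op, not part of the library.

```c
#include <stdint.h>

/* Stand-in typedefs so the sketch compiles without HDF5 headers;
   the real types are defined in H5public.h. */
typedef uint64_t hsize_t;
typedef uint64_t haddr_t;

/* Callback type matching the current (uint32_t size) signature. */
typedef int (*H5D_chunk_iter_op_t)(const hsize_t *offset,
                                   uint32_t filter_mask,
                                   haddr_t addr,
                                   uint32_t size,
                                   void *op_data);

/* Example op: accumulate the total on-disk size of all chunks.
   Returns 0 to continue iteration, as HDF5 iteration callbacks do. */
static int sum_chunk_sizes(const hsize_t *offset, uint32_t filter_mask,
                           haddr_t addr, uint32_t size, void *op_data)
{
    (void)offset; (void)filter_mask; (void)addr;
    *(uint64_t *)op_data += size;  /* widen to 64 bits before summing */
    return 0;
}
```

Note that the accumulator must be widened by the caller; the `size` parameter itself can never carry more than 32 bits of information under this signature.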
The documented upper limit for an HDF5 chunk size is 4 GB. This fits neatly within uint32. Also, the purpose of chunking is to break up a dataset into small pieces for access performance. I cannot conceive of a valid purpose for a chunk size exceeding this limit. In my experience, most valid chunk sizing is between 10^3 and 10^8 bytes, far below 4 x 10^9. Therefore, I suggest that the exposed uint32 parameters be left alone, unless I am missing something.
@Dave-Allured, it's not clear to me that chunks will always be limited to 4 GB. It was suggested on the forums that perhaps contiguous datasets should be a special case of chunked datasets with only a single chunk: https://forum.hdfgroup.org/t/what-do-you-want-to-see-in-hdf5-2-0/10003/19?u=kittisopikulm . In that case, we would want chunks as large as datasets can be now. The bug report here is not exactly about that, though. I'm not suggesting that we change the internal chunk size type here. The bug I am reporting is about consistency across the public API: the older API functions that provide size information on chunks all use `hsize_t`. Having the public API able to accommodate sizes up to 64 bits would allow an internal move to 64-bit sizes in the future, even though that is currently not possible. For now, the API should at least be consistent.
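To make the truncation concern concrete, here is a small stand-alone C sketch (independent of HDF5; `narrow` is a hypothetical helper for illustration) showing what happens when a 64-bit size is narrowed to `uint32_t`:

```c
#include <stdint.h>

/* Narrowing a 64-bit size to uint32_t, as the current callback
   signature forces. In C this conversion is silent: the value is
   simply reduced modulo 2^32, with no warning at runtime. */
static uint64_t narrow(uint64_t size64)
{
    uint32_t size32 = (uint32_t)size64;
    return size32;
}
```

A hypothetical 5 GiB chunk would be reported as 1 GiB, since 5 * 2^30 modulo 2^32 is 2^30; values at or below 4 GiB - 1 pass through unchanged.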
There's not much we can do about H5Dget_chunk_info() now, but we shouldn't be propagating a lie about acceptable parameter values across the API. Using a 64-bit parameter when there are hard 32-bit limits in the library is misleading. The real bug is that H5Dget_chunk_info() was released with an incorrect parameter type.
If HDF Group says that uint32 was unintentional, then I cannot object to changing H5Dget_chunk_info() for consistency. @mkitti, thanks for working on it.
@derobins, the current API of `H5Dget_chunk_info` already reports the chunk size as `hsize_t`. As far as I know, the main interface to set chunk sizes is via `H5Pset_chunk`, which takes the chunk dimensions as `const hsize_t *`. The corresponding query, `H5Pget_chunk`, likewise reports the dimensions as `hsize_t`. Changing the released `H5Dget_chunk_info` signature is not really an option at this point. Since the discussion here is still open, that said, I will update #1968 with the 32-bit version of the callback for now.
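For reference, chunk dimensions are already 64-bit at creation time, so the byte size of a chunk is naturally an `hsize_t` quantity. A minimal sketch (the `hsize_t` typedef and `chunk_nbytes` helper are stand-ins for illustration, not HDF5 API) of computing a chunk's in-memory size from its `hsize_t` dimensions:

```c
#include <stdint.h>
#include <stddef.h>

typedef uint64_t hsize_t;  /* stand-in for the HDF5 typedef */

/* Total in-memory size of one chunk: the product of the
   per-dimension chunk extents times the element size. */
static hsize_t chunk_nbytes(const hsize_t *dims, size_t ndims,
                            size_t elem_size)
{
    hsize_t n = elem_size;
    for (size_t i = 0; i < ndims; i++)
        n *= dims[i];
    return n;
}
```

A 65536 x 65536 chunk of 8-byte elements is 32 GiB, which is representable as `hsize_t` but not as `uint32_t`, so a 64-bit public type leaves room for such chunks even if the library rejects them internally today.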
The lesser consistency issue in the public API is between `unsigned` and `uint32_t` for `filter_mask`: `H5D_chunk_iter_op_t` uses `uint32_t` while `H5Dget_chunk_info` uses `unsigned`. We should pick either one and use it consistently.
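As an aside on that distinction: the C standard only guarantees that `unsigned int` is at least 16 bits wide, whereas `uint32_t` is exactly 32 bits. On common desktop platforms the two coincide, which a compile-time check (a sketch, not library code) can confirm:

```c
#include <stdint.h>
#include <limits.h>

/* uint32_t is exactly 32 bits by definition; unsigned int is only
   guaranteed to be at least 16 bits, though it is 32 on the
   platforms HDF5 commonly targets. */
_Static_assert(sizeof(uint32_t) * CHAR_BIT == 32,
               "uint32_t is exactly 32 bits");
```

This is why mixing the two in one API is a consistency smell even when they happen to have the same width.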
I'm reopening this since we are moving forward with #2074. I will also use this issue to track the backport to 1.12.
Describe the bug
`H5D_chunk_iter_op_t` exposes the argument `size` (formerly `nbytes`) as `uint32_t` rather than `hsize_t`.

Expected behavior
The type of `size` should be `hsize_t`, as in `H5Dget_chunk_info`.

Platform (please complete the following information)

Additional context
`nbytes` was renamed to `size` in #1969