-
-
Notifications
You must be signed in to change notification settings - Fork 265
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Remove cached datatype conversion path table entries on file close #3942
Remove cached datatype conversion path table entries on file close #3942
Conversation
a20ea15
to
96d6c14
Compare
When performing datatype conversions during I/O, the library checks to see whether it can re-use a cached datatype conversion pathway by performing comparisons between the source and destination datatypes of the current operation and the source and destination datatypes associated with each cached datatype conversion pathway. For variable-length and reference datatypes, a comparison is made between the VOL object for the file associated with these datatypes, which may change as a file is closed and reopened. In workflows involving a loop that opens a file, performs I/O on an object with a variable-length or reference datatype and then closes the file, this can lead to constant memory usage growth as the library compares the file VOL objects between the datatypes as different and adds a new cached conversion pathway entry on each iteration during I/O. This is now fixed by clearing out any cached conversion pathway entries for variable-length or reference datatypes associated with a particular file when that file is closed.
This PR adds a simple fix for the problem described in #1256. The main problem behind the issue is that variable-length and reference datatypes are compared according to their associated file VOL object, which may change as a file is closed and reopened. When attempting to reuse type conversion pathways in an application that has a file open -> I/O -> file close loop, the library will then compare datatypes in the current conversion operation as different from the datatypes in the cached pathway table, causing the caching table to continually grow as new entries are added until the H5T package is torn down at library termination. Further, since variable-length and reference datatypes hold a reference to their associated file's VOL object after #274, the memory for the VOL object will also be held around until the library terminates. Together, this can eventually lead to a drastic explosion of memory usage. Several approaches for fixing this were proposed, including having a special conversion path table just for variable-length and reference datatypes, but this fix seems the most straightforward with existing library infrastructure. The downside is that it adds a scan of the cached conversion pathway table on every file close operation. If this still seems workable, I'll add a RELEASE note entry for the fix; Otherwise, we should discuss a better solution. |
Example program that demonstrates the issue: #include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include "hdf5.h"
int
main(int argc, char **argv)
{
const char *buf[] = {"attr_value"};
hid_t file_id = H5I_INVALID_HID;
hid_t attr_id = H5I_INVALID_HID;
hid_t str_type = H5I_INVALID_HID;
hid_t space_id = H5I_INVALID_HID;
int *irbuf = NULL;
char **rbuf = NULL;
printf("Test memory leak\n");
file_id = H5Fcreate("file.h5", H5F_ACC_TRUNC, H5P_DEFAULT, H5P_DEFAULT);
if (file_id == H5I_INVALID_HID) {
printf("Couldn't create file\n");
exit(EXIT_FAILURE);
}
str_type = H5Tcopy(H5T_C_S1);
if (str_type == H5I_INVALID_HID) {
printf("Couldn't create string type\n");
exit(EXIT_FAILURE);
}
if (H5Tset_size(str_type, H5T_VARIABLE) < 0) {
printf("Couldn't set string size\n");
exit(EXIT_FAILURE);
}
space_id = H5Screate(H5S_SCALAR);
if (space_id == H5I_INVALID_HID) {
printf("Couldn't create dataspace\n");
exit(EXIT_FAILURE);
}
attr_id = H5Acreate2(file_id, "attribute", str_type, space_id, H5P_DEFAULT, H5P_DEFAULT);
if (attr_id == H5I_INVALID_HID) {
printf("Couldn't create attribute\n");
exit(EXIT_FAILURE);
}
if (H5Awrite(attr_id, str_type, buf) < 0) {
printf("Couldn't write to attribute\n");
exit(EXIT_FAILURE);
}
H5Aclose(attr_id);
H5Fclose(file_id);
irbuf = malloc(1000 * 1000 * sizeof(int));
rbuf = malloc(sizeof(char *));
/* for (size_t i = 0; i < 10; i++) { */
while (true) {
file_id = H5Fopen("file.h5", H5F_ACC_RDONLY, H5P_DEFAULT);
if (file_id == H5I_INVALID_HID) {
printf("Couldn't open file\n");
exit(EXIT_FAILURE);
}
attr_id = H5Aopen(file_id, "attribute", H5P_DEFAULT);
if (attr_id == H5I_INVALID_HID) {
printf("Couldn't open attribute\n");
exit(EXIT_FAILURE);
}
if (H5Aread(attr_id, str_type, rbuf) < 0) {
printf("Couldn't read from attribute\n");
exit(EXIT_FAILURE);
}
H5Treclaim(str_type, space_id, H5P_DEFAULT, rbuf);
H5Aclose(attr_id);
H5Fclose(file_id);
}
free(irbuf);
free(rbuf);
H5Tclose(str_type);
H5Sclose(space_id);
return 0;
} |
src/H5T.c
Outdated
@@ -2704,6 +2704,8 @@ H5T__unregister(H5T_pers_t pers, const char *name, H5T_t *src, H5T_t *dst, H5T_c | |||
continue; | |||
if (dst && dst->shared->type != soft->dst) | |||
continue; | |||
if (owned_vol_obj) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We don't want to remove the vlen conversion from the soft conversions table here, so always continue on if the owned_vol_obj
parameter is specified. Ideally, the logic for removing the soft conversion functions would be separate from removing the cached conversion pathway table entries, but this seems minimally invasive.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The entire for loop can be skipped if owned_vol_obj is non-NULL. (So you could add that check to the surrounding if statement, which will reduce the performance impact a little)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Very true; will do
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm clicking 'approve', but believe that if you make the two small changes I've suggested, it would be better. :-)
src/H5T.c
Outdated
@@ -2704,6 +2704,8 @@ H5T__unregister(H5T_pers_t pers, const char *name, H5T_t *src, H5T_t *dst, H5T_c | |||
continue; | |||
if (dst && dst->shared->type != soft->dst) | |||
continue; | |||
if (owned_vol_obj) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The entire for loop can be skipped if owned_vol_obj is non-NULL. (So you could add that check to the surrounding if statement, which will reduce the performance impact a little)
src/H5T.c
Outdated
path = H5T_g.path[i]; | ||
assert(path); | ||
|
||
nomatch = ((H5T_PERS_SOFT == pers && path->is_hard) || (H5T_PERS_HARD == pers && !path->is_hard)) || |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This has become increasingly unwieldy - a small "match" routine is justified?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed that it's become unwieldy, though is the point of a "match" routine just to move the visual ugliness out of this routine?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Mostly, and possibly to re-use
c43774f
to
c063c72
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Super, thanks for updating! :-)
311ca76
to
0b2bf8a
Compare
Even with this merged, is it worth sorting out how to manually clear this memory in h5py? |
I think I need to better understand the proxy buffering mechanism in h5py first, but I believe that this is still a problem (at least for non-string variable-length types) since I can't see anywhere else in the h5py codebase that |
…DFGroup#3942) Remove cached datatype conversion path table entries on file close When performing datatype conversions during I/O, the library checks to see whether it can re-use a cached datatype conversion pathway by performing comparisons between the source and destination datatypes of the current operation and the source and destination datatypes associated with each cached datatype conversion pathway. For variable-length and reference datatypes, a comparison is made between the VOL object for the file associated with these datatypes, which may change as a file is closed and reopened. In workflows involving a loop that opens a file, performs I/O on an object with a variable-length or reference datatype and then closes the file, this can lead to constant memory usage growth as the library compares the file VOL objects between the datatypes as different and adds a new cached conversion pathway entry on each iteration during I/O. This is now fixed by clearing out any cached conversion pathway entries for variable-length or reference datatypes associated with a particular file when that file is closed.
* Update upload- artifact to match download version (#3929) * Reorg and update options for doc and cmake config (#3934) * Add binary build for linux S3 (#3936) * Clean up Doxygen for szip functions and constants (#3943) * Replace off_t with HDoff_t internally (#3944) off_t is a 32-bit signed value on Windows, so we should use HDoff_t (which is __int64 on Windows) internally instead. Also defines HDftell on Windows to be _ftelli64(). * Fix chid_t to hid_t (#3948) * Fortran API work. (#3941) * - Added Fortran APIs: H5FGET_INTENT_F, H5SSELECT_ITER_CREATE_F, H5SSEL_ITER_GET_SEQ_LIST_F, H5SSELECT_ITER_CLOSE_F, H5S_mp_H5SSELECT_ITER_RESET_F - Added Fortran Parameters: H5S_SEL_ITER_GET_SEQ_LIST_SORTED_F, H5S_SEL_ITER_SHARE_WITH_DATASPACE_F - Added tests for new APIs - Removed H5F C wrapper stubs - Documentation misc. cleanup. * Add the user test program in HDFFV-9174 for committed types. (#3937) Add the user test program for committed types in HDFFV-9174 * Remove cached datatype conversion path table entries on file close (#3942) * fixed BIND name (#3957) * update H5Ssel_iter_reset_f test * Change 'extensible' to 'fixed' in H5FA code (#3964) * RF: move codespell configuration into .codespellrc so could be used locally as well (#3958) * Add RELEASE.txt note for the fix for issue #1256 (#3955) * Fix doxygen errors (#3962) * Add API support for Fortran MPI_F08 module definitions. (#3959) * revert to using c-stub for _F08 MPI APIs * use mpi compiler wrappers for cmake and nvhpc * Added a GitHub Codespaces configuration. (#3966) * Fixed XL and gfortran errors (#3968) * h5 compiler wrappers now pass all arguments passed to it to the compile line (#3954) * The issue was that the "allargs" variable was not being used in the final command of the compiler wrapper. Any entries containing an escaped quote (\", \') or other non-matching argument (*) would not be passed to the compile line. I have fixed this problem by ensuring all arguments passed to the compiler wrapper are now included in the compile line. * Add binary testing to CI testing (#3971) * Replace 'T2' with ' ' to avoid failure to match expected output due to (#3975) * Clarify vlen string datatype message (#3950) * append '-WF,' when passing C preprocessor directives to the xlf compiler (#3976) * Create CITATION.cff (#3927) Add citation source based on http://web.archive.org/web/20230610185232/https://portal.hdfgroup.org/display/knowledge/How+do+I+properly+cite+HDF5%The space difference in the Fortran examples must be fixed to match the expected output for compression filter examples. * corrected warning: implicit conversion changes signedness (#3982) * Skip mac bintest until more reliable (#3983) * Make platform specific test presets for windows and macs (#3988) * chore: fix typo (#3989) * Add a missing left parenthesis in RELEASE.txt. (#3990) * Remove ADB signature from RELEASE.txt. (#3986) * Bump the github-actions group with 6 updates (#3981) * Sync API tests with vol-tests (#3940) * Fix for github issue #2414: segfault when copying dataset with attrib… (#3967) * Fix for github issue #2414: segfault when copying dataset with attributes. This also fixes github issue #3241: segfault when copying dataset. Need to set the location via H5T_set_loc() of the src datatype when copying dense attributes. Otherwise the vlen callbacks are not set up therefore causing seg fault when doing H5T_convert() -> H5T__conv_vlen(). * Fix broken links caused by examples relocation. (#3995) * Add abi-complience check and upload to releases (#3996) * Fix h5watch test failures to ignore system warnings on ppc64le. (#3997) * Remove oneapi/clang compiler printf() type warning. (#3994) * Updated information about obtaining the HDF5 source code to use the repos. (#3972) * Fix overwritten preset names (#4000) * Fix incompatible pointer type warnings in object reference examples (#3999) * Fix build issue and some warnings in H5_api_dataset_test.c (#3998) * Modern C++ dtor declarations (#1830) * C++ dtor modernization - Replaced a bunch of empty dtors with `= default` - Removed deprecated `throw()`. In C++11, dtors are `noexcept` by default. * remove incorrect check for environ (#4002) * Add a missing file into Makefile.am for MinGW Autotools build error. (#4004) * Issue #1824: Replaced most remaining sprintf with safer snprint (#4003) * Add hl and cpp ABI reports to daily build (#4006) * Don't add files and directories with names that begin with ., or that match *autom4te* to release tar & zip files. (#4009) * Fix some output issues with ph5diff (#4008) * Update install texts (#4010) * Add C in project line for CMake to fix #4012. (#4014) * separate out individual checks for string removal (#4015) * Add compound subset ops on attributes to API tests (#4005) ---------
When performing datatype conversions during I/O, the library checks to see whether it can re-use a cached datatype conversion pathway by performing comparisons between the source and destination datatypes of the current operation and the source and destination datatypes associated with each cached datatype conversion pathway. For variable-length and reference datatypes, a comparison is made between the VOL object for the file associated with these datatypes, which may change as a file is closed and reopened. In workflows involving a loop that opens a file, performs I/O on an object with a variable-length or reference datatype and then closes the file, this can lead to constant memory usage growth as the library compares the file VOL objects between the datatypes as different and adds a new cached conversion pathway entry on each iteration during I/O. This is now fixed by clearing out any cached conversion pathway entries for variable-length or reference datatypes associated with a particular file when that file is closed.