Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add Cce H5 file type #5963

Merged
merged 4 commits into from
May 11, 2024
Merged

Add Cce H5 file type #5963

merged 4 commits into from
May 11, 2024

Conversation

knelli2
Copy link
Contributor

@knelli2 knelli2 commented May 3, 2024

Proposed changes

This new H5 file type will help streamline the usage of waveform output from the CCE executable within the sxs and scri python packages.

These new Cce subfile is basically just a group that holds datasets for the bondi variables. The group and the datasets have attributes that the python packages will look for.

Upgrade instructions

Code review checklist

  • The code is documented and the documentation renders correctly. Run
    make doc to generate the documentation locally into BUILD_DIR/docs/html.
    Then open index.html.
  • The code follows the stylistic and code quality guidelines listed in the
    code review guide.
  • The PR lists upgrade instructions and is labeled bugfix or
    new feature if appropriate.

Further comments

@knelli2 knelli2 requested a review from nilsdeppe May 3, 2024 01:20
nilsdeppe
nilsdeppe previously approved these changes May 3, 2024
@knelli2
Copy link
Contributor Author

knelli2 commented May 3, 2024

Hmm the MacOS CI failed for the test I added. Not really sure how to debug without a Mac...

Copy link
Member

@wthrowe wthrowe left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The test segfaulted on the macOS build. I didn't catch anything that would cause that reading the code, but it's probably a real failure.

src/IO/H5/Cce.cpp Outdated Show resolved Hide resolved
src/IO/H5/Cce.cpp Show resolved Hide resolved
src/IO/H5/Cce.cpp Outdated Show resolved Hide resolved
tests/Unit/IO/H5/Test_Cce.cpp Outdated Show resolved Hide resolved
tests/Unit/IO/H5/Test_Cce.cpp Outdated Show resolved Hide resolved
tests/Unit/IO/H5/Test_Cce.cpp Outdated Show resolved Hide resolved
@nilsvu
Copy link
Member

nilsvu commented May 3, 2024

I can look at the segfault at macos

@knelli2
Copy link
Contributor Author

knelli2 commented May 6, 2024

Fixup doesn't include anything for MacOS failure.

@wthrowe
Copy link
Member

wthrowe commented May 8, 2024

Fixups look good. Just need to figure out the Mac thing.

@nilsvu
Copy link
Member

nilsvu commented May 8, 2024

Here's a fix for the macOS issue:

diff --git a/src/IO/H5/Helpers.cpp b/src/IO/H5/Helpers.cpp
index 6a6b5b6ca5..cf8001cbf2 100644
--- a/src/IO/H5/Helpers.cpp
+++ b/src/IO/H5/Helpers.cpp
@@ -473,8 +473,15 @@ void write_to_attribute(const hid_t location_id, const std::string& name,
   const hid_t att_id = H5Acreate2(location_id, name.c_str(), h5_type<Type>(),
                                   space_id, h5p_default(), h5p_default());
   CHECK_H5(att_id, "Failed to create attribute '" << name << "'");
-  CHECK_H5(H5Awrite(att_id, h5_type<Type>(), static_cast<const void*>(&value)),
-           "Failed to write value: " << value);
+  if constexpr (std::is_same_v<Type, std::string>) {
+    const auto cstr = value.c_str();
+    CHECK_H5(H5Awrite(att_id, h5_type<Type>(), &cstr),
+             "Failed to write value: " << value);
+  } else {
+    CHECK_H5(
+        H5Awrite(att_id, h5_type<Type>(), static_cast<const void*>(&value)),
+        "Failed to write value: " << value);
+  }
   CHECK_H5(H5Aclose(att_id), "Failed to close attribute '" << name << "'");
   CHECK_H5(H5Sclose(space_id), "Unable to close dataspace");
 }

Note that we don't currently store any other single string attributes in H5 files, maybe because of issues like this. Even the input file is stores as single-element vector, which seems wrong. Maybe with this fix the input file can be stored as a plain string?

@nilsvu
Copy link
Member

nilsvu commented May 8, 2024

Can you also add Python bindings for the CCE file type? I expect we'll want to access those in Python quite a lot.

@knelli2
Copy link
Contributor Author

knelli2 commented May 9, 2024

  1. Rebased
  2. Added new commit for writing string attributes using C-style string (did not change current uses like in input_source. That can be done in a different PR)
  3. Squashed fixup into "Add Cce" commit
  4. Added new commit that binds the Cce h5 file in python

nilsdeppe
nilsdeppe previously approved these changes May 9, 2024
Comment on lines 476 to 480
if constexpr (std::is_same_v<Type, std::string>) {
const auto cstr = value.c_str();
CHECK_H5(H5Awrite(att_id, h5_type<Type>(), &cstr),
"Failed to write value: " << value);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm very confused. This seems to work, but every example I can find online passes this function a char *, while this passes a char **. What's going on?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried removing the & and got a seg fault

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, this is the form that seems to work. I'm just wondering if you know of any documentation that actually explains how to use the function or an explanation of why every example I can find is apparently wrong. SpEC does the same thing as this, which makes me happier.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No I don't know of any documentation for this. I just copied this from Nils Vu's patch he sent, and didn't question it.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can't find any explicit HDF5 documentation on this. A couple examples I found online actually don't write the attribute out because they use H5S_NULL instead of H5S_SCALAR. Below is an example code if you want to try digging into this. My best guess is that because the H5S_SCALAR is a "scalar string" that you must pass a pointer to the string, i.e. a pointer to the pointer of characters. Here's the H5Awrite documentation https://docs.hdfgroup.org/hdf5/develop/group___h5_a.html#title37 with the void * buf being what I interpret as the "pointer to scalar" which if the scalar is of type char* then I guess you'd want a char**. In the below example, if I switch from H5S_SCALAR to H5S_NULL the code runs but h5dump ./Attributes.h5 shows the attribute isn't written. If I try H5S_SIMPLE everything explodes with cryptic HDF5 internals errors that I don't understand. So what I can tell you is that empirically, this is the right way to write a null-terminated variable-length string as an HDF5 attribute. It may be different for datasets, I don't remember. I do remember years ago when writing the helpers that getting strings working was a pain because online examples were essentially useless...

 #include <hdf5.h>
 #include <string>

 #define H5FILE_NAME "Attributes.h5"
 #define ATTR_NAME "VarLenAttr" /* Name of the string attribute */

 #define DATASET_RANK  1   /* Rank and size of the dataset  */
 #define DATASET_SIZE  7

 int main() {
   hid_t file, dataset, attr;       /* File and dataset identifiers */
   hid_t dataset_space, attr_space; /* Dataset's and Attribute's dataspace id */
   hid_t attr_type;                 /* Attribute type */
   herr_t status;                   /* Return value */

   hsize_t fdim[] = {DATASET_SIZE};
   std::string data{"ASD"};

   file = H5Fcreate(H5FILE_NAME, H5F_ACC_TRUNC, H5P_DEFAULT, H5P_DEFAULT);

   dataset_space = H5Screate(H5S_SIMPLE);
   status = H5Sset_extent_simple(dataset_space, DATASET_RANK, fdim, NULL);

   dataset = H5Dcreate2(file, "my_dataset", H5T_NATIVE_INT, dataset_space,
                        H5P_DEFAULT, H5P_DEFAULT, H5P_DEFAULT);

   attr_space = H5Screate(H5S_SCALAR);
   attr_type = H5Tcopy(H5T_C_S1);
   status = H5Tset_size(attr_type, H5T_VARIABLE);
   status = H5Tset_strpad(attr_type, H5T_STR_NULLTERM);
   status = H5Tset_cset(attr_type, H5T_CSET_ASCII);

   attr = H5Acreate2(dataset, ATTR_NAME, attr_type, attr_space, H5P_DEFAULT,
                     H5P_DEFAULT);

   const char* attr_string = data.c_str();
   status = H5Awrite(attr, attr_type, &attr_string);
   H5Aclose(attr);
   H5Tclose(attr_type);
   H5Sclose(attr_space);

   H5Fclose(file);
 }

Is this enough for you to be okay with this part of the diff or is there some explicit change request you still have on this part?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the code. Yes, I don't think anything needs to change for this. I just spent a lot of time confused about how the interface worked and hoped someone else knew better than me.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're welcome! Okay, sonuds good! I tried digging into the HDF5 internals but got stuck where they try to use structs with function pointer members as a way of mocking classes: https://github.com/HDFGroup/hdf5/blob/18321dee40b21e19c95bf0ef2f61e3ec71233fe9/src/H5VLconnector.h#L837 So I never got to the final "write the string to disk" stage since I have no idea where the "constructor" for this class is :(

"Failed to write value: " << value);
if constexpr (std::is_same_v<Type, std::string>) {
const auto cstr = value.c_str();
CHECK_H5(H5Awrite(att_id, h5_type<Type>(), &cstr),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From the h5_type docs: "\note For strings, the returned type must be released with H5Tclose"

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems like it isn't always done in the Helpers.cpp

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, read_value_attribute also gets it wrong. The other calls aren't strings.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will is correct, this is a latent bug in the other code as well. It only matters for strings since we set some properties on the string to make it nicer to deal with.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok so shall I change it here, or do a follow-up?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fix it in this PR.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Squashed it in to the "Write H5" commit since it was only a few lines

This was found to cause issues on MacOS when trying to write a
single string to an attribute.
@nilsdeppe nilsdeppe merged commit a85cb65 into sxs-collaboration:develop May 11, 2024
22 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants