[hdf5] Reading with BytePointer stops on 0 values (zeros) #1311
Comments
That sounds like an issue with the API of HDF5? I mean, a different function gets called, right? |
I'm having trouble tracking it down. That read call is:
So, they are calling the same method since they both extend Pointer. I'm not sure what lives on the other side of that native call. |
Oh, actually, there is also a
So that one is a different method. All other pointers would be calling the Pointer one. |
Would the |
Ah, yeah. In hdf5's
and
So when we use a BytePointer, we are accidentally calling the Note this would also be an issue for the corresponding |
Btw, until this gets fixed, for writing/reading all you have to do is cast to Pointer to get it to call the correct method, like this:
No need to muck around with ShortPointer or anything. |
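To make the cast workaround concrete, here is a minimal, self-contained sketch of how Java chooses between overloads at compile time. The nested `Pointer`, `BytePointer`, and `FakeDataSet` classes are hypothetical stand-ins written for this example, not the real JavaCPP or HDF5 classes, and the "string" overload only models what the thread suggests the `BytePointer` variant does.

```java
class OverloadDemo {
    // Hypothetical stand-ins for JavaCPP's Pointer and BytePointer.
    static class Pointer {}
    static class BytePointer extends Pointer {}

    // Hypothetical stand-in for a dataset class with two read() overloads.
    static class FakeDataSet {
        // General overload, intended for raw buffers of any element type.
        String read(Pointer buf) { return "raw"; }
        // More specific overload, standing in for a string-oriented variant.
        String read(BytePointer buf) { return "string"; }
    }

    // The argument's static type is BytePointer, so the specific overload wins.
    static String callDirect() {
        BytePointer p = new BytePointer();
        return new FakeDataSet().read(p);
    }

    // The cast changes the static type to Pointer, selecting the general overload.
    static String callWithCast() {
        BytePointer p = new BytePointer();
        return new FakeDataSet().read((Pointer) p);
    }

    public static void main(String[] args) {
        System.out.println(callDirect());   // prints "string"
        System.out.println(callWithCast()); // prints "raw"
    }
}
```

Overload selection depends on the compile-time type of the argument, not the runtime type of the object, which is why the cast is enough and no data needs to be copied into another pointer type.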
Yes, that's how we need to do it. What kind of fix do you have in mind? |
So I was thinking of re-mapping the |
Yeah, it should be possible; see https://github.com/bytedeco/javacpp/wiki/Mapping-Recipes#specifying-names-to-use-in-java, but that's a problem with the C++ API. If you call |
In my opinion, that's something you should be reporting upstream: |
Good thing we have a good workaround, I could see that one taking a while to get changed 😆. |
I reported a bug there: https://jira.hdfgroup.org/browse/HDFVIEW-297 |
@calvertdw why not create a GitHub issue here: https://github.com/HDFGroup/hdf5/issues ? |
Great point. The other was getting no activity. Done! |
We can add those as helper methods if you want. Sounds good? |
Are the strings encoded according to Java's modified UTF-8 encoding? https://docs.oracle.com/javase/6/docs/api/java/io/DataInput.html#modified-utf-8 If so, isn't the correct way to read this via the JNI function That's how |
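As background for the encoding question, the sketch below compares standard UTF-8 with Java's modified UTF-8 for a string containing U+0000, using only the JDK's `DataOutputStream.writeUTF`. It says nothing about what HDF5 or JavaCPP do internally; the class and method names are made up for this example.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class ModifiedUtf8Demo {
    // Encodes with writeUTF (modified UTF-8) and drops its 2-byte length prefix.
    static byte[] modifiedUtf8(String s) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (DataOutputStream data = new DataOutputStream(out)) {
            data.writeUTF(s);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        byte[] all = out.toByteArray();
        return Arrays.copyOfRange(all, 2, all.length);
    }

    // Encodes with standard UTF-8, where U+0000 is a single zero byte.
    static byte[] standardUtf8(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String s = "a\0b";
        System.out.println(Arrays.toString(standardUtf8(s))); // [97, 0, 98]
        System.out.println(Arrays.toString(modifiedUtf8(s))); // [97, -64, -128, 98]
    }
}
```

Modified UTF-8 encodes U+0000 as the two bytes 0xC0 0x80, so an encoded string never contains an embedded zero byte. Arbitrary binary data has no such guarantee.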
@mkitti You are misunderstanding the issue I was having. I was trying to read raw data (encoded JPEG images) from an HDF5 file. I had a The problem is that to read/write literally just raw data, HDF5 is making that risky by inviting the opportunity to accidentally call a version of |
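The symptom in the issue title can be illustrated without HDF5. The sketch below assumes the string-oriented path treats the buffer as a NUL-terminated C string, which is what "stops on 0 values" suggests; it is a pure-Java model, not HDF5's actual implementation, and the sample bytes and method names are hypothetical.

```java
import java.util.Arrays;

class NulTruncationDemo {
    // C-style strlen: counts bytes up to, but not including, the first zero byte.
    static int cStringLength(byte[] buf) {
        int n = 0;
        while (n < buf.length && buf[n] != 0) {
            n++;
        }
        return n;
    }

    // Models a string-style read: the length is inferred from the first zero byte.
    static byte[] readAsCString(byte[] src) {
        return Arrays.copyOf(src, cStringLength(src));
    }

    // Models a raw read: the caller states how many bytes to copy.
    static byte[] readAsRaw(byte[] src, int count) {
        return Arrays.copyOf(src, count);
    }

    public static void main(String[] args) {
        // Hypothetical binary payload with an embedded zero byte.
        byte[] data = {(byte) 0xFF, (byte) 0xD8, 0x00, 0x10, 0x4A};
        System.out.println(readAsCString(data).length);          // 2
        System.out.println(readAsRaw(data, data.length).length); // 5
    }
}
```

Under that assumption, any binary payload with a zero byte is silently cut short on the string path, while the raw path returns every byte the caller asked for.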
I think HDFGroup/hdf5#3083 still fixes the situation. Even if you read it as a |
@mkitti Sounds like a fix to me. When does it get released? |
Well it got merged into the main development branch, so now it is just a matter of the HDF5 release schedule in their README. |
It's still not the fix I was hoping for, because the correct way to read/write bytes is to use the standard |
Even if we keep using the string version, I think users would still sometimes get errors, judging from the changelog of HDFGroup/hdf5#3083:
A JPEG image is variable length, after all. |
It still sounds like an issue with HDF5 to me. The read() function has no idea how much memory was allocated in the pointer it was given, so that's a safety issue. There should be a size parameter somewhere to indicate the maximum size. That's not an issue with std::string since that can allocate as much memory as we need. |
How so? Its length is not determined by some token in the byte sequence. When you write the bytes of the image, you tell HDF5 how many bytes you want to write. |
The memory space argument is meant to indicate how many bytes have been allocated. See the docs for the C function https://docs.hdfgroup.org/hdf5/v1_14/group___h5_d.html#ga8287d5a7be7b8e55ffeff68f7d26811c |
OK, so conversely, why do we need that memory space argument for std::string? |
You do not need it in C++. It defaults to DataSpace::ALL. That appears to get translated to this via JavaCPP: public void read(@StdString @ByRef However, you can specify a file dataspace if you only want to read some of the bytes and copy it into a specific location of a buffer via memory space. |
BTW, does it make sense to map |
Perhaps |
There isn't anything interesting about |
It seems like JavaCPP should have a new type to match with a |
Java has
|
No, it's not difficult. I've added support for that in commit bytedeco/javacpp@7227ec6, so we can add something like that to HDF5's presets: infoMap.put(new Info("std::basic_string<char>", "std::string", "H5std_string").pointerTypes("H5std_string").define()); But it's not exactly a nice (efficient) interface for a string and that replaces all instances of
We can access |
That looks great! That sounds like it will absolutely solve this issue, since this is about saving raw data with HDF5 and avoiding accidental calls to the |
Looking at this again, it seems like everything works like we want when casting explicitly to |
Yes. Thank you!! |
When 0s are in the data, I find I must use a ShortPointer to get the full data for some reason. I'm not sure why. The top test case passes: native bytes can be written and read correctly. The second test fails.