Re-enable C API, depletion, and many tests #33

gridley · 2023-05-09T19:10:41Z

Many, but not all, of the tests work now. I've also marked a few of the tests as things we should skip for now. I'm submitting this as intermediate work on the way to getting pytest usable here again.

Also, we can now run depletion here, which is nice. Previously stuff would crash and burn on depletion problems.

Closes #26 and #32

jtramm · 2023-05-10T15:26:52Z

Thanks so much for fixing all these tests -- these are super useful fixes! Would be awesome to have most of the python tests working again. At first glance the fixes in the PR generally look good, though all the HPC systems at ANL are down this week, so I won't be able to do any testing with this until next week.

I did have a few questions though on the new Named template class:

Do we actually need to access string names on device?
If we don't need to access them, is there any harm in just leaving them as std::string, and assuming they are not copied correctly?
If we do need to access the names on device, is there any reason why the various classes that have a name field would need very different static name lengths? Why introduce an inheritance operation just to add a single field to a bunch of classes that would be ok sharing the same static length of that field? It seems like it would be more straightforward to just convert the std::string name_ fields to something like Name name_.
Adding to the above point, if name fields are needed on device, is there a reason they need to be trivially copyable? As most of the classes in OpenMC that have name_ fields are already not trivially copyable, it could make sense to make this field dynamically sized and copy it in the same manner as other dynamic fields, as this would be one less macro to worry about.

gridley · 2023-05-10T15:29:13Z

If we don't need to access them, is there any harm in just leaving them as std::string, and assuming they are not copied correctly?

Yes. std::string causes segfaults when its destructor runs if we have expanded an array of objects which contain std::string's has their destructors run. I'm not opposed to leaving pointers to names rather than statically sized names, but simply removing std::string and not having to worry about running its copy constructor when resizing arrays is pretty nice.

gridley · 2023-05-10T15:37:54Z

3. If we do need to access the names on device, is there any reason why the various classes that have a name field would need very different static name lengths? Why introduce an inheritance operation just to add a single field to a bunch of classes that would be ok sharing the same static length of that field? It seems like it would be more straightforward to just convert the std::string name_ fields to something like Name name_.

Firstly, its nice to have getters and setters that protect us in copying to a static array, so that we can either warn about truncation or change the underlying implementation of how names are stored, e.g. switch off to pointers to elsewhere.

To answer the first question, elements and isotopes can be expected to have at most 8 bytes of data, which keeps alignment of the struct as if we're adding one double precision number. It's nice to minimize how much extra space we take, so it's templated. On the other hand, thermal scattering data names are a bit longer so it makes sense to allow 16 characters. Lastly, cells and lattices are given longer names sometimes in the test suite, hence the choice of 32 characters.

Another reason that inheritance solves this problem cleanly is that we often require std::string to be used as our interchange format for character data, e.g. in the HDF5 interface. This way, if we switch off to a char array, the calls to HDF5 stuff, and std::string::operator+ still work as expected, since the name() method builds an std::string off the underlying array.

Lastly, inheritance in this case is nothing more than adding a data member. It's just more semantically meaningful.

gridley · 2023-05-10T15:39:45Z

4. s most of the classes in OpenMC that have name_ fields are already not trivially copyable

Now that std::string is removed from them, they are trivially copyable. This is why std::realloc now works with the various classes that inherit from Named now.

gridley · 2023-05-10T15:41:18Z

As for accessing the names on device, no, it's definitely not necessary. Just a plus for debugging down the road. We could replace these with a vector and call it a day just as easily.

jtramm · 2023-05-10T18:12:21Z

Ah, when I say trivially copyable, I was thinking of host -> device copying, rather than host -> host copying. In host -> device copying, we still have to manually copy the vector data individually, so most objects are not trivially copyable in that context. At some point, we should switch over to using OpenMP custom mappers so that the host -> device copying becomes trivial, but we do not utilize that feature yet.

To your last point -- why not just make the name_ field an openmc::vector<char>? As an optimization, this would also push the unused name_ field data out of the actual contiguous storage regions of the class objects, which could improve locality of accesses to their member fields.

gridley · 2023-05-10T18:17:04Z

Yeah, I agree with you on that. It's a nice micro optimization to have. I'll revise this PR to use that technique instead!

Still, vector isn't compatible with std::string out of the box, so it might make sense to have it wrapped in a class that makes it straightforward to interface with.

jtramm · 2023-05-10T18:19:48Z

Sounds good! Yeah, I can see the utility of defining a Name class as you do now and just having it be a normal Name name_; member field in the classes that use it.

gridley · 2023-05-10T18:22:28Z

So you definitely want to remove the inheritance approach? If we're going to refactor it like this, I'll probably name it TriviallyCopyableString or something like that.

gridley · 2023-06-12T22:42:49Z

Hey John, stuff should be good-to-go now! I've left it as a mixin class though, since we already were using member functions like set_name, it makes sense to do this. Otherwise I'll just be copying and pasting repeated code in various places. Inheritance in this case is nothing more than adding a data member and adding some methods.

jtramm · 2023-06-15T01:22:40Z

Sounds reasonable -- I'll take another look and will do some testing on this branch. Thanks again for putting these fixes together!

jtramm · 2023-06-28T17:10:45Z

I tried compiling with LLVM with A100 as a target, I'm getting compile time error:

clang++: error: unknown argument: '-fext-numeric-literals'

When I remove this argument from the CMakeLists.txt and try again, I get a link-time error:

[ 81%] Linking CXX shared library lib/libopenmc.so
nvlink error   : Undefined reference to 'strlen' in '/tmp/unity_0_cxx.cxx-nvptx64-nvidia-cuda-sm_80-b4e0c5.cubin'
nvlink error   : Undefined reference to 'isatty' in '/tmp/unity_0_cxx.cxx-nvptx64-nvidia-cuda-sm_80-b4e0c5.cubin'
nvlink error   : Undefined reference to 'exit' in '/tmp/unity_0_cxx.cxx-nvptx64-nvidia-cuda-sm_80-b4e0c5.cubin'
nvlink error   : Undefined reference to '_ZSt16__ostream_insertIcSt11char_traitsIcEERSt13basic_ostreamIT_T0_ES6_PKS3_l' in '/tmp/unity_0_cxx.cxx-nvptx64-nvidia-cuda-sm_80-b4e0c5.cubin'
nvlink error   : Undefined reference to '_ZSt24__throw_out_of_range_fmtPKcz' in '/tmp/unity_0_cxx.cxx-nvptx64-nvidia-cuda-sm_80-b4e0c5.cubin'
nvlink error   : Undefined reference to '_ZNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEE9_M_createERmm' in '/tmp/unity_0_cxx.cxx-nvptx64-nvidia-cuda-sm_80-b4e0c5.cubin'
nvlink error   : Undefined reference to '_ZNSo3putEc' in '/tmp/unity_0_cxx.cxx-nvptx64-nvidia-cuda-sm_80-b4e0c5.cubin'
nvlink error   : Undefined reference to '_ZNSo5flushEv' in '/tmp/unity_0_cxx.cxx-nvptx64-nvidia-cuda-sm_80-b4e0c5.cubin'
nvlink error   : Undefined reference to '_ZSt16__throw_bad_castv' in '/tmp/unity_0_cxx.cxx-nvptx64-nvidia-cuda-sm_80-b4e0c5.cubin'
nvlink error   : Undefined reference to '_ZNKSt5ctypeIcE13_M_widen_initEv' in '/tmp/unity_0_cxx.cxx-nvptx64-nvidia-cuda-sm_80-b4e0c5.cubin'
nvlink error   : Undefined reference to '_ZNKSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEE5rfindEcm' in '/tmp/unity_0_cxx.cxx-nvptx64-nvidia-cuda-sm_80-b4e0c5.cubin'
nvlink error   : Undefined reference to '_ZSt4cerr' in '/tmp/unity_0_cxx.cxx-nvptx64-nvidia-cuda-sm_80-b4e0c5.cubin'
clang: error: fatbinary command failed with exit code 255 (use -v to see invocation)
clang version 17.0.0 (https://github.com/llvm/llvm-project.git 091bfa76db64fbe96d0e53d99b2068cc05f6aa16)
Target: nvptx64-nvidia-cuda
Thread model: posix
InstalledDir: /soft/compilers/llvm/main-20230628/bin
clang: note: diagnostic msg: Error generating preprocessed source(s).
/soft/compilers/llvm/master-nightly/bin/clang-linker-wrapper: error: 'clang' failed
clang++: error: linker command failed with exit code 1 (use -v to see invocation)
make[2]: *** [CMakeFiles/libopenmc.dir/build.make:106: lib/libopenmc.so] Error 1
make[1]: *** [CMakeFiles/Makefile2:226: CMakeFiles/libopenmc.dir/all] Error 2
make: *** [Makefile:136: all] Error 2

gridley · 2023-06-28T17:19:46Z

Oh yeah, fext-numeric-literals is a gcc thing. Lets you write complex literals very easily, but we should move to the more portable approach of using namespace std::complex_literals;. Will include a commit for that.

On the linking, looks like gcc transitively is including a system header that LLVM is not. Thanks for pointing this out, will also test with LLVM. Or maybe it's trying to generate device code containing strlen. Hm. Will look into it.

jtramm · 2023-06-28T17:22:37Z

I re-ran without the unity build to perhaps narrow down the problem, it gives:

[  8%] Linking CXX shared library lib/libopenmc.so
nvlink error   : Undefined reference to '_ZNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEE9_M_createERmm' in '/tmp/mesh.cpp-nvptx64-nvidia-cuda-sm_80-1850e9.cubin'
nvlink error   : Undefined reference to '_ZN6openmc11fatal_errorERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEEi' in '/tmp/mesh.cpp-nvptx64-nvidia-cuda-sm_80-1850e9.cubin'
nvlink error   : Undefined reference to 'strlen' in '/tmp/mesh.cpp-nvptx64-nvidia-cuda-sm_80-1850e9.cubin'
clang: error: fatbinary command failed with exit code 255 (use -v to see invocation)
clang version 17.0.0 (https://github.com/llvm/llvm-project.git 091bfa76db64fbe96d0e53d99b2068cc05f6aa16)

gridley · 2023-12-08T19:13:25Z

Just want to say: I have not forgotten about this but have just had trouble debugging what's going on, will hopefully get to this soon.

gridley added 18 commits May 8, 2023 12:01

fix non unity compile

328a62f

let python API work again

74397f5

C API ability to add filters might work again now

34b9a75

fix C API tally extend

6b8e8e2

skip some tests that are expected to fail, for now

a45f1b8

reallocs added in C API for nuclide and filters

fd13291

fix gcc13 compile; use already-allocd mem in nuclide load

b7c1e7d

fix flood of warnings in gcc

43f6033

use aleady alloc mem for elements

78d5ce3

make material, thermaldata, wmp, nuclide be trivially copyable

597540d

have tally use static names

f221168

fix material reallocation

de91756

make more trivially copyable

e338ce9

fix memory leak in tally

04b6b75

use realloc for mesh extend

96a63b2

lengthen allowable tally name

e278397

fix 🐛 causing segfaults in Material

d912e76

avoid double free in realloc by resetting pointers

a7f3f01

gridley requested a review from amandalund as a code owner May 9, 2023 19:10

gridley added 2 commits June 12, 2023 18:32

use heap allocated trivially copyable names

ca69cee

dont construct string off nullptr

5a9b014

gridley closed this Jul 10, 2024

gridley deleted the small_fixes branch July 10, 2024 18:24

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Re-enable C API, depletion, and many tests #33

Re-enable C API, depletion, and many tests #33

gridley commented May 9, 2023

jtramm commented May 10, 2023

gridley commented May 10, 2023

gridley commented May 10, 2023 •

edited

Loading

gridley commented May 10, 2023

gridley commented May 10, 2023

jtramm commented May 10, 2023

gridley commented May 10, 2023

jtramm commented May 10, 2023

gridley commented May 10, 2023 •

edited

Loading

gridley commented Jun 12, 2023

jtramm commented Jun 15, 2023

jtramm commented Jun 28, 2023

gridley commented Jun 28, 2023 •

edited

Loading

jtramm commented Jun 28, 2023

gridley commented Dec 8, 2023

Re-enable C API, depletion, and many tests #33

Re-enable C API, depletion, and many tests #33

Conversation

gridley commented May 9, 2023

jtramm commented May 10, 2023

gridley commented May 10, 2023

gridley commented May 10, 2023 • edited Loading

gridley commented May 10, 2023

gridley commented May 10, 2023

jtramm commented May 10, 2023

gridley commented May 10, 2023

jtramm commented May 10, 2023

gridley commented May 10, 2023 • edited Loading

gridley commented Jun 12, 2023

jtramm commented Jun 15, 2023

jtramm commented Jun 28, 2023

gridley commented Jun 28, 2023 • edited Loading

jtramm commented Jun 28, 2023

gridley commented Dec 8, 2023

gridley commented May 10, 2023 •

edited

Loading

gridley commented May 10, 2023 •

edited

Loading

gridley commented Jun 28, 2023 •

edited

Loading