Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Re-enable C API, depletion, and many tests #33

Closed
wants to merge 20 commits into from

Conversation

gridley
Copy link

@gridley gridley commented May 9, 2023

Many, but not all, of the tests work now. I've also marked a few of the tests as things we should skip for now. I'm submitting this as intermediate work on the way to getting pytest usable here again.

Also, we can now run depletion here, which is nice. Previously stuff would crash and burn on depletion problems.

Closes #26 and #32

@gridley gridley requested a review from amandalund as a code owner May 9, 2023 19:10
@jtramm
Copy link

jtramm commented May 10, 2023

Thanks so much for fixing all these tests -- these are super useful fixes! Would be awesome to have most of the python tests working again. At first glance the fixes in the PR generally look good, though all the HPC systems at ANL are down this week, so I won't be able to do any testing with this until next week.

I did have a few questions though on the new Named template class:

  1. Do we actually need to access string names on device?
  2. If we don't need to access them, is there any harm in just leaving them as std::string, and assuming they are not copied correctly?
  3. If we do need to access the names on device, is there any reason why the various classes that have a name field would need very different static name lengths? Why introduce an inheritance operation just to add a single field to a bunch of classes that would be ok sharing the same static length of that field? It seems like it would be more straightforward to just convert the std::string name_ fields to something like Name name_.
  4. Adding to the above point, if name fields are needed on device, is there a reason they need to be trivially copyable? As most of the classes in OpenMC that have name_ fields are already not trivially copyable, it could make sense to make this field dynamically sized and copy it in the same manner as other dynamic fields, as this would be one less macro to worry about.

@gridley
Copy link
Author

gridley commented May 10, 2023

If we don't need to access them, is there any harm in just leaving them as std::string, and assuming they are not copied correctly?

Yes. std::string causes segfaults when its destructor runs if we have expanded an array of objects which contain std::string's has their destructors run. I'm not opposed to leaving pointers to names rather than statically sized names, but simply removing std::string and not having to worry about running its copy constructor when resizing arrays is pretty nice.

@gridley
Copy link
Author

gridley commented May 10, 2023

3. If we do need to access the names on device, is there any reason why the various classes that have a name field would need very different static name lengths? Why introduce an inheritance operation just to add a single field to a bunch of classes that would be ok sharing the same static length of that field? It seems like it would be more straightforward to just convert the std::string name_ fields to something like Name name_.

Firstly, its nice to have getters and setters that protect us in copying to a static array, so that we can either warn about truncation or change the underlying implementation of how names are stored, e.g. switch off to pointers to elsewhere.

To answer the first question, elements and isotopes can be expected to have at most 8 bytes of data, which keeps alignment of the struct as if we're adding one double precision number. It's nice to minimize how much extra space we take, so it's templated. On the other hand, thermal scattering data names are a bit longer so it makes sense to allow 16 characters. Lastly, cells and lattices are given longer names sometimes in the test suite, hence the choice of 32 characters.

Another reason that inheritance solves this problem cleanly is that we often require std::string to be used as our interchange format for character data, e.g. in the HDF5 interface. This way, if we switch off to a char array, the calls to HDF5 stuff, and std::string::operator+ still work as expected, since the name() method builds an std::string off the underlying array.

Lastly, inheritance in this case is nothing more than adding a data member. It's just more semantically meaningful.

@gridley
Copy link
Author

gridley commented May 10, 2023

4. s most of the classes in OpenMC that have name_ fields are already not trivially copyable

Now that std::string is removed from them, they are trivially copyable. This is why std::realloc now works with the various classes that inherit from Named now.

@gridley
Copy link
Author

gridley commented May 10, 2023

As for accessing the names on device, no, it's definitely not necessary. Just a plus for debugging down the road. We could replace these with a vector and call it a day just as easily.

@jtramm
Copy link

jtramm commented May 10, 2023

Ah, when I say trivially copyable, I was thinking of host -> device copying, rather than host -> host copying. In host -> device copying, we still have to manually copy the vector data individually, so most objects are not trivially copyable in that context. At some point, we should switch over to using OpenMP custom mappers so that the host -> device copying becomes trivial, but we do not utilize that feature yet.

To your last point -- why not just make the name_ field an openmc::vector<char>? As an optimization, this would also push the unused name_ field data out of the actual contiguous storage regions of the class objects, which could improve locality of accesses to their member fields.

@gridley
Copy link
Author

gridley commented May 10, 2023

Yeah, I agree with you on that. It's a nice micro optimization to have. I'll revise this PR to use that technique instead!

Still, vector isn't compatible with std::string out of the box, so it might make sense to have it wrapped in a class that makes it straightforward to interface with.

@jtramm
Copy link

jtramm commented May 10, 2023

Sounds good! Yeah, I can see the utility of defining a Name class as you do now and just having it be a normal Name name_; member field in the classes that use it.

@gridley
Copy link
Author

gridley commented May 10, 2023

So you definitely want to remove the inheritance approach? If we're going to refactor it like this, I'll probably name it TriviallyCopyableString or something like that.

@gridley
Copy link
Author

gridley commented Jun 12, 2023

Hey John, stuff should be good-to-go now! I've left it as a mixin class though, since we already were using member functions like set_name, it makes sense to do this. Otherwise I'll just be copying and pasting repeated code in various places. Inheritance in this case is nothing more than adding a data member and adding some methods.

@jtramm
Copy link

jtramm commented Jun 15, 2023

Sounds reasonable -- I'll take another look and will do some testing on this branch. Thanks again for putting these fixes together!

@jtramm
Copy link

jtramm commented Jun 28, 2023

I tried compiling with LLVM with A100 as a target, I'm getting compile time error:

clang++: error: unknown argument: '-fext-numeric-literals'

When I remove this argument from the CMakeLists.txt and try again, I get a link-time error:

[ 81%] Linking CXX shared library lib/libopenmc.so
nvlink error   : Undefined reference to 'strlen' in '/tmp/unity_0_cxx.cxx-nvptx64-nvidia-cuda-sm_80-b4e0c5.cubin'
nvlink error   : Undefined reference to 'isatty' in '/tmp/unity_0_cxx.cxx-nvptx64-nvidia-cuda-sm_80-b4e0c5.cubin'
nvlink error   : Undefined reference to 'exit' in '/tmp/unity_0_cxx.cxx-nvptx64-nvidia-cuda-sm_80-b4e0c5.cubin'
nvlink error   : Undefined reference to '_ZSt16__ostream_insertIcSt11char_traitsIcEERSt13basic_ostreamIT_T0_ES6_PKS3_l' in '/tmp/unity_0_cxx.cxx-nvptx64-nvidia-cuda-sm_80-b4e0c5.cubin'
nvlink error   : Undefined reference to '_ZSt24__throw_out_of_range_fmtPKcz' in '/tmp/unity_0_cxx.cxx-nvptx64-nvidia-cuda-sm_80-b4e0c5.cubin'
nvlink error   : Undefined reference to '_ZNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEE9_M_createERmm' in '/tmp/unity_0_cxx.cxx-nvptx64-nvidia-cuda-sm_80-b4e0c5.cubin'
nvlink error   : Undefined reference to '_ZNSo3putEc' in '/tmp/unity_0_cxx.cxx-nvptx64-nvidia-cuda-sm_80-b4e0c5.cubin'
nvlink error   : Undefined reference to '_ZNSo5flushEv' in '/tmp/unity_0_cxx.cxx-nvptx64-nvidia-cuda-sm_80-b4e0c5.cubin'
nvlink error   : Undefined reference to '_ZSt16__throw_bad_castv' in '/tmp/unity_0_cxx.cxx-nvptx64-nvidia-cuda-sm_80-b4e0c5.cubin'
nvlink error   : Undefined reference to '_ZNKSt5ctypeIcE13_M_widen_initEv' in '/tmp/unity_0_cxx.cxx-nvptx64-nvidia-cuda-sm_80-b4e0c5.cubin'
nvlink error   : Undefined reference to '_ZNKSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEE5rfindEcm' in '/tmp/unity_0_cxx.cxx-nvptx64-nvidia-cuda-sm_80-b4e0c5.cubin'
nvlink error   : Undefined reference to '_ZSt4cerr' in '/tmp/unity_0_cxx.cxx-nvptx64-nvidia-cuda-sm_80-b4e0c5.cubin'
clang: error: fatbinary command failed with exit code 255 (use -v to see invocation)
clang version 17.0.0 (https://github.com/llvm/llvm-project.git 091bfa76db64fbe96d0e53d99b2068cc05f6aa16)
Target: nvptx64-nvidia-cuda
Thread model: posix
InstalledDir: /soft/compilers/llvm/main-20230628/bin
clang: note: diagnostic msg: Error generating preprocessed source(s).
/soft/compilers/llvm/master-nightly/bin/clang-linker-wrapper: error: 'clang' failed
clang++: error: linker command failed with exit code 1 (use -v to see invocation)
make[2]: *** [CMakeFiles/libopenmc.dir/build.make:106: lib/libopenmc.so] Error 1
make[1]: *** [CMakeFiles/Makefile2:226: CMakeFiles/libopenmc.dir/all] Error 2
make: *** [Makefile:136: all] Error 2

@gridley
Copy link
Author

gridley commented Jun 28, 2023

Oh yeah, fext-numeric-literals is a gcc thing. Lets you write complex literals very easily, but we should move to the more portable approach of using namespace std::complex_literals;. Will include a commit for that.

On the linking, looks like gcc transitively is including a system header that LLVM is not. Thanks for pointing this out, will also test with LLVM. Or maybe it's trying to generate device code containing strlen. Hm. Will look into it.

@jtramm
Copy link

jtramm commented Jun 28, 2023

I re-ran without the unity build to perhaps narrow down the problem, it gives:

[  8%] Linking CXX shared library lib/libopenmc.so
nvlink error   : Undefined reference to '_ZNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEE9_M_createERmm' in '/tmp/mesh.cpp-nvptx64-nvidia-cuda-sm_80-1850e9.cubin'
nvlink error   : Undefined reference to '_ZN6openmc11fatal_errorERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEEi' in '/tmp/mesh.cpp-nvptx64-nvidia-cuda-sm_80-1850e9.cubin'
nvlink error   : Undefined reference to 'strlen' in '/tmp/mesh.cpp-nvptx64-nvidia-cuda-sm_80-1850e9.cubin'
clang: error: fatbinary command failed with exit code 255 (use -v to see invocation)
clang version 17.0.0 (https://github.com/llvm/llvm-project.git 091bfa76db64fbe96d0e53d99b2068cc05f6aa16)

@gridley
Copy link
Author

gridley commented Dec 8, 2023

Just want to say: I have not forgotten about this but have just had trouble debugging what's going on, will hopefully get to this soon.

@gridley gridley closed this Jul 10, 2024
@gridley gridley deleted the small_fixes branch July 10, 2024 18:24
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

import openmc.lib fails
2 participants