Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add compatibility with HIP #289

Merged
merged 105 commits into from
Jul 19, 2024
Merged

Add compatibility with HIP #289

merged 105 commits into from
Jul 19, 2024

Conversation

csbnw
Copy link
Contributor

@csbnw csbnw commented Jul 10, 2024

Description

Add support for HIP, C++ Heterogeneous-Compute Interface for Portability. From the user's perspective, a few API breaking changes related to the cu::Context class had to be made when using cudawrappers in CUDA mode (which is the default). To enable the HIP mode, set CUDAWRAPPERS_BACKEND=HIP.

This PR is still a draft, as some minor things still need to be cleaned up and/or added. Also, the documentation and changelog still needs to be updated. Moreover, we may want to test integration with existing applications in more detail and make changes if needed.

Related issues:

Instructions to review the pull request

  • Check that CHANGELOG.md has been updated if necessary

Hanno Spreeuw and others added 30 commits July 9, 2024 10:25
…his branch.

Committer: Hanno Spreeuw <h.spreeuw@esciencecenter.nl>
…re why it works, but in the shell where you build make sure to execute 'export HIPCXX=hipcc'.
…out CUDA, not about HIP. For now, it is just completely commented out.
…nplace-perl.sh script. So this commit fixes almost all errors appearing from 'make -C build' from within cudawrappers-usage-example-locally-installed. First 'typedef uint32_t cuuint32_t;' is needed because there is no HIP equivalent of 'cuuint32_t'. We will probably need to 'shield' this with an #ifdef __HIP__ since this statement also appears in cuda.h. Next, there are a number of casts from const void* to void* because of differences in API between CUDA and HIP, i.e. differences in the types or the arguments. So you could consider these temporary fixes until these API differences have been removed; corresponding issues have been submitted to the ROCM repos. Two snippets of code involving context - includes Ctx - have been commented out for now, since I am not sure yet how the HIP equivalent of CUDA context works. 'hipFunction_attribute attribute' raised an error, while 'hipFuncAttribute attribute' worked, so this is perhaps a flaw in the perl script. There were also a few istances where the error messages - from 'make -C build' - suggested a replacement, while the perl script should have taken care of this. Since they worked, these suggestions were copied into this commit. Finally, I did not yet find a HIP equivalent of cuStreamBatchMemOp, so the code snippet involving this command was commented out, for now.
…he line 'typedef uint32_t cuuint32_t;'. It is 'protected' by '#ifdef __HIP__....#endif' because the same statement is in cuda.h. After discussion with csbnw: The code snippets with calls to cuStreamBatchMemOp and cuStreamGetCtx will remain commented out (and should probably be removed), since csbnw does not see any use for them. The error from the code snippet with void memcpyHtoHAsync(...){checkCudaCall(hipMemcpyAsync....)} was sorted out by loostrum: the direction of the copy is inferred when cuMemcpyAsync is used, but hipMemcpyAsync apparently cannot do this and the _obj stream argument is misinterpreted as a directional argument. Adding hipMemcpyDefault - as a third argument - fixes this. The other modifications are pure formatting comming from 'pre-commit run -a'.
…Found HIP replacement for CU_CTX_SCHED_BLOCKING_SYNC; inserting hipDeviceScheduleBlockingSync. Commented out all tests we could not easily replace with its HIP equivalents.
…this is to be debugged

Because the test source files do not end in .hip, set_source_file_properties has to be used to enable compilation with hip instead of the host c++ compiler
remaining: the FFT calls in HIP are declared __host__ __device__, so we must do the same
however, doing so results in calls to host-only code (e.g. the FFT destructor), so a more intricate fix is required
* Added cu::Function::occupancyMaxActiveBlocksPerMultiprocessor()

* Let target_embed_source inline local header files (#263)

* Fix working directory for linking source into object file (#264)

* Update readthedocs configuration (#265)

* Fix target_embed_source when input file is a relative path (#266)

* Removed deprecated cu::Context::setSharedMemConfig (#267)

* Extend api (#268)

* Add Device::getUUID
* Add initial nvml wrapper

* Fix target_embed_source (#269)

* Fix target embed source paths (#270)

* Fix path given to ld in target_embed_source
* Fix missing arg in string replace

* Fix CI (#271)

* Add runtime subpackage
* Fix symbols for test_nvrtc

* [pre-commit.ci] pre-commit autoupdate (#259)

* Update clang-format: v17.0.6 → v18.1.4
* Replace cmakelang with clang-format and cmake-format

* Add nvrtc::findIncludePath (#272)

* Make nvrtc::findIncludePath inline (#273)

* Fix string replacements (#275)

* Fix inline_local_includes (#276)

* Make sure to select the correct INCLUDE_PATH (#277)

* Fix typo (#278)

* [pre-commit.ci] pre-commit autoupdate (#274)

updates:
- [github.com/pre-commit/mirrors-clang-format: v18.1.4 → v18.1.5](pre-commit/mirrors-clang-format@v18.1.4...v18.1.5)

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>

* Add inlined headers as dependencies in target_embed_source (#279)

* Add inlined headers as dependencies in target_embed_source

This ensure that the embedded sources are rebuild when a header is changed

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Update cmake/cudawrappers-helper.cmake

Co-authored-by: Bram Veenboer <veenboer@astron.nl>

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Bram Veenboer <veenboer@astron.nl>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Add HIP version of occupancyMaxActiveBlocksPerMultiprocessor

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Co-authored-by: John Romein <romein@astron.nl>
Co-authored-by: john-romein <39552858+john-romein@users.noreply.github.com>
Co-authored-by: Bram Veenboer <veenboer@astron.nl>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
add the HIP_PLATFORM_AMD__ define so check on that as well for typedefs
@csbnw
Copy link
Contributor Author

csbnw commented Jul 16, 2024

I have updated cuda-pcie to support the current state of this branch.

This repository serves as an example of how HIP support can be added to a repository that previously only supported CUDA (using cudawrappers).

First, add a CMake option to the project:

option(BUILD_HIP "Build using HIP" OFF)

You set the CUDAWRAPPERS_BACKEND and enable HIP in CMake:

if(BUILD_HIP)
  set(CUDAWRAPPERS_GIT_TAG "add-hip") # temporary, until !289 is merged
  set(CUDAWRAPPERS_BACKEND "HIP")
  enable_language(HIP)
else()
  set(CUDAWRAPPERS_GIT_TAG "main")
endif()

Maybe the enable_language(HIP) could be inherited from cudawrappers, but I didn't find a way to achieve this.

Getting cudawrappers is the same as before, but for now, we choose the add-hip branch:

include(FetchContent)
FetchContent_Declare(
  cudawrappers
  GIT_REPOSITORY https://github.com/nlesc-recruit/cudawrappers
  GIT_TAG ${CUDAWRAPPERS_GIT_TAG})
FetchContent_MakeAvailable(cudawrappers)

Let's assume we build a simple example, which is linked to cudawrappers:

add_executable(main main.cpp)
target_link_libraries(main cudawrappers::cu)

The only thing left to do is to make sure that HIP is enabled for main.cpp as it includes <cudawrappers/cu.hpp>:

if(BUILD_HIP)
  set_source_files_properties(main.cpp PROPERTIES LANGUAGE HIP)
endif()

There doesn't seem to be a way to set language properties to targets so inheriting from cudawrappers::cu doesn't seem possible.

To me, this is about as user-friendly as it gets. No changes are needed to main.cpp at all! Of course, for more exotic CUDA stuff that isn't supported in HIP, you must add some #if defined(__HIP__) guards.

@loostrum and @john-romein, do you agree that this is the way we want to support HIP compatibility (initially)?

@loostrum
Copy link
Member

Very nice overview, I think this is indeed the way to go for now.

@loostrum
Copy link
Member

There are now some function in the cu::Context class that are only compiled with CUDA. How about adding the [[deprecated]] attribute to them? This would however also require updating to C++14. This is probably problematic for Catch2, so it could also be done in a later PR.

I think it's a good idea to add [[deprecated]], but indeed let's first see if we can update to C++14 in a separate PR.

This was linked to issues Jul 18, 2024
include/cudawrappers/cu.hpp Outdated Show resolved Hide resolved
include/cudawrappers/cu.hpp Outdated Show resolved Hide resolved
include/cudawrappers/cu.hpp Show resolved Hide resolved
cmake/cudawrappers-dependencies.cmake Show resolved Hide resolved
* Add Hanno Spreeuw
* Add HIP keyword
@csbnw csbnw merged commit c55a698 into main Jul 19, 2024
7 checks passed
@csbnw csbnw deleted the add-hip branch July 19, 2024 13:31
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

HIP context support Add HIP backend
3 participants