
Add proof of concept #2

Merged: 64 commits merged into master from CI on Jul 24, 2024

Conversation

@mborland (Member)

Closes: #1
Based off of @jzmaddock's original PR here: boostorg/math#127

The rules seem to be more relaxed than in the original PR: assert and static_assert are allowed. Static variables are still not allowed, but I think that can be worked around by making global variables C++17 inline variables. The runner uses an NVIDIA Tesla T4 (arch 70), and the CI provides version 12.5.0 of the NVIDIA toolkit, which is the latest available. Locally I am using RHEL 9.4 with an RTX 3060 (arch 85) and the same version of the NVIDIA toolkit.
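
For illustration, a minimal sketch of the inline-variable workaround (the names below are hypothetical, not code from this PR): a constant that would otherwise be a function-local static is hoisted to a C++17 inline variable at namespace scope, which device code can use directly.

namespace detail {
// C++17 inline variable: one definition across translation units, no static local needed
inline constexpr double root_two_pi = 2.50662827463100050242;
}

__host__ __device__ inline double scale_by_root_two_pi(double x)
{
    return x / detail::root_two_pi;
}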

CC: @NAThompson, @ckormanyos

@@ -272,7 +271,7 @@ void test_spots(RealType)
    BOOST_CHECK_EQUAL(kurtosis_excess(arcsine_01), -1.5); // 3/2
    BOOST_CHECK_EQUAL(support(arcsine_01).first, 0); //
    BOOST_CHECK_EQUAL(range(arcsine_01).first, 0); //
-   BOOST_MATH_CHECK_THROW(mode(arcsine_01), std::domain_error); // Two modes at x_min and x_max, so throw instead.
+   BOOST_CHECK_THROW(mode(arcsine_01), std::domain_error); // Two modes at x_min and x_max, so throw instead.
Contributor:

Just checking... do we need these changes to the tests, or is it just a side effect of merging in an old branch?

Member Author:

I was having compilation issues with BOOST_MATH_CHECK_THROW, but the version from Boost.Test worked just fine. Since the tests already use Boost.Test for everything else, I figured it wasn't a big change.

@jzmaddock (Contributor)

Thanks for this Matt!

Some random thoughts that have occurred to me:

  • It's probably possible to build with b2 if we wanted to (though for a handful of CUDA tests it makes little difference). If we assume the compiler is command-line compatible with, say, gcc, then a user-config.jam that looks like:

using gcc : some-cuda-name :  "some-path/nvcc"  : <cxxflags>-fspecial-cuda-option ;

would do the trick; same for msvc.

@mborland (Member Author)

> Thanks for this Matt!
>
> Some random thoughts that have occurred to me:
>
>   • It's probably possible to build with b2 if we wanted to (though for a handful of CUDA tests it makes little difference). If we assume the compiler is command-line compatible with, say, gcc, then a user-config.jam that looks like:
>
> using gcc : some-cuda-name :  "some-path/nvcc"  : <cxxflags>-fspecial-cuda-option ;
>
> would do the trick; same for msvc.

Could be worth a try. Peter also pointed me to: https://github.com/tee3/boost-build-nvcc.

I didn't realize you could do that without the hardware installed.

I have this installed on my local machine as well. ChatGPT says I should just be able to replace __host__ __device__ with SYCL_EXTERNAL when SYCL_LANGUAGE_VERSION is defined. I'll have to pull out the Intel docs at some point to validate whether that's true. Boost.Compute was OpenCL, but since its last commit was over five years ago it only supports obsolete versions, which is unfortunate.
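
As a rough sketch of what that macro switch could look like (the macro name below is made up for illustration, and the SYCL_EXTERNAL substitution is only ChatGPT's suggestion until the Intel docs confirm it):

#if defined(__CUDACC__)
#  define GPU_ENABLED __host__ __device__   // CUDA: callable from host and device
#elif defined(SYCL_LANGUAGE_VERSION)
#  define GPU_ENABLED SYCL_EXTERNAL         // SYCL: mark as callable from device code
#else
#  define GPU_ENABLED                       // plain host build: expands to nothing
#endif

GPU_ENABLED double erf_impl(double x);      // hypothetical declaration shared by all backends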

>   • Thinking out loud here, on reflection, asserts aren't really that useful in a device context - I would presume folks would want everything optimised anyway, plus what does an assert on the GPU actually do?

The CUDA threads are similar to CPU fibers, so on an assertion only the thread gets terminated, not the entire process. Since all of the assertions go through our own macro, it would be easy to replace them with nothing when __CUDACC__ is defined.
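
A minimal sketch of that substitution, assuming an internal assertion macro of our own (the macro name below is illustrative):

#if defined(__CUDACC__)
#  define MATH_ASSERT(expr)                 // device builds: assertions compile away to nothing
#else
#  include <cassert>
#  define MATH_ASSERT(expr) assert(expr)    // host builds: forward to the standard assert
#endif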

@mborland (Member Author)

I have the one test running fine locally with CUDA:

1: Test command: /home/mborland/Documents/boost/build/stage/bin/boost_cuda_math-test_arcsine_cdf_double
1: Working Directory: /home/mborland/Documents/boost/build/libs/cuda-math/test
1: Test timeout computed to be: 1500
1: [Vector operation on 50000 elements]
1: CUDA kernel launch with 98 blocks of 512 threads
1: CUDA kernal done in 0.020311s
1: Test PASSED with calculation time: 0.000759265s
1: Done
1: Failed to deinitialize the device! error=driver shutting down
1/1 Test #1: run-boost_cuda_math-test_arcsine_cdf_double ...   Passed    0.32 sec
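
For reference, the "98 blocks of 512 threads" line follows the usual round-up launch arithmetic (a sketch, not code from this PR):

int num_elements = 50000;
int threads_per_block = 512;
int blocks_per_grid = (num_elements + threads_per_block - 1) / threads_per_block; // = 98
// kernel<<<blocks_per_grid, threads_per_block>>>(...);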

I'll see if changing some macros around enables support for SYCL.

@mborland (Member Author)

The beta function is now green with both SYCL and CUDA.

@mborland (Member Author)

Merging this since it's generally synchronized with the linked PR in math.

mborland merged commit d77a111 into master on Jul 24, 2024 (2 checks passed)
mborland deleted the CI branch on July 24, 2024 at 13:22

Successfully merging this pull request may close these issues: Add cmake based testing