Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: SoA layout for vector #95

Merged
merged 2 commits into from
Apr 12, 2024

Conversation

niermann999
Copy link
Contributor

@niermann999 niermann999 commented Mar 13, 2023

This adds a new vector wrapper class that allows for an SoA layout, in particular, to enable us to use the Vc::Vector type efficiently. The new storage/vector keeps an array-like data structure, that holds the vector elements (say, x, y, z). Both the array type, as well as the value type of the elements are templated (storage/vector< algebraic vector dim, aos/soa value type (scalar vs e.g SIMD vector), array-like storage for the algebraic vector elements >), so that combinations like these are possible:

  • 3-dim AoS std::array based vector (not necessary, since this would essentially duplicate the array plugin, while begin more complicated): storage::vector<3, scalar_t, std::array>
  • 3-dim AoS vertical vectorized (Not there, yet, but would re-implement the current vc_vc plugin, while making the extra vc_array4 wrapper superfluous): storage::vector<3, scalar_t, Vc::SimdArray>
  • 3-dim SoA of size N, std::array based: storage::vector<3, std::array<scalar_t, N>, std::array>
  • 3-dim vetorized SoA (in this PR): storage::vector<3, Vc::Vector<scalar_t>, std::array>

Also adds benchmarks to compare to the std::array and Eigen AoS plugins. The benchmarks contain a base class that holds the basic configuration (e.g. number of warmup samples) and a derived vector benchmark class that holds the samples of random inititialized vectors and is templated on the kind of vector operation to be benchmarked (unary and binary are possible). The benchmarks for the getter namespace also make use of the vector benchmark class. The benchmarks for single and double precision are then instantiated per plugin, including a prescription on how to produce a random vector.

@niermann999
Copy link
Contributor Author

niermann999 commented Mar 13, 2023

This is still VERY WIP

@niermann999 niermann999 marked this pull request as ready for review March 16, 2023 13:36
@niermann999
Copy link
Contributor Author

Since this PR is already huge, the SoA layout for std::array will be done in a separate PR

Copy link
Member

@krasznaa krasznaa left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are tons that I don't understand in here... 😦 Could you possibly post a more detailed explanation of the goals into the PR? SoA is of course good. I just don't see yet how this code here is really meant to work... 😕

.gitignore Outdated Show resolved Hide resolved
@niermann999
Copy link
Contributor Author

OK, I added a bit more of an explanation :)

Copy link
Member

@krasznaa krasznaa left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are the issues on macOS really insurmountable? I really don't like how that code is disabled right now. 😦

At the same time, let's discuss this face to face at some point. I'd love to understand better where this thing is meant to be heading.

math/cmath/CMakeLists.txt Outdated Show resolved Hide resolved
math/vc_soa/include/algebra/math/vc_soa.hpp Show resolved Hide resolved
storage/common/include/algebra/storage/array_operators.hpp Outdated Show resolved Hide resolved
@niermann999
Copy link
Contributor Author

Are the issues on macOS really insurmountable? I really don't like how that code is disabled right now.

Me neither, but I am not sure how to debug it. Basically, as soon as I call e.g. Vc::atan2 the release build on xcode throws Illegal while running the tests. The debug test is fine though

@niermann999 niermann999 force-pushed the vectorized-algebra branch 3 times, most recently from 998a7cb to c0b49af Compare September 21, 2023 12:08
@krasznaa
Copy link
Member

krasznaa commented Oct 7, 2023

P.S. I saw some "new" Eigen warnings from the CUDA build in my tests. Something to follow up on separately at one point. (Those may be only showing up on Windows, didn't test it rigorously.)

@krasznaa
Copy link
Member

krasznaa commented Oct 9, 2023

Note that unrelated to the templating issues I also had to make these changes:

diff --git a/benchmarks/common/include/benchmark/common/benchmark_vector.hpp b/benchmarks/common/include/benchmark/common/benchmark_vector.hpp
index 4000dda..2ddd055 100644
--- a/benchmarks/common/include/benchmark/common/benchmark_vector.hpp
+++ b/benchmarks/common/include/benchmark/common/benchmark_vector.hpp
@@ -71,7 +71,7 @@ struct vector_unaryOP_bm : public vector_bm<vector_t<scalar_t>> {
     const std::size_t n_warmup{this->m_cfg.n_warmup()};

     // Spin down before benchmark (Thread zero is counting the clock)
-    if (state.thread_index() == 0 and this->m_cfg.do_sleep()) {
+    if (state.thread_index() == 0 && this->m_cfg.do_sleep()) {
       std::this_thread::sleep_for(std::chrono::seconds(this->m_cfg.n_sleep()));
     }

@@ -113,7 +113,7 @@ struct vector_binaryOP_bm : public vector_bm<vector_t<scalar_t>> {
     const std::size_t n_warmup{this->m_cfg.n_warmup()};

     // Spin down before benchmark (Thread zero is counting the clock)
-    if ((state.thread_index() == 0) and this->m_cfg.do_sleep()) {
+    if ((state.thread_index() == 0) && this->m_cfg.do_sleep()) {
       std::this_thread::sleep_for(std::chrono::seconds(this->m_cfg.n_sleep()));
     }

MSVC just does not like the and keyword. 😦

@niermann999
Copy link
Contributor Author

Note that unrelated to the templating issues I also had to make these changes:

diff --git a/benchmarks/common/include/benchmark/common/benchmark_vector.hpp b/benchmarks/common/include/benchmark/common/benchmark_vector.hpp
index 4000dda..2ddd055 100644
--- a/benchmarks/common/include/benchmark/common/benchmark_vector.hpp
+++ b/benchmarks/common/include/benchmark/common/benchmark_vector.hpp
@@ -71,7 +71,7 @@ struct vector_unaryOP_bm : public vector_bm<vector_t<scalar_t>> {
     const std::size_t n_warmup{this->m_cfg.n_warmup()};

     // Spin down before benchmark (Thread zero is counting the clock)
-    if (state.thread_index() == 0 and this->m_cfg.do_sleep()) {
+    if (state.thread_index() == 0 && this->m_cfg.do_sleep()) {
       std::this_thread::sleep_for(std::chrono::seconds(this->m_cfg.n_sleep()));
     }

@@ -113,7 +113,7 @@ struct vector_binaryOP_bm : public vector_bm<vector_t<scalar_t>> {
     const std::size_t n_warmup{this->m_cfg.n_warmup()};

     // Spin down before benchmark (Thread zero is counting the clock)
-    if ((state.thread_index() == 0) and this->m_cfg.do_sleep()) {
+    if ((state.thread_index() == 0) && this->m_cfg.do_sleep()) {
       std::this_thread::sleep_for(std::chrono::seconds(this->m_cfg.n_sleep()));
     }

MSVC just does not like the and keyword. 😦

Yes, I realized :) It is already gone from the PR #97 , but I have to put it here as well

@niermann999
Copy link
Contributor Author

P.S. I saw some "new" Eigen warnings from the CUDA build in my tests. Something to follow up on separately at one point. (Those may be only showing up on Windows, didn't test it rigorously.)

Btw, what were those warnings about? I started seeing some in detray, too, although it had previously stopped after we rolled back to version 34.0 (and now started again)

Copy link
Member

@krasznaa krasznaa left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With the Windows/MSVC issues gone, I'll try to find some time in the evening to have a look at the Mac issue. 🤔

In the future at one point the benchmarking code could also be simplified/unified a bit, but for this PR this setup should be fine as is...

storage/common/include/algebra/storage/vector.hpp Outdated Show resolved Hide resolved
Copy link
Member

@krasznaa krasznaa left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code-wise I don't have objections. Just some smaller cmake updates, and I'm happy for this to go in. 😄

benchmarks/CMakeLists.txt Show resolved Hide resolved
storage/vc/CMakeLists.txt Outdated Show resolved Hide resolved
tests/CMakeLists.txt Outdated Show resolved Hide resolved
tests/CMakeLists.txt Outdated Show resolved Hide resolved
…n particular, to enable us to use the Vc::Vector type efficiently. The new storage/vector keeps an array-like data structure, that holds the vector elements (say, x, y, z). Both the array type, as well as the value type of the elements are templated (storage/vector< algebraic vector dim, aos/soa value type (scalar vs e.g SIMD vector), array-like storage for the algebraic vector elements >), so that combinations like these are possible:

                3-dim AoS std::array based vector (not necessary, since this would essentially duplicate the array plugin, while begin more complicated): storage::vector<3, scalar_t, std::array>
                3-dim AoS vertical vectorized (Not there, yet, but would re-implement the current vc_vc plugin, while making the extra vc_array4 wrapper superfluous): storage::vector<3, scalar_t, Vc::SimdArray>
                3-dim SoA of size N, std::array based: storage::vector<3, std::array<scalar_t, N>, std::array>
                3-dim vetorized SoA (in this PR): storage::vector<3, Vc::Vector<scalar_t>, std::array>

            Also adds benchmarks to compare to the std::array and Eigen AoS plugins. The benchmarks contain a base class that holds the basic configuration (e.g. number of warmup samples) and a derived vector benchmark class that holds the samples of random inititialized vectors and is templated on the kind of vector operation to be benchmarked (unary and binary are possible). The benchmarks for the getter namespace also make use of the vector benchmark class. The benchmarks for single and double precision are then instantiated per plugin, including a prescription on how to produce a random vector.
Copy link
Member

@krasznaa krasznaa left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Everything should be resolved now... 🤔

@niermann999 niermann999 merged commit 9e684de into acts-project:main Apr 12, 2024
24 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants