Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[PROTOTYPE] Add fallback to legacy scan implementation for CPU devices and devices that lack size 32 sub-groups #1749

Merged

Conversation

mmichel11
Copy link
Contributor

This PR adds a fallback from reduce-then-scan to the legacy scan implementation in two separate cases:

  • As a fallback if the device does not support size 32 sub-groups. We explicitly require this size and must fallback if it's not supported.
  • For CPU devices. The sub-group level operations do not translate well to CPU devices and the legacy implementation is significantly faster on CPUs.

mmichel11 and others added 22 commits July 24, 2024 11:20
Test is currently compiling. None of the real device code has been
integrated, but the preliminary host code and general structure has been defined.

Signed-off-by: Matthew Michel <matthew.michel@intel.com>
…implementation

Signed-off-by: Matthew Michel <matthew.michel@intel.com>
Signed-off-by: Matthew Michel <matthew.michel@intel.com>
… the init computation

Signed-off-by: Matthew Michel <matthew.michel@intel.com>
Signed-off-by: Matthew Michel <matthew.michel@intel.com>
This PR changes the two pass algorithm to be more generalized for use with other scan-like algorithms like copy_if.

This PR adds copy_if as an example

---------

Signed-off-by: Dan Hoeflinger <dan.hoeflinger@intel.com>
Signed-off-by: Matthew Michel <matthew.michel@intel.com>
Co-authored-by: Adam Fidel <110841220+adamfidel@users.noreply.github.com>
Co-authored-by: Matthew Michel <106704043+mmichel11@users.noreply.github.com>
Signed-off-by: Dan Hoeflinger <dan.hoeflinger@intel.com>
Signed-off-by: Dan Hoeflinger <dan.hoeflinger@intel.com>
Signed-off-by: Matthew Michel <matthew.michel@intel.com>
…:tuple in zip_iterator.pass (#1714)

* Fix for zip_iterator.pass in copy_if assignment

Signed-off-by: Matthew Michel <matthew.michel@intel.com>

* Add similar fix to __simple_write_to_idx

Signed-off-by: Matthew Michel <matthew.michel@intel.com>

---------

Signed-off-by: Matthew Michel <matthew.michel@intel.com>
Signed-off-by: Dan Hoeflinger <dan.hoeflinger@intel.com>
…last active subgroup otherwise

Signed-off-by: Matthew Michel <matthew.michel@intel.com>
Signed-off-by: Matthew Michel <matthew.michel@intel.com>
Signed-off-by: Dan Hoeflinger <dan.hoeflinger@intel.com>
We fallback to the old implementation if the device does not support
sub-group sizes of 32, or if the device is a CPU target.

Signed-off-by: Matthew Michel <matthew.michel@intel.com>
Signed-off-by: Matthew Michel <matthew.michel@intel.com>
…gacy_scan_fallback

Signed-off-by: Matthew Michel <matthew.michel@intel.com>
Signed-off-by: Matthew Michel <matthew.michel@intel.com>
@mmichel11 mmichel11 marked this pull request as draft July 31, 2024 16:24
@mmichel11
Copy link
Contributor Author

mmichel11 commented Jul 31, 2024

Moving back to draft. Some issues occurred when merging with the target branch.

Update: The issues have been fixed.

…cessary check

Signed-off-by: Matthew Michel <matthew.michel@intel.com>
Signed-off-by: Matthew Michel <matthew.michel@intel.com>
@mmichel11 mmichel11 marked this pull request as ready for review July 31, 2024 16:40
Co-authored-by: Dan Hoeflinger <109972525+danhoeflinger@users.noreply.github.com>
@@ -769,6 +770,15 @@ class __static_monotonic_dispatcher<::std::integer_sequence<::std::uint16_t, _X,
}
};

template <typename _ExecutionPolicy, typename _SizeType>
bool
__supports_sub_group_size(const _ExecutionPolicy& __exec, _SizeType __target_size)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since the only usage of __target_size in this function casts it to std::size_t, would it make sense to take the parameter by std::size_t directly instead of using a template?

Suggested change
__supports_sub_group_size(const _ExecutionPolicy& __exec, _SizeType __target_size)
__supports_sub_group_size(const _ExecutionPolicy& __exec, std::size_t __target_size)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point. I've added this and removed the static_cast below.

Copy link
Contributor

@danhoeflinger danhoeflinger left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

one minor nitpick, otherwise, I think LGTM

@@ -1052,6 +1064,23 @@ __parallel_copy_if(oneapi::dpl::__internal::__device_backend_tag __backend_tag,
oneapi::dpl::unseq_backend::__no_init_value<_Size>{},
/*_Inclusive=*/std::true_type{});
}
else
{
using _ReduceOp = ::std::plus<_Size>;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

::std->std

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

Signed-off-by: Matthew Michel <matthew.michel@intel.com>
Signed-off-by: Matthew Michel <matthew.michel@intel.com>
Copy link
Contributor

@danhoeflinger danhoeflinger left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@mmichel11 mmichel11 merged commit cdd7ceb into dev/shared/reduce_then_scan_impl Jul 31, 2024
@mmichel11 mmichel11 deleted the dev/mmichel11/legacy_scan_fallback branch July 31, 2024 21:58
danhoeflinger added a commit that referenced this pull request Aug 5, 2024
…s and devices that lack size 32 sub-groups (#1749)


Signed-off-by: Matthew Michel <matthew.michel@intel.com>
Signed-off-by: Dan Hoeflinger <dan.hoeflinger@intel.com>
Co-authored-by: Adam Fidel <adam.fidel@intel.com>
Co-authored-by: Dan Hoeflinger <109972525+danhoeflinger@users.noreply.github.com>
Co-authored-by: Adam Fidel <110841220+adamfidel@users.noreply.github.com>
Co-authored-by: Dan Hoeflinger <dan.hoeflinger@intel.com>
danhoeflinger added a commit that referenced this pull request Aug 6, 2024
…s and devices that lack size 32 sub-groups (#1749)


Signed-off-by: Matthew Michel <matthew.michel@intel.com>
Signed-off-by: Dan Hoeflinger <dan.hoeflinger@intel.com>
Co-authored-by: Adam Fidel <adam.fidel@intel.com>
Co-authored-by: Dan Hoeflinger <109972525+danhoeflinger@users.noreply.github.com>
Co-authored-by: Adam Fidel <110841220+adamfidel@users.noreply.github.com>
Co-authored-by: Dan Hoeflinger <dan.hoeflinger@intel.com>
danhoeflinger added a commit that referenced this pull request Aug 6, 2024
…s and devices that lack size 32 sub-groups (#1749)


Signed-off-by: Matthew Michel <matthew.michel@intel.com>
Signed-off-by: Dan Hoeflinger <dan.hoeflinger@intel.com>
Co-authored-by: Adam Fidel <adam.fidel@intel.com>
Co-authored-by: Dan Hoeflinger <109972525+danhoeflinger@users.noreply.github.com>
Co-authored-by: Adam Fidel <110841220+adamfidel@users.noreply.github.com>
Co-authored-by: Dan Hoeflinger <dan.hoeflinger@intel.com>
danhoeflinger added a commit that referenced this pull request Aug 7, 2024
…s and devices that lack size 32 sub-groups (#1749)


Signed-off-by: Matthew Michel <matthew.michel@intel.com>
Signed-off-by: Dan Hoeflinger <dan.hoeflinger@intel.com>
Co-authored-by: Adam Fidel <adam.fidel@intel.com>
Co-authored-by: Dan Hoeflinger <109972525+danhoeflinger@users.noreply.github.com>
Co-authored-by: Adam Fidel <110841220+adamfidel@users.noreply.github.com>
Co-authored-by: Dan Hoeflinger <dan.hoeflinger@intel.com>
danhoeflinger added a commit that referenced this pull request Aug 7, 2024
…s and devices that lack size 32 sub-groups (#1749)


Signed-off-by: Matthew Michel <matthew.michel@intel.com>
Signed-off-by: Dan Hoeflinger <dan.hoeflinger@intel.com>
Co-authored-by: Adam Fidel <adam.fidel@intel.com>
Co-authored-by: Dan Hoeflinger <109972525+danhoeflinger@users.noreply.github.com>
Co-authored-by: Adam Fidel <110841220+adamfidel@users.noreply.github.com>
Co-authored-by: Dan Hoeflinger <dan.hoeflinger@intel.com>
danhoeflinger added a commit that referenced this pull request Aug 8, 2024
…s and devices that lack size 32 sub-groups (#1749)


Signed-off-by: Matthew Michel <matthew.michel@intel.com>
Signed-off-by: Dan Hoeflinger <dan.hoeflinger@intel.com>
Co-authored-by: Adam Fidel <adam.fidel@intel.com>
Co-authored-by: Dan Hoeflinger <109972525+danhoeflinger@users.noreply.github.com>
Co-authored-by: Adam Fidel <110841220+adamfidel@users.noreply.github.com>
Co-authored-by: Dan Hoeflinger <dan.hoeflinger@intel.com>
danhoeflinger added a commit that referenced this pull request Aug 8, 2024
…s and devices that lack size 32 sub-groups (#1749)


Signed-off-by: Matthew Michel <matthew.michel@intel.com>
Signed-off-by: Dan Hoeflinger <dan.hoeflinger@intel.com>
Co-authored-by: Adam Fidel <adam.fidel@intel.com>
Co-authored-by: Dan Hoeflinger <109972525+danhoeflinger@users.noreply.github.com>
Co-authored-by: Adam Fidel <110841220+adamfidel@users.noreply.github.com>
Co-authored-by: Dan Hoeflinger <dan.hoeflinger@intel.com>
danhoeflinger added a commit that referenced this pull request Aug 8, 2024
…s and devices that lack size 32 sub-groups (#1749)


Signed-off-by: Matthew Michel <matthew.michel@intel.com>
Signed-off-by: Dan Hoeflinger <dan.hoeflinger@intel.com>
Co-authored-by: Adam Fidel <adam.fidel@intel.com>
Co-authored-by: Dan Hoeflinger <109972525+danhoeflinger@users.noreply.github.com>
Co-authored-by: Adam Fidel <110841220+adamfidel@users.noreply.github.com>
Co-authored-by: Dan Hoeflinger <dan.hoeflinger@intel.com>
danhoeflinger added a commit that referenced this pull request Aug 14, 2024
…s and devices that lack size 32 sub-groups (#1749)


Signed-off-by: Matthew Michel <matthew.michel@intel.com>
Signed-off-by: Dan Hoeflinger <dan.hoeflinger@intel.com>
Co-authored-by: Adam Fidel <adam.fidel@intel.com>
Co-authored-by: Dan Hoeflinger <109972525+danhoeflinger@users.noreply.github.com>
Co-authored-by: Adam Fidel <110841220+adamfidel@users.noreply.github.com>
Co-authored-by: Dan Hoeflinger <dan.hoeflinger@intel.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants