[PROTOTYPE] Generalized two pass algorithm and copy_if #1700
Signed-off-by: Dan Hoeflinger <dan.hoeflinger@intel.com>
This reverts commit da8574e.
…ges of __parallel_transform_scan_base) Signed-off-by: Dan Hoeflinger <dan.hoeflinger@intel.com>
Just some initial comments.
include/oneapi/dpl/pstl/hetero/dpcpp/parallel_backend_sycl_reduce_then_scan.h
Co-authored-by: Adam Fidel <110841220+adamfidel@users.noreply.github.com>
First few minor things I see.
include/oneapi/dpl/pstl/hetero/dpcpp/parallel_backend_sycl_reduce_then_scan.h
… type is not trivially copyable (#1707) Signed-off-by: Matthew Michel <matthew.michel@intel.com>
…a future already) Signed-off-by: Dan Hoeflinger <dan.hoeflinger@intel.com>
LGTM
LGTM.
…returns a future already)" This reverts commit c787091.
This PR changes the two pass algorithm to be more generalized for use with other scan-like algorithms like copy_if. This PR adds copy_if as an example --------- Signed-off-by: Dan Hoeflinger <dan.hoeflinger@intel.com> Signed-off-by: Matthew Michel <matthew.michel@intel.com> Co-authored-by: Adam Fidel <110841220+adamfidel@users.noreply.github.com> Co-authored-by: Matthew Michel <106704043+mmichel11@users.noreply.github.com>
Summary
This PR changes the two pass algorithm to be more generalized for use with other scan-like algorithms like copy_if.
This PR adds copy_if as an example. partition and unique patterns should follow easily. Set algorithms have not been evaluated, but should also be possible.
Structural changes
The "bridge" between blocks now uses a saved carry-out value from the previous block (where it used to depend on the output of the previous block). The output of the previous block only applies to transform_scan patterns, and not scan_copy algorithms. It also required different logic for inclusive and exclusive scans. The change to use the carry unifies this logic between inclusive and exclusive scans.
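To illustrate why a saved carry unifies the two cases, here is a minimal serial sketch (C++17). It is not the SYCL kernel from this PR: `blocked_scan`, its parameters, and the use of an explicit `init` value are assumptions made for illustration. In an exclusive scan, the last output of a block does not include that block's last input, so bridging from the previous block's output needs different logic for inclusive and exclusive scans. A carry that always holds the running total of everything processed so far works for both.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical serial model of the block "bridge". Each block starts from the
// carry saved by the previous block instead of reading that block's output.
template <bool Inclusive, typename T, typename BinaryOp>
std::vector<T> blocked_scan(const std::vector<T>& in, T init, BinaryOp op, std::size_t block_size)
{
    std::vector<T> out(in.size());
    T carry = init; // total of all inputs before the current block (assumed seed value)
    for (std::size_t begin = 0; begin < in.size(); begin += block_size)
    {
        const std::size_t end = std::min(begin + block_size, in.size());
        T running = carry;
        for (std::size_t i = begin; i < end; ++i)
        {
            if constexpr (Inclusive)
            {
                running = op(running, in[i]);
                out[i] = running; // output includes in[i]
            }
            else
            {
                out[i] = running; // output excludes in[i]
                running = op(running, in[i]);
            }
        }
        // The carry-out is computed the same way for both scan flavors.
        carry = running;
    }
    return out;
}
```

Only the order of the write and the accumulate differs between the two branches; the carry handed to the next block is identical.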
The following operations have been defined to encode a generalized scan-like pattern:
_GenReduceInput: a function which accepts the input range and index to generate the data needed by the main output used in the reduction operation (to calculate the global carries)
_GenScanInput: a function which accepts the input range and index to generate the data needed by the final scan and write operations, for scan patterns
_ScanPred: a unary function applied to the output of _GenScanInput to extract the component used in the scan, but not the part only required for the final write operation
_ReduceOp: a binary function which is used in the reduction and scan operations
_FinalOp: a function which accepts the output range, index, and output of _GenScanInput applied to the input range
For transform_scan, these are defined as follows:
_GenReduceInput: apply the transform to input at index and return the result
_GenScanInput: apply the transform to input at index and return the result
_ScanPred: no_op passthrough
_ReduceOp: user supplied binary scan operation
_FinalOp: simple write of the value to the output range at index
For copy_if these are defined as follows:
_GenReduceInput: apply user supplied predicate to input at index and return 1 for true, 0 for false
_GenScanInput: a tuple with 3 elements:
1) apply user supplied predicate to input at index and return 1 for true, 0 for false (contribution to count)
2) apply user supplied predicate to input at index (condition for final copy)
3) original value (value for final copy)
_ScanPred: std::get<0>(input)
_ReduceOp: std::plus
_FinalOp: if the second element of the tuple is true, write the third element to the output range using the index found in the first element of the tuple
Odds and ends
Makes the two pass scan pattern asynchronous, depending on __result_and_scratch_space for async deallocation
Changes the single workgroup copy_if to have a smaller threshold, and to use the __result_and_scratch_space struct to match return future type with the copy_if implementation.
Limits range to only launch non-empty workgroups (removed early exit for empty workgroups)
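To make the copy_if operation definitions under "Structural changes" concrete, here is a minimal serial sketch of how the five operations could fit together. All type and function names below (`gen_reduce_input`, `gen_scan_input`, `scan_pred`, `final_op`, `copy_if_sketch`) are hypothetical stand-ins, not the identifiers or signatures used in the PR. The sketch also assumes that the driver overwrites the tuple's first element with the exclusive running count so it can be used directly as the output index; the actual implementation may derive the index differently.

```cpp
#include <cstddef>
#include <functional>
#include <tuple>
#include <vector>

// _GenReduceInput stand-in: 1 if the predicate holds for in[idx], else 0.
template <typename Pred>
struct gen_reduce_input
{
    Pred pred;
    template <typename Range>
    std::size_t operator()(const Range& in, std::size_t idx) const
    {
        return pred(in[idx]) ? 1 : 0;
    }
};

// _GenScanInput stand-in: (count contribution, copy condition, original value).
template <typename Pred>
struct gen_scan_input
{
    Pred pred;
    template <typename Range>
    auto operator()(const Range& in, std::size_t idx) const
    {
        const bool keep = pred(in[idx]);
        return std::make_tuple(std::size_t(keep ? 1 : 0), keep, in[idx]);
    }
};

// _ScanPred stand-in: extract the component that takes part in the scan.
struct scan_pred
{
    template <typename Tuple>
    std::size_t operator()(const Tuple& t) const { return std::get<0>(t); }
};

// _FinalOp stand-in: conditional write. The element index is unused for copy_if.
struct final_op
{
    template <typename OutRange, typename Tuple>
    void operator()(OutRange& out, std::size_t /*idx*/, const Tuple& t) const
    {
        if (std::get<1>(t))
            out[std::get<0>(t)] = std::get<2>(t);
    }
};

// Serial, single-block driver (assumption: no sub-groups or work-groups).
template <typename T, typename Pred>
std::vector<T> copy_if_sketch(const std::vector<T>& in, Pred pred)
{
    gen_reduce_input<Pred> gen_reduce{pred};
    gen_scan_input<Pred> gen_scan{pred};
    scan_pred get_scan_value;
    std::plus<std::size_t> reduce_op; // _ReduceOp
    final_op write;

    // Pass 1 (reduce): with a single block this yields the total count,
    // which the sketch uses to size the output.
    std::size_t total = 0;
    for (std::size_t i = 0; i < in.size(); ++i)
        total = reduce_op(total, gen_reduce(in, i));

    // Pass 2 (scan and write): an exclusive running count gives each kept
    // element its position in the output.
    std::vector<T> out(total);
    std::size_t running = 0;
    for (std::size_t i = 0; i < in.size(); ++i)
    {
        auto t = gen_scan(in, i);
        const std::size_t contribution = get_scan_value(t);
        std::get<0>(t) = running; // scanned value replaces the raw contribution
        write(out, i, t);
        running = reduce_op(running, contribution);
    }
    return out;
}
```

For example, calling `copy_if_sketch` on `{1, 2, 3, 4, 5, 6}` with an "is even" predicate produces `{2, 4, 6}`. Swapping in different generator and final-write functors while keeping the same driver is the kind of reuse the generalized pattern is aiming for.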
TODO next:
Replace usages of __parallel_transform_scan_base to route to __parallel_transform_reduce_then_scan instead. Modify the calling code as necessary. This should cover the rest of the scan-like patterns. Prioritize partition, then unique, then finally, set algorithms as time allows.