Add reduce then scan algorithm for transform scan family #1762
Merged
Changes from 84 commits
88 commits
d07ada2
Checkpoint for reduce then scan integration
mmichel11 6244266
Introduce a parallel_backend_sycl_reduce_then_scan.h file to contain …
mmichel11 ccdb3b0
Port of kernels from two-pass scan KT branch
mmichel11 b465d84
Move the single-element last element storage for exclusive_scan after…
mmichel11 47360a0
Use init value type for init processing helper
adamfidel 3bf0602
Lower single work-group upper limit to 2048 elements (empirically found)
mmichel11 46c1a50
[PROTOTYPE] Generalized two pass algorithm and copy_if (#1700)
danhoeflinger 38c1b19
bug fix for global race on block carry-out
danhoeflinger 72d42c2
bugfix for elements to process in partial subgroup scan
danhoeflinger ecce124
[PROTOTYPE] Add unused temporary storage to single work-group scan to…
adamfidel 39ebdbe
Add temporary work-group size cap for FPGA_EMU testing
mmichel11 e4e30e1
[PROTOTYPE] Resolve conversion issues between internal tuple and std:…
mmichel11 3732c12
Use __dpl_sycl::__local_accessor
adamfidel 1745e0c
bugfix for overruning input for small non multiples of subgroup size
danhoeflinger 0921941
Check if a subgroup is active before fetching its carry and grab the …
mmichel11 8effa03
Comment out std::complex tests in scan_by_segment tests
mmichel11 c22231a
renaming __out as it seems to be a keyword
danhoeflinger 598f569
fixing device copyable for helpers
danhoeflinger 96b4fd2
Remove commented code that remained after rebase
mmichel11 8f759a3
[PROTOTYPE] Add fallback to legacy scan implementation for CPU device…
mmichel11 6da54e7
[PROTOTYPE] partition, unique families and ranges API (#1708)
danhoeflinger 13cecbf
fix windows issue regression __out
danhoeflinger 2daefab
fix for missing assigner in copy if pattern
danhoeflinger 4a83e1b
fix unique same mangled name problem
danhoeflinger 299b28b
[PROTOTYPE] Cleanup reduce-then-scan code (#1760)
mmichel11 8266882
restoring removed whitespace line
danhoeflinger 453d4ca
removing unnecessay storage type from kernel name
danhoeflinger 78e33ac
remove unique pattern family from reduce_then_scan
danhoeflinger 8267513
remove partition pattern family from reduce_then_scan
danhoeflinger d37746e
remove copy_if pattern family from reduce_then_scan
danhoeflinger 404c4ef
remove unnecessary barrier + cleanup unnecessary lazy value
danhoeflinger 060f649
clang format
danhoeflinger 0beebd1
codespell
danhoeflinger 90e6e62
restoring whitespace only changes
danhoeflinger ef5d377
removing unnecessary using
danhoeflinger bca0002
reverting formatting only changes
danhoeflinger 68c75e5
remove max and TODO
danhoeflinger dddb050
remove extra braces, add comments
danhoeflinger dc2de26
removing formatting only changes
danhoeflinger 165b1a5
removing unnecessary decay
danhoeflinger b9f0f4e
removing unused forwarding references
danhoeflinger bd144a4
clang-formatting
danhoeflinger d809051
adding comment and different threshold for different implementations
danhoeflinger 1647722
checking is_gpu rather than !is_cpu
danhoeflinger 0271b40
use dpl_bit_ceil
danhoeflinger 6cfc979
removing bad formatting only changes (::std::)
danhoeflinger cc03af1
fixing result_and_scratch_storage creation
danhoeflinger 98de25d
spelling
danhoeflinger 59933c1
fixing single pass scan KT from change to single-wg check
danhoeflinger 94e6e97
clarifying comment language
danhoeflinger ddaad55
refactor subgroup scan to reduce redundant code
danhoeflinger 1fc0f59
refactoring full block / full thread logic to remove redundancy
danhoeflinger a5753d0
passing storage container by ref
danhoeflinger 761ec51
__g -> __group_id
danhoeflinger 4d8c92d
__group_start_idx -> __group_start_id
danhoeflinger 55db83e
minor variable naming and helpers
danhoeflinger f3768bf
improving comments, removing unused variable
danhoeflinger f1361d2
__prefer_reduce_then_scan -> __is_gpu_with_sg_32
danhoeflinger b67b987
comment for temporary storage
danhoeflinger f3aec73
fold initial value into __carry_offset
danhoeflinger 15d09e2
running tally of __reduction_scan_id
danhoeflinger 6bbe469
_idx -> _id
danhoeflinger a7d00db
running tally of __load_reduction_id rather than recalculating
danhoeflinger f54e298
running tally of __reduction_id rather than recalculating
danhoeflinger d11dd6f
comment improvement
danhoeflinger 1a29790
refactor for readability
danhoeflinger e936e83
formatting
danhoeflinger 1b4f365
removing extra space
danhoeflinger 0ca6f48
rename variables for consistency
danhoeflinger df6a223
fixing misleading names
danhoeflinger 6e470e5
Address reviewer feedback
danhoeflinger 528e04a
fix bugs from 6e470e5253]
danhoeflinger 8104f1f
Simplify conversions in __gen_transform_input
mmichel11 a5367d1
Move def of __n_uniform closer to its use
adamfidel 6096e7a
Add alias for __dpl_sycl::__sub_group and replace templates
adamfidel 60c8516
auto -> real types and formatting
danhoeflinger 8121d67
fixing type of subgroup id returns
danhoeflinger 48724db
shrinking subgroup size id types
danhoeflinger 3cc61db
adjust type to depend on input range
danhoeflinger c2c7e35
idx -> id
danhoeflinger ff7b256
shrinking types, switch branch to min, remove double deref
danhoeflinger 8a36d5a
Adjust block size for reduce-then-scan based on input type (#1782)
adamfidel 9520f3c
shrinking missed types
danhoeflinger 5a928fd
bugfix for windows
danhoeflinger e57573f
fixing range types
danhoeflinger 93189b0
minor comments from review + formatting
danhoeflinger 4e4568e
Apply std:: suggestions
danhoeflinger af82182
rounding workgroup size down to mult of subgroup size
danhoeflinger
The constant reference `const _OutRng& __out_rng` looks suspicious, because the range is an output.
Good point, I removed the `const` here and fixed a couple of other range types in the kernel. I will go through the remaining PRs to check for similar issues. (I think the output ranges for the other helpers probably have the same issue.)
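To illustrate the concern raised in this thread, here is a minimal host-side sketch, not the code from this PR. The names `simple_view` and `inclusive_scan_into` are hypothetical and exist only for this example. It assumes the output range is either an owning container or a non-owning view with shallow constness. With a view, a `const _OutRng&` parameter would still compile, which is why such a signature tends to look suspicious rather than fail outright; with an owning container such as `std::vector`, the write would be rejected. Taking the output by forwarding reference makes the intent to write explicit in both cases.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>
#include <vector>

// Hypothetical non-owning view, standing in for a device range type.
// Constness is shallow: a const view still hands out mutable elements.
template <typename T>
struct simple_view
{
    T* data;
    std::size_t count;
    T& operator[](std::size_t i) const { return data[i]; }
    std::size_t size() const { return count; }
};

// Serial inclusive scan (illustration only, not the reduce-then-scan kernel).
// The input is read-only, so it is taken by const reference. The output is
// written, so it is taken by forwarding reference instead of const reference.
template <typename InRng, typename OutRng>
void inclusive_scan_into(const InRng& in_rng, OutRng&& out_rng)
{
    assert(out_rng.size() >= in_rng.size());
    using value_type = std::decay_t<decltype(in_rng[0])>;
    value_type running{};
    for (std::size_t i = 0; i < in_rng.size(); ++i)
    {
        running = running + in_rng[i];
        out_rng[i] = running; // would not compile for a const std::vector&
    }
}
```

A call such as `inclusive_scan_into(in, out)` with two `std::vector<int>` objects, or `inclusive_scan_into(in, simple_view<int>{buf, n})` with a temporary view, both bind to the forwarding reference.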