[REVIEW] improve device atomic overloads for potential issues #1691
split the file into `device_atomics.cuh` and `device_operators.cuh`; separate the definition of the device operators
move atomicCASImpl (int8 or int16) into typesAtomicCASImpl
simplify atomicMin and atomicMax; add cudf::bool8 to the atomic test cases for atomicAdd, Min, Max; add cudf::bool8 specialization for genericAtomicOperation
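For readers unfamiliar with the pattern, simplifying atomicMin and atomicMax around a generic operation usually means one compare-and-swap loop parameterized by a binary operator. The following is only a host-side C++ sketch of that idea using `std::atomic`; `generic_atomic_operation` and the two functors are hypothetical stand-ins and are not the cudf device code.

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical host-side stand-ins for the device operators. The names echo
// DeviceMin/DeviceMax, but these are not the cudf definitions.
struct DeviceMin {
  template <typename T>
  T operator()(const T& lhs, const T& rhs) const { return lhs < rhs ? lhs : rhs; }
};

struct DeviceMax {
  template <typename T>
  T operator()(const T& lhs, const T& rhs) const { return lhs < rhs ? rhs : lhs; }
};

// Applies `op(current, update)` atomically with a compare-and-swap loop and
// returns the value that was stored before the update.
template <typename T, typename BinaryOp>
T generic_atomic_operation(std::atomic<T>& target, T update, BinaryOp op) {
  T expected = target.load();
  // On failure compare_exchange_weak writes the current value into `expected`,
  // so each retry recomputes op() from fresh data.
  while (!target.compare_exchange_weak(expected, op(expected, update))) {
  }
  return expected;
}
```

With this shape, min and max differ only in the functor passed in, which is the kind of simplification the commit describes.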
I love these simplifications. Much cleaner and easier to understand.
Looks good in general. One concern and one question.
Add '__forceinline__ __device__' to `W genericAtomicOperator(W)`
Add size check assert between `long long int` and `int64_t`
remove redundant `sizeof(T)` when calling `typesAtomicCASImpl`
remove redundant `sizeof(T)` when calling `genericAtomicOperationImpl`
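One way an explicit `sizeof(T)` argument becomes redundant is when the size parameter of the implementation template defaults to `sizeof(T)`. The sketch below illustrates that dispatch in plain C++. `TypesAtomicCASSketch` and `cas_word_bytes` are hypothetical names, and the mapping of 1- and 2-byte types onto a 4-byte CAS word is an assumption made for illustration, not something this PR page confirms.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical size-dispatched implementation. N defaults to sizeof(T), so
// callers never need to spell the size out.
template <typename T, std::size_t N = sizeof(T)>
struct TypesAtomicCASSketch;

// Assumption: 1- and 2-byte types are emulated on a 4-byte CAS word.
template <typename T>
struct TypesAtomicCASSketch<T, 1> {
  static constexpr std::size_t word_bytes() { return 4; }
};

template <typename T>
struct TypesAtomicCASSketch<T, 2> {
  static constexpr std::size_t word_bytes() { return 4; }
};

template <typename T>
struct TypesAtomicCASSketch<T, 4> {
  static constexpr std::size_t word_bytes() { return 4; }
};

template <typename T>
struct TypesAtomicCASSketch<T, 8> {
  static constexpr std::size_t word_bytes() { return 8; }
};

// Call site without an explicit sizeof(T): the default argument selects the
// specialization.
template <typename T>
constexpr std::size_t cas_word_bytes() {
  return TypesAtomicCASSketch<T>::word_bytes();
}
```

Writing `TypesAtomicCASSketch<T, sizeof(T)>` at the call site would select the same specialization, which is why the extra argument can be dropped.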
rerun tests
Add native atomicAdd(uint64_t) call for signed int64_t
Add comment to `genericAtomicOperationImpl<int64_t, DeviceSum, 8>` explaining why it uses atomicAdd(uint64) inside
Removed `genericAtomicOperation(W)` since it is not invoked for cudf::wrapper types. Merged it into `genericAtomicOperation(T)`. Add size check assert at `type_reinterpret`.
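The size checks mentioned in these commits can be pictured as compile-time assertions guarding a bit-level reinterpretation. Below is a minimal host-side C++ sketch under that assumption. The `type_reinterpret` shown here is a hypothetical helper written for illustration and is not copied from `device_atomics.cuh`.

```cpp
#include <cstdint>
#include <cstring>

// The 64-bit CUDA atomics are declared on `unsigned long long int`, so it is
// worth checking that the fixed-width type has the same size before bits are
// passed through.
static_assert(sizeof(long long int) == sizeof(std::int64_t),
              "long long int and int64_t must have the same size");

// Hypothetical helper: returns the bits of `value` viewed as type To.
// The static_assert rejects mismatched sizes at compile time.
template <typename To, typename From>
To type_reinterpret(From value) {
  static_assert(sizeof(To) == sizeof(From),
                "type_reinterpret requires types of equal size");
  To result;
  std::memcpy(&result, &value, sizeof(To));  // well-defined bit copy
  return result;
}
```

A call such as `type_reinterpret<std::uint32_t>(std::int64_t{1})` fails to compile, which is the point of the check.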
cpp/src/utilities/device_atomics.cuh
Outdated
}

// specialization for cudf::detail::wrapper types
template <typename T, gdf_dtype dtype, typename BinaryOp, typename W = cudf::detail::wrapper<T, dtype> >
This shouldn't be removed. I rely on it in #1478
It will need to be added back in a future PR.
9a3a178 to edc14cb
Fixed by #1735
Update/improve the device atomic overloads to address potential issues, and simplify the implementation.
This makes it easier to adapt when cudf changes the underlying type of the wrappers or introduces a new data type.
This PR also introduces CUDA native `atomicAdd` support for signed int64_t. I had not implemented it until now because CUDA does not have a `signed long long int` overload of `atomicAdd`. However, the two's complement representation of signed integers has the advantage that the fundamental arithmetic operations of addition, subtraction, and multiplication are identical to those for unsigned binary numbers, so the unsigned overload can be used instead. See also: https://en.wikipedia.org/wiki/Two%27s_complement
device_operators.cuh
genericAtomicOperationUnderlyingType
typesAtomicOperation32|64
`atomicAdd` support for signed int64_t
Closes #1398
Related #1685
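To make the two's complement argument in the description concrete, here is a small host-side C++ sketch: a signed 64-bit value is stored as raw bits in an unsigned atomic, and a signed add is performed with an unsigned `fetch_add`. The function names are hypothetical, and `std::atomic` stands in for the CUDA `atomicAdd` on `unsigned long long int`; this is not the code from the PR.

```cpp
#include <atomic>
#include <cstdint>
#include <cstring>

// Adds a signed 64-bit `update` to a value stored as raw bits in an unsigned
// atomic. Unsigned addition wraps modulo 2^64, which yields the same bit
// pattern as two's complement signed addition.
// Returns the previous value, interpreted as signed.
std::int64_t atomic_add_int64(std::atomic<std::uint64_t>& target_bits,
                              std::int64_t update) {
  std::uint64_t update_bits;
  std::memcpy(&update_bits, &update, sizeof(update_bits));
  std::uint64_t old_bits = target_bits.fetch_add(update_bits);
  std::int64_t old_value;
  std::memcpy(&old_value, &old_bits, sizeof(old_value));
  return old_value;
}

// Reads the stored bits back as a signed 64-bit value.
std::int64_t load_int64(const std::atomic<std::uint64_t>& target_bits) {
  std::uint64_t bits = target_bits.load();
  std::int64_t value;
  std::memcpy(&value, &bits, sizeof(value));
  return value;
}
```

Adding a negative number works without a special case because its bit pattern, read as unsigned, is the value that wraps the sum around to the correct signed result.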