[REVIEW] libcudf: generic reduction and scan support #1005
Supported in `ReduceDispatcher`. Not tested yet.
Can we add some unit tests that perform aggregations for non-arithmetic types?
New file: reduction_operators.cuh
See https://github.com/rapidsai/cudf/pull/892/files#diff-4b0b6cd3d7dabc1501cc00b5b13d9370R93 for |
reduction and scan support
This is an intermediate implementation. The grid size is 1 at this point. TODO: support larger grids using atomic operations.
Rename: ReduceOp::launch() -> Reduce() Remove: ReduceOp::launch_onece()
Single step reduction is performed by using `atomicCAS`.
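The idea behind a single-step reduction built on compare-and-swap can be illustrated on the host. The sketch below is not the PR's kernel. It is a C++ analogy that assumes `std::atomic` compare-exchange in place of CUDA's `atomicCAS` and threads in place of thread blocks; `atomic_apply` and `parallel_reduce` are hypothetical names introduced only for illustration.

```cpp
#include <algorithm>
#include <atomic>
#include <cstddef>
#include <thread>
#include <vector>

// Apply `op` to a shared accumulator with a compare-and-swap retry loop.
// This mirrors how an arbitrary binary operator can be made atomic when
// no native atomic instruction exists for it.
template <typename T, typename Op>
void atomic_apply(std::atomic<T>& target, T value, Op op) {
  T expected = target.load();
  // On failure, compare_exchange_weak reloads `expected` with the current
  // value, so op(expected, value) is recomputed on every retry.
  while (!target.compare_exchange_weak(expected, op(expected, value))) {
  }
}

// Hypothetical single-step reduction: each worker reduces its own slice,
// then folds its partial result into one shared accumulator, so no second
// pass over per-worker outputs is needed.
template <typename T, typename Op>
T parallel_reduce(const std::vector<T>& data, T identity, Op op,
                  std::size_t num_workers = 4) {
  std::atomic<T> result{identity};
  std::vector<std::thread> workers;
  const std::size_t chunk = (data.size() + num_workers - 1) / num_workers;
  for (std::size_t w = 0; w < num_workers; ++w) {
    const std::size_t begin = std::min(w * chunk, data.size());
    const std::size_t end = std::min(begin + chunk, data.size());
    workers.emplace_back([&, begin, end] {
      T partial = identity;
      for (std::size_t i = begin; i < end; ++i) partial = op(partial, data[i]);
      atomic_apply(result, partial, op);
    });
  }
  for (auto& t : workers) t.join();
  return result.load();
}
```

Because every worker writes into the same accumulator, the caller gets a single value back and does not need to size an intermediate output buffer. The operator must be associative and commutative for the result to be independent of the order in which workers finish.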
Use `constexpr bool is_nonarithmetic_op` instead of `DeviceForNonArithmetic`
Hi @jrhemstad, regarding lines 110 to 114 in 78a3cc8:
I'd like to confirm whether you mean that gdf_reduction should write the result back into host memory. If the result should stay in device memory, passing a pointer to gdf_data in device memory seems sufficient.
How does this new reduction code handle overflows?
Can one of the admins verify this patch?
So, I suggest that we should use the opportunity of touching this code to clearly document this fact in the relevant Doxygen comments.
OK, I will document it in the API docs after I implement output precision support.
Use scalar retval instead of numpy array
Remove [0] suffix from min() for gpu_scale
use np.float64( ) instead of astype(f8) because it's not numpy array.
- changed the behavior if input column is empty
- minor correction
This commit fails at cython
This reverts commit d035194.
Move enums from cudf/types.h to reduction.hpp
Print statements need to be removed, and there is an open question regarding overflows.
Move `get_scalar_value` into `cudf_cpp`
…to bug_reduction_non_arithmetic
- Merge `gdf_sum`, `gdf_product`, `gdf_sum_of_squares`, `gdf_min`, `gdf_max` into a single `cudf::reduction` API.
- Rename `gdf_prefixsum` into `cudf::scan` and support 'min', 'max', 'product' operations.
- Add `min` and `max` to `cudf::reduction` to support non-arithmetic types, e.g., `GDF_CATEGORY`, `GDF_DATE32`, `GDF_DATE64`, `GDF_TIMESTAMP`.
- Use single step reduction for `cudf::reduction` and remove `gdf_reduce_optimal_output_size`.
- Use `gdf_scalar` for output of `gdf_reduction`.
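To illustrate what folding several per-operator functions into one entry point can look like, here is a minimal host-side sketch. It is not libcudf's API: the `Op` enum, `identity`, `combine`, `reduce`, and `scan` are hypothetical names, and the real functions operate on device columns rather than `std::vector`.

```cpp
#include <cstddef>
#include <limits>
#include <stdexcept>
#include <vector>

// Hypothetical operator tag; the enum names used in the PR may differ.
enum class Op { SUM, PRODUCT, SUM_OF_SQUARES, MIN, MAX };

// Identity element for each operator, used to seed the accumulator.
template <typename T>
T identity(Op op) {
  switch (op) {
    case Op::SUM:
    case Op::SUM_OF_SQUARES: return T{0};
    case Op::PRODUCT: return T{1};
    case Op::MIN: return std::numeric_limits<T>::max();
    case Op::MAX: return std::numeric_limits<T>::lowest();
  }
  throw std::invalid_argument("unknown operator");
}

// Fold one element into the accumulator for the chosen operator.
template <typename T>
T combine(Op op, T acc, T x) {
  switch (op) {
    case Op::SUM: return acc + x;
    case Op::SUM_OF_SQUARES: return acc + x * x;
    case Op::PRODUCT: return acc * x;
    case Op::MIN: return x < acc ? x : acc;
    case Op::MAX: return acc < x ? x : acc;
  }
  throw std::invalid_argument("unknown operator");
}

// One entry point stands in for five per-operator functions.
template <typename T>
T reduce(const std::vector<T>& in, Op op) {
  T acc = identity<T>(op);
  for (const T& x : in) acc = combine(op, acc, x);
  return acc;
}

// Inclusive scan: out[i] is the reduction of in[0..i].
// Sum of squares is rejected because the description lists only
// sum, min, max, and product for scan.
template <typename T>
std::vector<T> scan(const std::vector<T>& in, Op op) {
  if (op == Op::SUM_OF_SQUARES) throw std::invalid_argument("unsupported scan operator");
  std::vector<T> out;
  out.reserve(in.size());
  T acc = identity<T>(op);
  for (const T& x : in) {
    acc = combine(op, acc, x);
    out.push_back(acc);
  }
  return out;
}
```

Note that `MIN` and `MAX` need only an ordering on `T`, not arithmetic, which is why they are the natural operators to offer for category, date, and timestamp columns.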
Closes #443 (Add unit tests for functionality in reductions.cu)
Closes #446 (CUDA reductions should have independent input and output types)
Closes #954 (libcudf reductions should support non-arithmetic types for some operations)
Closes #978 (libcudf should provide a generic `scan` function)
Related #1224 (Calling sum on a boolean column returns a boolean)
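Issues #446 and #1224 both come down to the accumulator type being tied to the input type. The following sketch shows the effect on the host, assuming a hypothetical helper `sum_as` that lets the caller pick the output type; it is not the mechanism the PR uses.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical helper: accumulate in a caller-chosen output type `Out`
// instead of the input element type `In`.
template <typename Out, typename In>
Out sum_as(const std::vector<In>& in) {
  Out acc = Out();
  for (const In& x : in) acc = static_cast<Out>(acc + static_cast<Out>(x));
  return acc;
}
```

With a boolean column of three `true` values, `sum_as<bool>` can only answer `true`, while `sum_as<std::int64_t>` returns the count 3. The same applies to narrow integer inputs, where a wider output type avoids overflow of the running total.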
reduction tasks:
- `min`, `max` operation support for non-arithmetic types (Issue #954: [BUG] libcudf reductions should support non-arithmetic types for some operations)
- Use `gdf_scalar` for output of `gdf_reduction`
- `GDF_EXPECTS` macro and throw exceptions if failed.
- Remove `gdf_reduce_optimal_output_size` since it is required for 2-step reduction (related to Issue #610: `gdf_reduce_optimal_output_size()` is a misnomer)
- `gdf_reduction`
- `gdf_reduction`
scan tasks:
- `min`, `max`, `product` for scan
- Rename `gdf_prefixsum` into `gdf_scan` (Issue #978: [FEA] libcudf should provide a generic `scan` function)
- `gdf_scan`
- `gdf_scan`
- `gdf_scan`
- `GDF_EXPECTS` macro and throw exceptions if failed.

This PR also includes these macros:
- `CUDF_FAIL`
- `CUDF_EXPECT_NO_THROW`

In addition, `RMM_TRY` has been changed to throw an exception on failure.

Examples:
- `CUDF_FAIL(msg)` is the same as `CUDF_EXPECTS(false, msg)`.
- `CUDF_EXPECT_NO_THROW` is used only in gtests and is a utility macro for debugging. It tests the same thing as `EXPECT_NO_THROW` in gtests, but it also prints the error message from `CUDF_FAIL` and `CUDF_EXPECTS`.
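The relationship between these macros can be sketched as follows. The `SKETCH_` macros and `message_if_thrown` are hypothetical stand-ins written for this illustration; the real definitions live in libcudf and its test utilities and may differ in exception type and message format.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical stand-in for an EXPECTS-style macro: throw if the
// condition does not hold.
#define SKETCH_EXPECTS(cond, msg)                 \
  do {                                            \
    if (!(cond)) throw std::logic_error(msg);     \
  } while (0)

// FAIL is EXPECTS with a condition that is always false.
#define SKETCH_FAIL(msg) SKETCH_EXPECTS(false, msg)

// Run a callable and return the message of any std::exception it throws,
// or an empty string if nothing was thrown. A test macro in the spirit of
// CUDF_EXPECT_NO_THROW could print this message when the check fails.
template <typename F>
std::string message_if_thrown(F&& f) {
  try {
    f();
  } catch (const std::exception& e) {
    return e.what();
  }
  return {};
}
```

Surfacing `e.what()` is the practical difference from a plain no-throw check: the test log then shows which expectation failed rather than only that something threw.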