Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support sparse input for SVC and SVR #5273

Merged
merged 40 commits into from
Jun 1, 2023

Conversation

mfoerste4
Copy link
Contributor

@mfoerste4 mfoerste4 commented Mar 15, 2023

This PR adds support for sparse input to SVR and SVC. 'fit' as well as 'predict' can be called with sparse data compatible/convertible to SparseCumlArray. Support vectors in the model might also be stored as sparse data and can be retrieved as such.
This PR requires rapidsai/raft#1296 to provide sparse kernel computations.
Corresponding issue: #2197

@mfoerste4 mfoerste4 requested review from a team as code owners March 15, 2023 08:53
@github-actions github-actions bot added CMake CUDA/C++ Cython / Python Cython or Python issue labels Mar 15, 2023
@github-actions github-actions bot removed CMake conda conda issue labels May 25, 2023
Copy link
Contributor

@tfeher tfeher left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks Malte for addressing the issues. I have only a few small comments left.

cpp/include/cuml/svm/svc.hpp Show resolved Hide resolved
cpp/src/svm/kernelcache.cuh Outdated Show resolved Hide resolved
python/cuml/tests/test_pickle.py Show resolved Hide resolved
cpp/src/svm/results.cuh Outdated Show resolved Hide resolved
cpp/include/cuml/svm/svm_model.h Outdated Show resolved Hide resolved
cpp/src/svm/sparse_util.cuh Show resolved Hide resolved
cpp/src/svm/kernelcache.cuh Outdated Show resolved Hide resolved
cpp/src/svm/kernelcache.cuh Outdated Show resolved Hide resolved
cpp/src/svm/kernelcache.cuh Outdated Show resolved Hide resolved
cpp/src/svm/kernelcache.cuh Outdated Show resolved Hide resolved
@mfoerste4
Copy link
Contributor Author

Thanks Malte for addressing the issues. I have only a few small comments left.

@tfeher , thanks for reviewing, I have pushed an update where I addressed your review suggestions.

@tfeher tfeher added breaking Breaking change improvement Improvement / enhancement to an existing function labels May 26, 2023
@tfeher
Copy link
Contributor

tfeher commented May 26, 2023

Added "breaking" label because of the changes on the C++ API.

Copy link
Member

@cjnolet cjnolet left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's looking much better. I did a somewhat brief skim over the changes so I could provide feedback more quickly.. I'll do a little more thorough review next week but so far I see only minor things.

*/
template <typename math_t>
void svcPredictSparse(const raft::handle_t& handle,
int* indptr,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since we accept such a limited set of types for this, we could probably eventually use the raft::csr_matrix_view but raw pointers from the Python->C++ hand-off is fine too since we really have not start porting over any of our other C++ APIs to accept mdspan directly yet.

MLCommon::Matrix::Matrix<math_t>* x_ws_matrix = nullptr;

// matrix l2 norm for RBF kernels
rmm::device_uvector<math_t> matrix_l2;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At some point, we'll be replacing all occurrences w/ the mdarray but for now we can keep these using RMM directly.

cpp/src/svm/smosolver.cuh Outdated Show resolved Hide resolved
python/cuml/svm/svm_base.pyx Outdated Show resolved Hide resolved
model_d.support_matrix.indices = <int*><uintptr_t>self.support_vectors_.indices.ptr
model_d.support_matrix.data = <double*><uintptr_t>self.support_vectors_.data.ptr
else:
model_d.support_matrix.data = <double*><uintptr_t>self.support_vectors_.ptr
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks like a copy of the block above- can we consolidate these, maybe into their own function? Maybe something like configure_support_matrix()

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as above - we need to distinguish in between C++ data types

Copy link
Member

@cjnolet cjnolet left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's looking much better. I did a somewhat brief skim over the changes so I could provide feedback more quickly.. I'll do a little more thorough review next week but so far I see only minor things.

Copy link
Contributor

@tfeher tfeher left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks Malte for the update! I have missed two issues in my previous review, please fix these. Otherwise the PR looks good to me.

cpp/src/svm/kernelcache.cuh Outdated Show resolved Hide resolved
cpp/src/svm/kernelcache.cuh Outdated Show resolved Hide resolved
@mfoerste4
Copy link
Contributor Author

Thanks Malte for the update! I have missed two issues in my previous review, please fix these. Otherwise the PR looks good to me.

Thanks @tfeher for the review. I have applied your suggestions.

@mfoerste4
Copy link
Contributor Author

It's looking much better. I did a somewhat brief skim over the changes so I could provide feedback more quickly.. I'll do a little more thorough review next week but so far I see only minor things.

Thanks @cjnolet for the early feedback.

Copy link
Contributor

@tfeher tfeher left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks Malte for fixing the issues. LGTM.

Copy link
Member

@cjnolet cjnolet left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changes look great! Thanks again for these changes, @mfoerste4!

@cjnolet
Copy link
Member

cjnolet commented Jun 1, 2023

/merge

@rapids-bot rapids-bot bot merged commit 20fcb7e into rapidsai:branch-23.06 Jun 1, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
breaking Breaking change CUDA/C++ Cython / Python Cython or Python issue improvement Improvement / enhancement to an existing function
Projects
Development

Successfully merging this pull request may close these issues.

4 participants