
[FEA] Replace numpy mod op in Series.hash_encode #935

Closed · cmgreen210 opened this issue Feb 13, 2019 · 11 comments
Labels: feature request (New feature or request), libcudf (Affects libcudf C++/CUDA code), Python (Affects Python cuDF API)

Comments

@cmgreen210 (Contributor)
The current implementation of hash_encode is bottlenecked at the numpy call https://github.com/rapidsai/cudf/blob/branch-0.6/python/cudf/dataframe/series.py#L1078. This computation should be moved to the GPU.

@cmgreen210 (Contributor, Author)
I will work on a fix for this.

@cmgreen210 (Contributor, Author)
This will be easier than expected. We'll change the line to `hashed_values = np.mod(hashed_values.data.to_array(), stop)`:

```
In [44]: %time np.mod(hv, 100)
CPU times: user 24.8 s, sys: 404 ms, total: 25.2 s
Wall time: 25.2 s
Out[44]: array([ 9, 69, 85, ..., 87,  1, 22], dtype=int32)

In [45]: %time np.mod(hv.data.to_array(), 100)
CPU times: user 4 ms, sys: 0 ns, total: 4 ms
Wall time: 1.58 ms
Out[45]: array([ 9, 69, 85, ..., 87,  1, 22], dtype=int32)
```

@kkraus14 (Collaborator)
> This will be easier than expected. We'll change the line to `hashed_values = np.mod(hashed_values.data.to_array(), stop)` […]

That will still run it on the CPU and trigger a device-to-host copy on the `.to_array()` call.
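To make that point concrete, here is a small hedged sketch (`hv_host` is a hypothetical stand-in for the result of `hashed_values.data.to_array()`): the fast timing above comes from a vectorized host-side `np.mod` running *after* the data has already been copied off the GPU, not from GPU execution.

```python
import numpy as np

# Hypothetical stand-in for hashed_values.data.to_array(): by this point the
# data has already been copied device-to-host, so np.mod below is a fast
# vectorized CPU operation, not GPU work.
hv_host = np.array([109, 169, 185, 187, 101, 122], dtype=np.int32)

# Fast because it is a single vectorized host op on an int32 array,
# but it still pays one device-to-host transfer per call.
result = np.mod(hv_host, 100)
print(result)  # [ 9 69 85 87  1 22]
```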

@kkraus14 kkraus14 added feature request New feature or request libcudf Affects libcudf (C++/CUDA) code. Python Affects Python cuDF API. labels Feb 13, 2019
@kkraus14 (Collaborator)
I don't recall if we have a modulo binary op in libcudf that you'd probably want here, @jrhemstad?

@cmgreen210 (Contributor, Author)
> That will still run it on the CPU and trigger a device-to-host copy on the `.to_array()` call.

Good point. But the speedup is probably good enough for now. wdyt?

@kkraus14 (Collaborator)
> Good point. But the speedup is probably good enough for now. wdyt?

Implementing a binary modulo op on the GPU should be trivial. If the arrays are larger, or if we call this on a bunch of columns, the allocations and memory copies would become expensive.

@jrhemstad (Contributor)
> I don't recall if we have a modulo binary op in libcudf that you'd probably want here, @jrhemstad?

So you want a binary op doing a modulo of a column by a scalar? This will be added in #892.

@cmgreen210 (Contributor, Author) commented Feb 13, 2019
@jrhemstad yes. Will this be added by the 0.6 release? If not, we might want to do an intermediate fix.

@kkraus14 (Collaborator)
> @jrhemstad yes. Will this be added by the 0.6 release? If not, we might want to do an intermediate fix.

@cmgreen210 as a stopgap it should be pretty straightforward to write a numba kernel that does the modulo operation until we move it into libcudf.
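As a rough sketch of what such a stopgap could look like (helper names are hypothetical, and the GPU path assumes numba with a CUDA-capable device; the libcudf binary op from #892 would ultimately replace this):

```python
import numpy as np

try:
    from numba import cuda
    HAVE_CUDA = cuda.is_available()
except ImportError:
    HAVE_CUDA = False

if HAVE_CUDA:
    @cuda.jit
    def mod_kernel(arr, divisor, out):
        # One thread per element: out[i] = arr[i] % divisor, entirely on device.
        i = cuda.grid(1)
        if i < arr.size:
            out[i] = arr[i] % divisor

    def gpu_mod(device_arr, divisor):
        # Hypothetical helper: keeps the data on the GPU, avoiding the
        # device-to-host copy that .to_array() triggers.
        out = cuda.device_array_like(device_arr)
        threads = 128
        blocks = (device_arr.size + threads - 1) // threads
        mod_kernel[blocks, threads](device_arr, divisor, out)
        return out

# Host-side reference implementation of the same computation, for comparison.
def host_mod(arr, divisor):
    return np.mod(arr, divisor)
```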

@cmgreen210 (Contributor, Author)
> @cmgreen210 as a stopgap it should be pretty straightforward to write a numba kernel that does the modulo operation until we move it into libcudf.

@kkraus14 Sounds good. I'll have that up ASAP.

@devavret (Contributor)
The modulo implementation is now here. It's not merged, but it's tested to work.
