
[QST] Performance of NearestNeighbours kneighbours lookup #4021

Open
thomasaarholt opened this issue Jul 1, 2021 · 6 comments
Labels: inactive-90d, question (Further information is requested)

@thomasaarholt

thomasaarholt commented Jul 1, 2021

What is your question?
Is the NearestNeighbors.kneighbors problem well-suited for GPU? I find it quite slow for big data on the GPU. I've run the following on a single RTX 2080 Ti and a 32-core AMD processor. The .fit() method is very fast, but the .kneighbors() method is not. See below for some context and benchmarks. Note that in #4020 I show that the other GPU algorithms crash the kernel.

We can use:

from cuml.neighbors import NearestNeighbors
model = NearestNeighbors()
model.fit(points)
model.kneighbors(new_points)

to find the neighbors of new_points among points, where both are arrays of coordinates (with shape (N, 2)).
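For readers unfamiliar with the API: both sklearn's and cuML's kneighbors return a (distances, indices) pair, each of shape (n_queries, n_neighbors). A minimal NumPy sketch of the equivalent brute-force computation (illustrative only, not cuML's implementation):

```python
import numpy as np

rng = np.random.default_rng(0)
points = rng.random((1000, 2))    # reference set (the fitted data)
new_points = rng.random((5, 2))   # query set
k = 3

# Pairwise squared Euclidean distances, shape (n_queries, n_samples)
diff = new_points[:, None, :] - points[None, :, :]
sq_dist = (diff ** 2).sum(axis=-1)

# Indices of the k smallest distances per query row, and the distances themselves
idx = np.argsort(sq_dist, axis=1)[:, :k]
dist = np.sqrt(np.take_along_axis(sq_dist, idx, axis=1))

print(dist.shape, idx.shape)  # (5, 3) (5, 3)
```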

While the creation and fitting of the model to the data is very quick with cuML, the kneighbors lookup is relatively slow when compared with sklearn's CPU implementation:

import numpy as np, cupy as cp
from sklearn.neighbors import NearestNeighbors as NearestNeighborsCPU
from cuml.neighbors import NearestNeighbors as NearestNeighborsGPU
from time import time

n_neighbors = 10
n_samples = 1_000_000
n_unknown = 100_000

X = 1000*np.random.random((n_samples, 2))
unknown = 1000*np.random.random((n_unknown, 2))

print('Sklearn results')
for algo in ['auto', 'ball_tree', 'kd_tree']:  # 'brute' takes a *really* long time, so we skip it
    t1 = time()
    knn = NearestNeighborsCPU(n_neighbors=n_neighbors, algorithm=algo)
    knn.fit(X)
    t2 = time()
    estimates = knn.kneighbors(unknown)
    print(f"{algo}: {t2-t1:.2f} and {time()-t2:.2f}")

print('\ncuML results')
X = cp.asarray(X)
unknown = cp.asarray(unknown)

for algo in ['brute']: # see cuML #4020 for why I don't include other algorithms
    t1 = time()
    knn = NearestNeighborsGPU(n_neighbors=n_neighbors, algorithm=algo)
    knn.fit(X)
    t2 = time()
    estimates = knn.kneighbors(unknown)
    print(f"{algo}: {t2-t1:.2f} and {time()-t2:.2f}")

Running a simple benchmark with "medium" and "large" numbers of samples, I see the following results:

# times in seconds, for [.fit] and [.kneighbors]
# n_samples = 1_000_000
# n_unknown = 100_000
Sklearn results
auto: 0.95 and 0.62
ball_tree: 0.91 and 1.40
kd_tree: 0.97 and 0.70

cuML results
brute: 0.01 and 2.19

# n_samples = 10_000_000
# n_unknown = 1_000_000
Sklearn results
auto: 39.92 and 31.47
ball_tree: 36.40 and 43.35
kd_tree: 38.36 and 14.45

cuML results
brute: 0.03 and 200.59
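These numbers are consistent with the algorithmic costs involved: brute force evaluates every query/reference distance pair, so the 10x-larger problem costs about 100x more (2.19 s to 200.59 s), while tree lookups grow roughly as n_unknown * log(n_samples). A rough back-of-the-envelope check (the logarithmic tree-cost model is a simplification):

```python
import math

small = (1_000_000, 100_000)     # (n_samples, n_unknown)
large = (10_000_000, 1_000_000)

def brute_pairs(n_samples, n_unknown):
    # Brute force computes every query/reference distance
    return n_samples * n_unknown

def tree_queries(n_samples, n_unknown):
    # Tree lookups are roughly logarithmic per query
    return n_unknown * math.log2(n_samples)

print(brute_pairs(*large) / brute_pairs(*small))    # 100.0, matching the ~100x jump in cuML brute time
print(tree_queries(*large) / tree_queries(*small))  # ~11.7, matching the milder CPU growth
```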
@thomasaarholt thomasaarholt added ? - Needs Triage Need team to review and classify question Further information is requested labels Jul 1, 2021
@teju85
Member

teju85 commented Jul 1, 2021

@mdoijade can you take a look at this and see why kneighbors is so slow?

@dantegd
Member

dantegd commented Jul 1, 2021

@teju85 wouldn't the difference be because this is comparing different algorithms? i.e. brute in GPU (which takes too long in CPU) vs kd_tree/ball_tree?

@teju85
Member

teju85 commented Jul 2, 2021

:smh: good catch @dantegd! I overlooked the array in line 14. Yes, this is indeed because the comparison is being made with brute-force against indexing methods. We don't yet have indexing based methods implemented.

@thomasaarholt
Author

Right, I should have made that clearer. Just in case this wasn't noticed either, I created #4020, pointing out that the other GPU algorithms crash python on my system.

@hcho3 hcho3 removed the ? - Needs Triage Need team to review and classify label Jul 2, 2021
@cjnolet
Member

cjnolet commented Jul 29, 2021

@thomasaarholt, as @dantegd pointed out, the differences here are strictly down to the underlying algorithms. While it may increase training time, we do support the ivfpq algorithm, which should at least significantly reduce the query time.
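For context on why an index helps: IVF-family methods (which ivfpq belongs to) cluster the reference points into coarse cells, then search only the few cells nearest each query instead of the whole dataset. A toy NumPy sketch of that idea (the cell and probe counts are illustrative; this is not cuML's actual implementation):

```python
import numpy as np

rng = np.random.default_rng(42)
points = rng.random((10_000, 2))
query = rng.random(2)
n_cells, n_probe, k = 16, 2, 5

# Coarse "quantizer": a few k-means-style iterations to place cell centroids
centroids = points[rng.choice(len(points), n_cells, replace=False)]
for _ in range(5):
    assign = np.linalg.norm(points[:, None] - centroids[None], axis=-1).argmin(axis=1)
    for c in range(n_cells):
        members = points[assign == c]
        if len(members):
            centroids[c] = members.mean(axis=0)

# Inverted lists: the point indices belonging to each cell
inv_lists = [np.flatnonzero(assign == c) for c in range(n_cells)]

# Query time: probe only the n_probe nearest cells, brute-force inside them
cell_order = np.linalg.norm(centroids - query, axis=1).argsort()[:n_probe]
candidates = np.concatenate([inv_lists[c] for c in cell_order])
d = np.linalg.norm(points[candidates] - query, axis=1)
top = candidates[d.argsort()[:k]]
print(len(candidates), "of", len(points), "points searched")
```

The trade-off is approximation: a true nearest neighbor can fall in an unprobed cell, which is why IVF methods expose a probe count to tune recall against speed.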

@github-actions

This issue has been labeled inactive-90d due to no recent activity in the past 90 days. Please close this issue if no further response or action is needed. Otherwise, please respond with a comment indicating any updates or changes to the original issue and/or confirm this issue still needs to be addressed.
