[BUG] NearestNeighbors crashes with non-brute algorithms #4020

Open
Tracked by #4139
thomasaarholt opened this issue Jul 1, 2021 · 2 comments
Labels
bug Something isn't working inactive-90d

thomasaarholt commented Jul 1, 2021

Describe the bug
I was benchmarking NearestNeighbors to see which algorithm is fastest for looking up the neighbors of new points. The code below produces output for the 'brute' algorithm, but crashes on the next algorithm on both servers I run it on (the two servers have an identical setup). The terminal output and the error raised are:

EDIT: I should have used a better example (using try-except), but the error below is raised for all non-brute algorithms.

brute:
        NearestNeighbors()
        fit ran successfully
        kneighbors ran successfully
ivfsq:
        NearestNeighbors()
        fit ran successfully
Faiss assertion 'err__ == cudaSuccess' failed in void faiss::gpu::ivfInterleavedScanImpl_32_(faiss::gpu::Tensor<float, 2, true>&, faiss::gpu::Tensor<int, 2, true>&, thrust::device_vector<void*>&, thrust::device_vector<void*>&, faiss::gpu::IndicesOptions, thrust::device_vector<int>&, int, faiss::MetricType, bool, faiss::gpu::Tensor<float, 3, true>&, faiss::gpu::GpuScalarQuantizer*, faiss::gpu::Tensor<float, 2, true>&, faiss::gpu::Tensor<long int, 2, true>&, faiss::gpu::GpuResources*) at /home/conda/feedstock_root/build_artifacts/faiss-split_1618468126454/work/faiss/gpu/impl/scan/IVFInterleaved32.cu:13; details: CUDA error 9 invalid configuration argument

Steps/Code to reproduce bug

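import cupy as cp
from cuml.neighbors import NearestNeighbors
from time import time  # imported for benchmarking; not used in this reduced example

n_neighbors = 10
n_samples = 100_000
n_unknown = 10_000  # note: defined but not used below

# Training points and query points, both 2-D.
X = cp.random.random((n_samples, 2))
# note: the query set has n_samples rows (not n_unknown), as in the run that produced the error
unknown = cp.random.random((n_samples, 2))

# note: 'ivfsq' is listed twice; the crash occurs on the first non-brute entry
for algo in ['brute', 'ivfsq', 'ivfpq', 'ivfsq']:
    print(f"{algo}:")
    knn = NearestNeighbors(n_neighbors=n_neighbors, algorithm=algo)
    print(f"\t{knn}")
    knn.fit(X)
    print("\tfit ran successfully")
    estimates = knn.kneighbors(unknown)  # the Faiss assertion is raised here
    print("\tkneighbors ran successfully")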
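Regarding the EDIT note about try-except: a failed Faiss assertion may terminate the whole interpreter rather than raise a Python exception, in which case try-except would not let the loop continue. A minimal sketch of an alternative is to run each algorithm in its own child process and record whether it exited cleanly. `run_isolated` and `REPRO_TEMPLATE` are hypothetical names introduced here for illustration; they are not part of cuML, and the template simply restates the reproduction script above.

```python
import subprocess
import sys


def run_isolated(snippet, timeout=600):
    """Run `snippet` in a fresh Python process.

    Returns (ok, detail): ok is True when the child exits with status 0,
    and detail is the last line the child wrote to stderr ("" if none).
    A hard abort in the child only ends the child, not the caller.
    """
    proc = subprocess.run(
        [sys.executable, "-c", snippet],
        capture_output=True,
        text=True,
        timeout=timeout,
    )
    stderr_lines = proc.stderr.strip().splitlines()
    detail = stderr_lines[-1] if stderr_lines else ""
    return proc.returncode == 0, detail


# Hypothetical template restating the reproduction script for one algorithm.
REPRO_TEMPLATE = """
import cupy as cp
from cuml.neighbors import NearestNeighbors

X = cp.random.random((100_000, 2))
unknown = cp.random.random((100_000, 2))
knn = NearestNeighbors(n_neighbors=10, algorithm={algo!r})
knn.fit(X)
knn.kneighbors(unknown)
"""


def check_algorithms(algorithms):
    """Run the reproduction once per algorithm, each in its own process."""
    return {
        algo: run_isolated(REPRO_TEMPLATE.format(algo=algo))
        for algo in algorithms
    }
```

Calling `check_algorithms(['brute', 'ivfsq', 'ivfpq'])` on a machine with cuML installed would give one `(ok, detail)` pair per algorithm, so a crash in one algorithm does not hide the result for the others.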

Expected behavior
Code runs without errors

Environment details (please complete the following information):

  • Environment location: Bare-metal / ssh server
  • Linux Distro/Architecture: Ubuntu 16.04 amd64
  • GPU Model/Driver: RTX 2080 Ti, 465.19.01 (this is on a server with multiple CUDA versions available)
  • CUDA: 11.0
  • Method of cuDF & cuML install: conda
    (rapids-21.06) [thomasaar@ml7 ~]$ conda list

# packages in environment at /itf-fi-ml/home/thomasaar/.conda/envs/rapids-21.06:
#
# Name Version Build Channel

_libgcc_mutex 0.1 conda_forge conda-forge
_openmp_mutex 4.5 1_gnu conda-forge
abseil-cpp 20210324.2 h9c3ff4c_0 conda-forge
anyio 3.2.1 py38h578d9bd_0 conda-forge
argon2-cffi 20.1.0 py38h497a2fe_2 conda-forge
arrow-cpp 1.0.1 py38hb823b37_42_cuda conda-forge
arrow-cpp-proc 3.0.0 cuda conda-forge
async_generator 1.10 py_0 conda-forge
attrs 21.2.0 pyhd8ed1ab_0 conda-forge
aws-c-cal 0.5.11 h95a6274_0 conda-forge
aws-c-common 0.6.2 h7f98852_0 conda-forge
aws-c-event-stream 0.2.7 h3541f99_13 conda-forge
aws-c-io 0.10.5 hfb6a706_0 conda-forge
aws-checksums 0.1.11 ha31a3da_7 conda-forge
aws-sdk-cpp 1.8.186 hb4091e7_3 conda-forge
babel 2.9.1 pyh44b312d_0 conda-forge
backcall 0.2.0 pyh9f0ad1d_0 conda-forge
backports 1.0 py_2 conda-forge
backports.functools_lru_cache 1.6.4 pyhd8ed1ab_0 conda-forge
bleach 3.3.0 pyh44b312d_0 conda-forge
blessings 1.7 pypi_0 pypi
bokeh 2.3.2 py38h578d9bd_0 conda-forge
brotli 1.0.9 h9c3ff4c_4 conda-forge
brotlipy 0.7.0 py38h497a2fe_1001 conda-forge
bzip2 1.0.8 h7f98852_4 conda-forge
c-ares 1.17.1 h7f98852_1 conda-forge
ca-certificates 2021.5.30 ha878542_0 conda-forge
cachetools 4.2.2 pyhd8ed1ab_0 conda-forge
certifi 2021.5.30 py38h578d9bd_0 conda-forge
cffi 1.14.5 py38ha65f79e_0 conda-forge
chardet 4.0.0 py38h578d9bd_1 conda-forge
click 8.0.1 py38h578d9bd_0 conda-forge
cloudpickle 1.6.0 py_0 conda-forge
cryptography 3.4.7 py38ha5dfef3_0 conda-forge
cudatoolkit 11.0.221 h6bb024c_0 nvidia
cudf 21.06.01 cuda_11.0_py38_g101fc0fda4_2 rapidsai
cuml 21.06.02 cuda11.0_py38_g7dfbf8d9e_0 rapidsai
cupy 9.2.0 py38hc350bd8_0 conda-forge
cycler 0.10.0 py_2 conda-forge
cytoolz 0.11.0 py38h497a2fe_3 conda-forge
dask 2021.5.1 pyhd8ed1ab_0 conda-forge
dask-core 2021.5.1 pyhd8ed1ab_0 conda-forge
dask-cudf 21.06.01 py38_g101fc0fda4_2 rapidsai
decorator 5.0.9 pyhd8ed1ab_0 conda-forge
defusedxml 0.7.1 pyhd8ed1ab_0 conda-forge
distributed 2021.5.1 py38h578d9bd_0 conda-forge
dlpack 0.5 h9c3ff4c_0 conda-forge
entrypoints 0.3 pyhd8ed1ab_1003 conda-forge
faiss-proc 1.0.0 cuda rapidsai
fastavro 1.4.2 py38h497a2fe_0 conda-forge
fastrlock 0.6 py38h709712a_1 conda-forge
freetype 2.10.4 h0708190_1 conda-forge
fsspec 2021.6.1 pyhd8ed1ab_0 conda-forge
gflags 2.2.2 he1b5a44_1004 conda-forge
glog 0.5.0 h48cff8f_0 conda-forge
gpustat 0.6.0 pypi_0 pypi
grpc-cpp 1.38.1 h36ce80c_0 conda-forge
heapdict 1.0.1 py_0 conda-forge
icu 68.1 h58526e2_0 conda-forge
idna 2.10 pyh9f0ad1d_0 conda-forge
importlib-metadata 4.6.0 py38h578d9bd_0 conda-forge
ipykernel 5.5.5 py38hd0cf306_0 conda-forge
ipympl 0.7.0 pyhd8ed1ab_0 conda-forge
ipython 7.25.0 py38hd0cf306_1 conda-forge
ipython_genutils 0.2.0 py_1 conda-forge
ipywidgets 7.6.3 pyhd3deb0d_0 conda-forge
jbig 2.1 h7f98852_2003 conda-forge
jedi 0.18.0 py38h578d9bd_2 conda-forge
jinja2 3.0.1 pyhd8ed1ab_0 conda-forge
joblib 1.0.1 pyhd8ed1ab_0 conda-forge
jpeg 9d h36c2ea0_0 conda-forge
json5 0.9.5 pyh9f0ad1d_0 conda-forge
jsonschema 3.2.0 pyhd8ed1ab_3 conda-forge
jupyter_client 6.1.12 pyhd8ed1ab_0 conda-forge
jupyter_core 4.7.1 py38h578d9bd_0 conda-forge
jupyter_server 1.9.0 pyhd8ed1ab_0 conda-forge
jupyterlab 3.0.16 pyhd8ed1ab_0 conda-forge
jupyterlab_pygments 0.1.2 pyh9f0ad1d_0 conda-forge
jupyterlab_server 2.6.0 pyhd8ed1ab_0 conda-forge
jupyterlab_widgets 1.0.0 pyhd8ed1ab_1 conda-forge
kiwisolver 1.3.1 py38h1fd1430_1 conda-forge
krb5 1.19.1 hcc1bbae_0 conda-forge
lcms2 2.12 hddcbb42_0 conda-forge
ld_impl_linux-64 2.35.1 hea4e1c9_2 conda-forge
lerc 2.2.1 h9c3ff4c_0 conda-forge
libblas 3.9.0 9_openblas conda-forge
libcblas 3.9.0 9_openblas conda-forge
libcudf 21.06.01 cuda11.0_g101fc0fda4_2 rapidsai
libcuml 21.06.02 cuda11.0_g7dfbf8d9e_0 rapidsai
libcumlprims 21.06.00 cuda11.0_gfda2e6c_0 nvidia
libcurl 7.77.0 h2574ce0_0 conda-forge
libdeflate 1.7 h7f98852_5 conda-forge
libedit 3.1.20191231 he28a2e2_2 conda-forge
libev 4.33 h516909a_1 conda-forge
libevent 2.1.10 hcdb4288_3 conda-forge
libfaiss 1.7.0 cuda110h8045045_8_cuda conda-forge
libffi 3.3 h58526e2_2 conda-forge
libgcc-ng 9.3.0 h2828fa1_19 conda-forge
libgfortran-ng 9.3.0 hff62375_19 conda-forge
libgfortran5 9.3.0 hff62375_19 conda-forge
libgomp 9.3.0 h2828fa1_19 conda-forge
libhwloc 2.3.0 h5e5b7d1_1 conda-forge
libiconv 1.16 h516909a_0 conda-forge
liblapack 3.9.0 9_openblas conda-forge
libllvm10 10.0.1 he513fc3_3 conda-forge
libnghttp2 1.43.0 h812cca2_0 conda-forge
libopenblas 0.3.15 pthreads_h8fe5266_1 conda-forge
libpng 1.6.37 h21135ba_2 conda-forge
libprotobuf 3.16.0 h780b84a_0 conda-forge
librmm 21.06.00 cuda11.0_gee432a0_0 rapidsai
libsodium 1.0.18 h36c2ea0_1 conda-forge
libssh2 1.9.0 ha56f1ee_6 conda-forge
libstdcxx-ng 9.3.0 h6de172a_19 conda-forge
libthrift 0.14.2 he6d91bd_1 conda-forge
libtiff 4.3.0 hf544144_1 conda-forge
libutf8proc 2.6.1 h7f98852_0 conda-forge
libwebp-base 1.2.0 h7f98852_2 conda-forge
libxml2 2.9.12 h72842e0_0 conda-forge
llvmlite 0.36.0 py38h4630a5e_0 conda-forge
locket 0.2.0 py_2 conda-forge
lz4-c 1.9.3 h9c3ff4c_0 conda-forge
markupsafe 2.0.1 py38h497a2fe_0 conda-forge
matplotlib-base 3.4.2 py38hcc49a3a_0 conda-forge
matplotlib-inline 0.1.2 pyhd8ed1ab_2 conda-forge
mistune 0.8.4 py38h497a2fe_1004 conda-forge
msgpack-python 1.0.2 py38h1fd1430_1 conda-forge
nbclassic 0.3.1 pyhd8ed1ab_1 conda-forge
nbclient 0.5.3 pyhd8ed1ab_0 conda-forge
nbconvert 6.1.0 py38h578d9bd_0 conda-forge
nbformat 5.1.3 pyhd8ed1ab_0 conda-forge
nccl 2.9.9.1 h96e36e3_0 conda-forge
ncurses 6.2 h58526e2_4 conda-forge
nest-asyncio 1.5.1 pyhd8ed1ab_0 conda-forge
notebook 6.4.0 pyha770c72_0 conda-forge
numba 0.53.1 py38h8b71fd7_1 conda-forge
numpy 1.21.0 py38h9894fe3_0 conda-forge
nvidia-ml-py3 7.352.0 pypi_0 pypi
nvtx 0.2.3 py38h497a2fe_0 conda-forge
olefile 0.46 pyh9f0ad1d_1 conda-forge
openjpeg 2.4.0 hb52868f_1 conda-forge
openssl 1.1.1k h7f98852_0 conda-forge
orc 1.6.8 h58a87f1_0 conda-forge
packaging 20.9 pyh44b312d_0 conda-forge
pandas 1.2.5 py38h1abd341_0 conda-forge
pandoc 2.14.0.3 h7f98852_0 conda-forge
pandocfilters 1.4.2 py_1 conda-forge
parquet-cpp 1.5.1 2 conda-forge
parso 0.8.2 pyhd8ed1ab_0 conda-forge
partd 1.2.0 pyhd8ed1ab_0 conda-forge
pexpect 4.8.0 pyh9f0ad1d_2 conda-forge
pickleshare 0.7.5 py_1003 conda-forge
pillow 8.2.0 py38ha0e1e83_1 conda-forge
pip 21.1.3 pyhd8ed1ab_0 conda-forge
prometheus_client 0.11.0 pyhd8ed1ab_0 conda-forge
prompt-toolkit 3.0.19 pyha770c72_0 conda-forge
protobuf 3.16.0 py38h709712a_0 conda-forge
psutil 5.8.0 py38h497a2fe_1 conda-forge
ptyprocess 0.7.0 pyhd3deb0d_0 conda-forge
pyarrow 1.0.1 py38hb53058b_42_cuda conda-forge
pycparser 2.20 pyh9f0ad1d_2 conda-forge
pygments 2.9.0 pyhd8ed1ab_0 conda-forge
pyopenssl 20.0.1 pyhd8ed1ab_0 conda-forge
pyparsing 2.4.7 pyh9f0ad1d_0 conda-forge
pyrsistent 0.17.3 py38h497a2fe_2 conda-forge
pysocks 1.7.1 py38h578d9bd_3 conda-forge
python 3.8.10 h49503c6_1_cpython conda-forge
python-dateutil 2.8.1 py_0 conda-forge
python_abi 3.8 2_cp38 conda-forge
pytz 2021.1 pyhd8ed1ab_0 conda-forge
pyyaml 5.4.1 py38h497a2fe_0 conda-forge
pyzmq 22.1.0 py38h2035c66_0 conda-forge
re2 2021.06.01 h9c3ff4c_0 conda-forge
readline 8.1 h46c0cb4_0 conda-forge
requests 2.25.1 pyhd3deb0d_0 conda-forge
requests-unixsocket 0.2.0 py_0 conda-forge
rmm 21.06.00 cuda_11.0_py38_gee432a0_0 rapidsai
s2n 1.0.10 h9b69904_0 conda-forge
scikit-learn 0.24.2 py38hdc147b9_0 conda-forge
scipy 1.7.0 py38h7b17777_0 conda-forge
send2trash 1.7.1 pyhd8ed1ab_0 conda-forge
setuptools 49.6.0 py38h578d9bd_3 conda-forge
six 1.16.0 pyh6c4a22f_0 conda-forge
snappy 1.1.8 he1b5a44_3 conda-forge
sniffio 1.2.0 py38h578d9bd_1 conda-forge
sortedcontainers 2.4.0 pyhd8ed1ab_0 conda-forge
spdlog 1.8.5 h4bd325d_0 conda-forge
sqlite 3.36.0 h9cd32fc_0 conda-forge
tblib 1.7.0 pyhd8ed1ab_0 conda-forge
terminado 0.10.1 py38h578d9bd_0 conda-forge
testpath 0.5.0 pyhd8ed1ab_0 conda-forge
threadpoolctl 2.1.0 pyh5ca1d4c_0 conda-forge
tk 8.6.10 h21135ba_1 conda-forge
toolz 0.11.1 py_0 conda-forge
tornado 6.1 py38h497a2fe_1 conda-forge
traitlets 5.0.5 py_0 conda-forge
treelite 1.3.0 py38hd08a91b_0 conda-forge
treelite-runtime 1.3.0 pypi_0 pypi
typing_extensions 3.10.0.0 pyha770c72_0 conda-forge
ucx 1.9.0+gcd9efd3 cuda11.0_0 rapidsai
ucx-proc 1.0.0 gpu rapidsai
ucx-py 0.20.0 py38_gcd9efd3_0 rapidsai
urllib3 1.26.6 pyhd8ed1ab_0 conda-forge
wcwidth 0.2.5 pyh9f0ad1d_2 conda-forge
webencodings 0.5.1 py_1 conda-forge
websocket-client 0.57.0 py38h578d9bd_4 conda-forge
wheel 0.36.2 pyhd3deb0d_0 conda-forge
widgetsnbextension 3.5.1 py38h578d9bd_4 conda-forge
xz 5.2.5 h516909a_1 conda-forge
yaml 0.2.5 h516909a_0 conda-forge
zeromq 4.3.4 h9c3ff4c_0 conda-forge
zict 2.0.0 py_0 conda-forge
zipp 3.4.1 pyhd8ed1ab_0 conda-forge
zlib 1.2.11 h516909a_1010 conda-forge
zstd 1.5.0 ha95c52a_0 conda-forge
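
For a shorter summary than the full listing above, the versions of the packages most relevant to this issue can be collected from Python. This is only a sketch: `collect_versions` is a hypothetical helper, and it assumes the packages expose Python distribution metadata. Native conda packages such as libfaiss will likely show up as "not installed" here and still need `conda list`.

```python
from importlib import metadata


def collect_versions(packages):
    """Map each distribution name to its installed version string.

    Names without Python metadata are reported as "not installed".
    """
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = "not installed"
    return versions


def format_versions(versions):
    """Render the mapping as 'name version' lines, one per package."""
    return "\n".join(f"{name} {version}" for name, version in versions.items())
```

For example, `print(format_versions(collect_versions(["cuml", "cudf", "cupy"])))` prints one line per package in the environment it is run in.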

thomasaarholt added the "? - Needs Triage" and "bug" labels Jul 1, 2021
hcho3 removed the "? - Needs Triage" label Jul 2, 2021

thomasaarholt (Author) commented

Having now looked properly at the error message, I think this is related to the faiss library. A similar error is reported in facebookresearch/faiss#1793.

github-actions commented

This issue has been labeled inactive-90d due to no recent activity in the past 90 days. Please close this issue if no further response or action is needed. Otherwise, please respond with a comment indicating any updates or changes to the original issue and/or confirm this issue still needs to be addressed.
