Python C++ code snippets
It is not always obvious how the C++ and Python layers interact. Therefore, we give some handy code in Python notebooks that can be copy/pasted to perform some useful operations.
They rely mostly on vector_to_array and a few other Python/C++ tricks described here.
The faiss.contrib.inspect_tools module has a few useful functions to inspect the Faiss objects. In particular, inspect_tools.print_object_fields lists all the fields of an object and their values.
Use the function faiss.contrib.inspect_tools.get_LinearTransform_matrix, or see this code: get_matrix_from_PCA.ipynb.
This applies to any LinearTransform object.
For PQ: see access_PQ_centroids.ipynb.
For RQ: see demo_replace_RQ_codebooks.ipynb
Use the function faiss.contrib.inspect_tools.get_invlist, or see this code: get_invlists.ipynb
This does not require C++ magic. See #3555
See this code snippet: demo_hnsw_struct.ipynb alternative rendering.
See demo_access_nndescent.ipynb
See demo_merge_array_invertedlists.ipynb
We have an index file but don't know what's in it.
When accessing the Index fields of a wrapper index, they show up as a plain Index object.
The downcast_index function converts this plain index to the "leaf" class the index belongs to.
This snippet demonstrates how to use downcast_index to extract all info from it: demo_explore_indedex.ipynb
IDMap2 inherits from IDMap, so this code works for both.
This code works for both directions: convert_idmap2_idmap.ipynb
See assign_on_gpu.ipynb.
plus: how to do this for IVF training
See initial_centroids_demo.ipynb
See https://github.com/facebookresearch/faiss/issues/2455
See demo_replace_invlists.ipynb
You need a BitStringReader, see #2285
IndexPQ is not supported on GPU, but it is relatively easy to simulate it with an IVFPQ.
The data is stored in a storage index, which is an IndexFlatCodes.
demo_access_NSG_data.ipynb
To get the reconstructed vectors, use index2.reconstruct(vector_id) or index2.reconstruct_n().
Sometimes it is useful to implement a small callback needed by Faiss in C++. However, it may be too specific or depend on external code, so it does not make sense to include it in Faiss (and Faiss is hard to compile ;-) )
In that case, you can make a SWIG wrapper for a snippet of C++.
Here is an example for an IDSelector object that has an is_member callback: bow_id_selector.swig
To compile the code with Faiss installed via conda and SWIG 4.x on Linux:
# generate wrapper code
swig -c++ -python -I$CONDA_PREFIX/include bow_id_selector.swig
# compile generated wrapper code:
g++ -shared -O3 -g -fPIC bow_id_selector_wrap.cxx -o _bow_id_selector.so \
-I $( python -c "import sysconfig ; print(sysconfig.get_paths()['include'])" ) \
-I $CONDA_PREFIX/include $CONDA_PREFIX/lib/libfaiss_avx2.so
This produces bow_id_selector.py and _bow_id_selector.so, which can be loaded in Python with:
import numpy as np
import faiss
import bow_id_selector
# very small sparse CSR matrix
n = 3
indptr = np.array([0, 2, 3, 6], dtype='int32')
indices = np.array([7, 8, 3, 1, 2, 3], dtype='int32')
# don't forget swig_ptr to convert from a numpy array to a C++ pointer
selector = bow_id_selector.IDSelectorBOW(n, faiss.swig_ptr(indptr), faiss.swig_ptr(indices))
selector.set_query_words(1, 2)
selector.is_member(0) # returns False
selector.is_member(1) # returns False
selector.is_member(2) # returns True
selector.is_member(3) # crashes!
# And of course you can combine it with existing Faiss objects
params = faiss.SearchParameters(sel=selector)