Support for signed int8 and float16 for HNSW and IVF using Flat and PQ encoders #3014
HNSW and IVF both support fp16, SQ8, and PQ compression; see:
@naveentatikonda while creating the index you need to send … @mdouze, please correct this if that is not the understanding.
@mdouze Thank you for your reply. It is actually hard to understand the encoder and decoder logic in this fp16 quantizer. Can you please confirm: is it supported to provide an already-quantized vector as input, where each dimension is of type fp16? If so, will it quantize this vector further, and does the vector value change?
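For reference, the effect the question asks about can be illustrated without Faiss at all, since the fp16 codec amounts to a float32 → float16 cast on encode and the cast back on decode (a NumPy-only sketch of that behavior; the sample values are made up):

```python
# Illustration: encoding to fp16 moves each component to the nearest
# half-precision value; components already on the fp16 grid round-trip
# unchanged, so re-encoding an fp16-valued vector does not change it again.
import numpy as np

x = np.array([0.1, 0.5, 1.0, 3.14159], dtype=np.float32)

encoded = x.astype(np.float16)        # what fp16 encoding amounts to
decoded = encoded.astype(np.float32)  # what decoding amounts to

# 0.5 and 1.0 are exactly representable in fp16: unchanged.
print(decoded[1] == x[1], decoded[2] == x[2])  # True True

# 0.1 and 3.14159 are not: they shift to the nearest fp16 value.
print(decoded[0] == x[0], decoded[3] == x[3])  # False False

# A second encode/decode pass is a no-op on already-fp16 values.
print(np.array_equal(decoded.astype(np.float16).astype(np.float32), decoded))  # True
```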
@mdouze Did you get a chance to look into my question?
Sorry I did not understand the question. |
@mdouze As of now, we have AVX2 SIMD optimization support for the x86 architecture in ScalarQuantizer for QT_fp16, but we don't have a similar optimization for ARM-based architectures. So I'm working on adding NEON SIMD support for QT_fp16. What are your thoughts on adding NEON optimization support to QT_fp16?
That would be great! There are areas of Faiss that are optimized with NEON but not the scalar quantizer, mainly because SQ was implemented before impl/simdlib.h existed.
… (facebookresearch#3166)

Summary: We have SIMD support for the x86 architecture using AVX2 optimization, but we don't have a similar optimization for the ARM architecture in ScalarQuantizer for the quantization type `QT_fp16`. This PR adds SIMD support for ARM using NEON optimization.

Issues resolved: facebookresearch#3014
Pull Request resolved: facebookresearch#3166
Reviewed By: algoriddle, pemazare
Differential Revision: D52510486
Pulled By: mdouze
fbshipit-source-id: 2bb360475a0d9e0816c8e05b44d7da7f2e6b28c5
Summary
Using the float32 datatype to ingest vectors with IndexHNSW or IndexIVF (with Flat or PQ encoders) is getting expensive in terms of storage and memory, especially for large-scale use cases. Adding support for the signed int8 and fp16 vector datatypes would help reduce this memory footprint.
As of now, Faiss supports uint8 for indexing binary vectors, and the ScalarQuantizer has fp16 support. Do you have any plans to support these datatypes for the other methods and encoders mentioned above?
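The storage savings motivating the request can be sized with simple arithmetic for the raw vector codes (graph and coarse-quantizer overhead excluded; 1M vectors of dimension 128 are illustrative figures, not from the issue):

```python
# Back-of-the-envelope code storage per datatype: bytes = n * d * bytes/component.
n, d = 1_000_000, 128

fp32_bytes = n * d * 4  # float32: 4 bytes per component
fp16_bytes = n * d * 2  # float16: 2 bytes per component (2x smaller)
int8_bytes = n * d * 1  # int8:    1 byte  per component (4x smaller)

for name, b in [("fp32", fp32_bytes), ("fp16", fp16_bytes), ("int8", int8_bytes)]:
    print(f"{name}: {b / 2**20:.0f} MiB")
```

For this example the raw codes shrink from roughly 488 MiB (fp32) to 244 MiB (fp16) and 122 MiB (int8), which is the footprint reduction the issue is asking for.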