
Support for signed int8 and float16 for HNSW and IVF using Flat and PQ encoders #3014

Closed
naveentatikonda opened this issue Aug 18, 2023 · 8 comments

Comments

naveentatikonda (Contributor) commented Aug 18, 2023

Summary

Using the float32 datatype to ingest vectors with IndexHNSW or IndexIVF (with Flat or PQ encoders) gets expensive in terms of storage and memory, especially for large-scale use cases. Supporting signed int8 and fp16 vector datatypes would cut this memory footprint in half or in quarter: for example, 100 million 768-dimensional vectors take roughly 307 GB in fp32, but about 154 GB in fp16 and 77 GB in int8.

As of now, Faiss supports uint8 for indexing binary vectors, and the Scalar Quantizer has fp16 support. Are there any plans to support these datatypes for the other methods and encoders mentioned above?

mdouze (Contributor) commented Aug 18, 2023

HNSW and IVF both support fp16, SQ8, and PQ compression; see
https://github.com/facebookresearch/faiss/wiki/The-index-factory
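
For concreteness, here is a minimal C++ sketch (not from the thread) of the factory strings this refers to; the dimension and the parameters 32, 1024, and 16 are illustrative choices, not recommendations:

```cpp
// Sketch: building HNSW and IVF indexes with fp16, 8-bit scalar, and PQ
// compression via the index factory. Dimension and parameters are examples.
#include <faiss/Index.h>
#include <faiss/index_factory.h>
#include <memory>

int main() {
    int d = 128; // vector dimension (assumed)

    // HNSW graph (32 neighbors per node) storing vectors as fp16
    std::unique_ptr<faiss::Index> hnsw_fp16(
            faiss::index_factory(d, "HNSW32,SQfp16"));

    // IVF with 1024 lists and 8-bit scalar quantization (1 byte/component)
    std::unique_ptr<faiss::Index> ivf_sq8(
            faiss::index_factory(d, "IVF1024,SQ8"));

    // IVF with product quantization: 16 sub-quantizers of 8 bits each
    std::unique_ptr<faiss::Index> ivf_pq(
            faiss::index_factory(d, "IVF1024,PQ16"));

    return 0;
}
```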

navneet1v commented Aug 22, 2023

@naveentatikonda while creating the index you need to pass SQfp16 instead of Flat (which stores full-precision fp32) to store the floats as 16-bit.

@mdouze please correct me if that is not the right understanding.

naveentatikonda (Contributor, Author):

@mdouze Thank you for your reply. It is hard to understand the encoder and decoder logic in this fp16 quantizer. Can you please confirm: if we provide an already-quantized vector as input, where each dimension is of type fp16, and try to ingest it using HNSW with the SQfp16 encoder, will Faiss support it?

If it is supported, will it quantize the vector further, and will the vector values change?

naveentatikonda (Contributor, Author):

@mdouze Did you get a chance to look into my question?

mdouze (Contributor) commented Sep 5, 2023

Sorry, I did not understand the question.
If you use SQfp16, the internal representation will be fp16.
You cannot provide a pre-quantized vector; the index does the quantization at add time.
Is that clear?
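
A minimal sketch of that behavior (assumptions: dimension 16, random data; train() is a no-op for QT_fp16 but is kept for generality): the caller always passes float32, and reconstructing the stored vector exposes the fp16 rounding.

```cpp
// Sketch: the index receives float32 input and rounds it to fp16 at add
// time; reconstruct() decodes the stored codes back to float32.
#include <faiss/Index.h>
#include <faiss/index_factory.h>
#include <cstdio>
#include <memory>
#include <random>
#include <vector>

int main() {
    int d = 16; // assumed dimension
    std::vector<float> x(d);
    std::mt19937 rng(123);
    std::uniform_real_distribution<float> u(0.0f, 1.0f);
    for (float& v : x) v = u(rng);

    std::unique_ptr<faiss::Index> index(
            faiss::index_factory(d, "HNSW32,SQfp16"));
    index->train(1, x.data()); // no-op for fp16; needed for other SQ types
    index->add(1, x.data());   // fp16 rounding happens here

    std::vector<float> rec(d);
    index->reconstruct(0, rec.data()); // decode the stored fp16 codes
    for (int i = 0; i < d; i++)
        std::printf("%2d: in=%.8f stored=%.8f\n", i, x[i], rec[i]);
    return 0;
}
```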

naveentatikonda (Contributor, Author):

@mdouze As of now, the ScalarQuantizer has AVX2 SIMD optimizations for QT_fp16 on the x86 architecture, but there is no equivalent optimization for ARM-based architectures. So I'm working on adding NEON SIMD support for QT_fp16. What are your thoughts on adding NEON optimization for QT_fp16?

mdouze (Contributor) commented Nov 29, 2023

That would be great! There are areas of Faiss that are optimized with NEON, but the scalar quantizer is not one of them. This is mainly because SQ was implemented before impl/simdlib.h existed.
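
For illustration, here is a sketch of the kind of NEON kernel such a change involves (not the actual Faiss implementation; the function names are hypothetical, and it assumes an AArch64 toolchain where the fp16 conversion intrinsics are available):

```cpp
// Sketch: QT_fp16 encode/decode is essentially a float32 <-> float16
// conversion, which NEON can perform four lanes at a time.
#include <arm_neon.h>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: round n float32 values to raw fp16 codes.
void encode_fp16_neon(const float* x, uint16_t* codes, size_t n) {
    size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        float32x4_t v = vld1q_f32(x + i);             // load 4 float32
        float16x4_t h = vcvt_f16_f32(v);              // round to 4 float16
        vst1_u16(codes + i, vreinterpret_u16_f16(h)); // store raw 16-bit codes
    }
    for (; i < n; i++) // scalar tail when n is not a multiple of 4
        codes[i] = vget_lane_u16(
                vreinterpret_u16_f16(vcvt_f16_f32(vdupq_n_f32(x[i]))), 0);
}

// Hypothetical helper: widen n fp16 codes back to float32.
void decode_fp16_neon(const uint16_t* codes, float* x, size_t n) {
    size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        float16x4_t h = vreinterpret_f16_u16(vld1_u16(codes + i));
        vst1q_f32(x + i, vcvt_f32_f16(h)); // widen 4 float16 to float32
    }
    for (; i < n; i++) {
        float16x4_t h = vreinterpret_f16_u16(vdup_n_u16(codes[i]));
        x[i] = vgetq_lane_f32(vcvt_f32_f16(h), 0);
    }
}
```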

mdouze (Contributor) commented Nov 30, 2023

Yes sure, NEON is supported for some operations but not for scalar quantizer. This is mainly because impl/simdlib.h was not implemented when scalar quantization was added to Faiss.

abhinavdangeti pushed a commit to blevesearch/faiss that referenced this issue on Jul 12, 2024: … (facebookresearch#3166)

Summary:
### Description
We have SIMD support for the x86 architecture using AVX2 optimization, but there is no similar optimization for the ARM architecture in the Scalar Quantizer for the quantization type `QT_FP16`. This PR adds SIMD support for ARM using NEON optimization.

### Issues resolved
facebookresearch#3014

Pull Request resolved: facebookresearch#3166

Reviewed By: algoriddle, pemazare

Differential Revision: D52510486

Pulled By: mdouze

fbshipit-source-id: 2bb360475a0d9e0816c8e05b44d7da7f2e6b28c5