Support for signed int8 and float16 for HNSW and IVF using Flat and PQ encoders #3014
HNSW and IVF both support fp16, SQ8, and PQ compression; see:
@naveentatikonda while creating the index you need to send … @mdouze, please correct this if that is not the understanding.
@mdouze Thank you for your reply. It is actually hard to understand the encoder and decoder logic in this fp16 quantizer. Can you please confirm: is it supported to provide an already-quantized vector as input, where each dimension is of type fp16? If so, will it quantize this vector further, and does the vector value change?
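For reference, the effect the question asks about can be illustrated without Faiss at all, since the fp16 codec amounts to a float32 → float16 cast on encode and the cast back on decode (a NumPy-only sketch of that behavior; the sample values are made up):

```python
# Illustration: encoding to fp16 moves each component to the nearest
# half-precision value; components already on the fp16 grid round-trip
# unchanged, so re-encoding an fp16-valued vector does not change it again.
import numpy as np

x = np.array([0.1, 0.5, 1.0, 3.14159], dtype=np.float32)

encoded = x.astype(np.float16)        # what fp16 encoding amounts to
decoded = encoded.astype(np.float32)  # what decoding amounts to

# 0.5 and 1.0 are exactly representable in fp16: unchanged.
print(decoded[1] == x[1], decoded[2] == x[2])  # True True

# 0.1 and 3.14159 are not: they shift to the nearest fp16 value.
print(decoded[0] == x[0], decoded[3] == x[3])  # False False

# A second encode/decode pass is a no-op on already-fp16 values.
print(np.array_equal(decoded.astype(np.float16).astype(np.float32), decoded))  # True
```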
@mdouze Did you get a chance to look into my question?
Sorry I did not understand the question. |
@mdouze As of now, we have AVX2 SIMD optimization support for the x86 architecture in ScalarQuantizer for QT_fp16, but we don't have a similar optimization for ARM-based architectures. So I'm working on adding NEON SIMD support for QT_fp16. What are your thoughts on adding NEON optimization support to QT_fp16?
That would be great! There are areas of Faiss that are optimized with NEON but not the scalar quantizer, mainly because SQ was implemented before impl/simdlib.h existed.
… (facebookresearch#3166)

Summary: We have SIMD support for the x86 architecture using AVX2 optimization, but we don't have a similar optimization for the ARM architecture in ScalarQuantizer for the quantization type `QT_fp16`. This PR adds SIMD support for ARM using NEON optimization.

Issues resolved: facebookresearch#3014
Pull Request resolved: facebookresearch#3166
Reviewed By: algoriddle, pemazare
Differential Revision: D52510486
Pulled By: mdouze
fbshipit-source-id: 2bb360475a0d9e0816c8e05b44d7da7f2e6b28c5
Summary
Using the float32 datatype to ingest vectors with IndexHNSW or IndexIVF (with Flat or PQ encoders) is getting expensive in terms of storage and memory, especially for large-scale use cases. Adding support for the signed int8 and fp16 vector datatypes would help reduce this memory footprint.
As of now, Faiss supports uint8 for indexing binary vectors, and the ScalarQuantizer has fp16 support. Do you have any plans to support these datatypes for the other methods and encoders mentioned above?
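The storage savings motivating the request can be sized with simple arithmetic for the raw vector codes (graph and coarse-quantizer overhead excluded; 1M vectors of dimension 128 are illustrative figures, not from the issue):

```python
# Back-of-the-envelope code storage per datatype: bytes = n * d * bytes/component.
n, d = 1_000_000, 128

fp32_bytes = n * d * 4  # float32: 4 bytes per component
fp16_bytes = n * d * 2  # float16: 2 bytes per component (2x smaller)
int8_bytes = n * d * 1  # int8:    1 byte  per component (4x smaller)

for name, b in [("fp32", fp32_bytes), ("fp16", fp16_bytes), ("int8", int8_bytes)]:
    print(f"{name}: {b / 2**20:.0f} MiB")
```

For this example the raw codes shrink from roughly 488 MiB (fp32) to 244 MiB (fp16) and 122 MiB (int8), which is the footprint reduction the issue is asking for.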