AnswerDotAI/rerankers: A lightweight, low-dependency, unified API to use all common reranking and cross-encoder models. #934
Labels
AI-Chatbots
Topics related to advanced chatbot platforms integrating multiple AI models
ai-platform
model hosts and APIs
Git-Repo
Source code repository like gitlab or gh
MachineLearning
ML Models, Training and Inference
ml-inference
Running and serving ML models.
Models
LLM and ML model repos and links
Software2.0
Software development driven by AI and neural networks.
Snippet
README
Why rerankers?
Rerankers are an important part of any retrieval architecture, but they're also often more obscure than other parts of the pipeline.
Sometimes, it can be hard to even know which one to use. Every problem is different, and the best model for use case X is not necessarily the best one for use case Y.
Moreover, new reranking methods keep popping up: for example, RankGPT, using LLMs to rerank documents, appeared just last year, with very promising zero-shot benchmark results.
Each reranking approach also tends to live in its own library, with varying levels of documentation. This raises the barrier to entry even further: new users must juggle multiple unfamiliar input/output formats, each with its own quirks!
rerankers seeks to address this problem by providing a simple API for all popular rerankers, no matter the architecture.
rerankers aims to be:
🪶 Lightweight. It ships with only the bare necessities as dependencies.
📖 Easy-to-understand. There's just a handful of calls to learn, and you can then use the full range of provided reranking models.
🔗 Easy-to-integrate. It should fit in just about any existing pipelines, with only a few lines of code!
💪 Easy-to-expand. Any new reranking model can be added with very little knowledge of the codebase. All you need is a new class with a rank() method mapping a (query, [documents]) input to a RankedResults output.
🐛 Easy-to-debug. This is a beta release and there may be issues, but the codebase is structured so that most issues are easy to track down and fix quickly.
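To make the "Easy-to-expand" contract concrete, here is a minimal, self-contained sketch of what a new reranker class could look like. The Result and RankedResults classes below are simplified dataclass stand-ins for the library's actual pydantic models, and the toy scoring logic is purely illustrative:

```python
from dataclasses import dataclass

# Simplified stand-ins for the library's pydantic models (illustrative only).
@dataclass
class Result:
    doc_id: int
    text: str
    score: float

@dataclass
class RankedResults:
    query: str
    results: list  # assumed sorted by descending score

    def top_k(self, k: int):
        return self.results[:k]

class KeywordOverlapRanker:
    """Toy reranker: scores documents by word overlap with the query."""

    def rank(self, query: str, docs: list) -> RankedResults:
        q_words = set(query.lower().split())
        scored = [
            Result(doc_id=i, text=d, score=len(q_words & set(d.lower().split())))
            for i, d in enumerate(docs)
        ]
        scored.sort(key=lambda r: r.score, reverse=True)
        return RankedResults(query=query, results=scored)

ranker = KeywordOverlapRanker()
res = ranker.rank("fast cpu reranking",
                  ["fast reranking on cpu", "slow gpu training"])
print(res.top_k(1)[0].text)  # -> "fast reranking on cpu"
```

The point is the shape of the interface, not the scoring: any class exposing this rank() signature slots into the same calling code.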
Get Started
Installation is very simple. The core package ships with just two dependencies, tqdm and pydantic, to avoid conflicts with your current environment. You can then install only the dependencies required by the models you want to try out:
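For example (the extras names below are assumed from the project's optional-dependency pattern; check the project's installation docs for the exact list):

```shell
# Core package only (tqdm + pydantic as the sole dependencies)
pip install rerankers

# With extras for specific model families (names assumed; verify against
# the project's installation instructions)
pip install "rerankers[transformers]"   # e.g. cross-encoders, T5, ColBERT
pip install "rerankers[all]"            # everything
```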
Usage
Load any supported reranker in a single line, regardless of the architecture:
rerankers will always try to infer the model you're trying to use from its name, but it's always safer to pass a model_type argument when you can!
Then, regardless of which reranker is loaded, use the loaded model to rank a query against documents:
You don't need to pass doc_ids! If not provided, they'll be auto-generated as integers corresponding to the index of a document in docs.
You're free to pass metadata too, and it'll be stored with the documents. It'll also be accessible in the results object:
If you'd like your code to be a bit cleaner, you can also directly construct Document objects yourself, and pass those instead. In that case, you don't need to pass separate doc_ids and metadata:
You can also use rank_async, which is essentially just a wrapper to turn rank() into a coroutine. The result will be the same:
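The coroutine-wrapper idea can be sketched in isolation. Below, a toy synchronous rank() is wrapped with asyncio.to_thread, one common pattern for exposing a blocking call as a coroutine (the library's actual rank_async may differ in its details; the names here are illustrative):

```python
import asyncio

def rank(query: str, docs: list) -> list:
    # Stand-in for a blocking, synchronous rank() call:
    # here we just sort documents by length as a dummy "score".
    return sorted(docs, key=len)

async def rank_async(query: str, docs: list) -> list:
    # Run the blocking call in a worker thread so the event loop stays free.
    return await asyncio.to_thread(rank, query, docs)

result = asyncio.run(rank_async("query", ["bbb", "a", "cc"]))
print(result)  # -> ['a', 'cc', 'bbb']
```

As the README notes, the result is identical to calling the synchronous version; only the scheduling changes.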
All rerankers will return a RankedResults object, which is a pydantic object containing a list of Result objects and some other useful information, such as the original query. You can retrieve the top k results from it by running top_k():
Result objects are transparent when it comes to accessing the documents they store, as Document objects exist simply as a convenient container for IDs and metadata. If you want a given result's text or metadata, you can access it directly as a property:
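This "transparent" access pattern can be illustrated with a small self-contained sketch: a Result that delegates unknown attribute lookups to the Document it wraps. These are simplified stand-ins for the library's actual pydantic models, not its internals:

```python
from dataclasses import dataclass, field

@dataclass
class Document:
    text: str
    doc_id: int
    metadata: dict = field(default_factory=dict)

@dataclass
class Result:
    document: Document
    score: float

    def __getattr__(self, name):
        # Delegate unknown attributes (text, metadata, ...) to the Document.
        return getattr(self.document, name)

doc = Document(text="Paris is the capital of France.", doc_id=0,
               metadata={"source": "wiki"})
result = Result(document=doc, score=0.97)

print(result.text)                # -> "Paris is the capital of France."
print(result.metadata["source"])  # -> "wiki"
```

Because __getattr__ only fires for attributes not found on Result itself, result.score still resolves normally while document fields pass through.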
And that's all you need to know to get started quickly! Check out the overview notebook for more information on the API and the different models, or the langchain example to see how to integrate this in your langchain pipeline.
Features
Legend:
✅ Supported
🟠 Implemented, but not fully fledged
📍 Not supported but intended to be in the future
⭐ Same as above, but important.
❌ Not supported & not currently planned
Models:
✅ Any standard SentenceTransformer or Transformers cross-encoder
✅ RankGPT (Available both via the original RankGPT implementation and the improved RankLLM one)
✅ T5-based pointwise rankers (InRanker, MonoT5...)
✅ LLM-based pointwise rankers (BAAI/bge-reranker-v2.5-gemma2-lightweight, etc...)
✅ Cohere, Jina, Voyage and MixedBread API rerankers
✅ FlashRank rerankers (ONNX-optimised models, very fast on CPU)
✅ ColBERT-based reranker - not a model initially designed for reranking, but does perform quite strongly in some cases. Implementation is lightweight, based only on transformers.
🟠⭐ RankLLM/RankZephyr: supported by wrapping the rank-llm library! Support for RankZephyr/RankVicuna is untested, but RankLLM with GPT models fully works!
📍 LiT5
Features:
✅ Metadata!
✅ Reranking
✅ Consistency notebooks to ensure performance on SciFact matches the literature for any given model implementation (except RankGPT, where results are harder to reproduce).
✅ ONNX runtime support, offered through FlashRank. In line with the philosophy of the library, we won't reinvent the wheel when @PrithivirajDamodaran is already doing amazing work!
📍 Training on Python >=3.10 (via interfacing with other libraries)
❌(📍Maybe?) Training via rerankers directly
Suggested labels
None