Information Retriever for Retrieval Augmented Generation

This repository contains Python scripts that demonstrate the use of a Neural Retriever in a Retrieval Augmented Generation (RAG) pipeline. It includes three implementations of the retriever, using Apache Solr, Elasticsearch, and Watson Discovery as document stores.

Directory Contents

Retriever

Getting Started

  1. Clone this repository.
  2. Install the required dependencies (see the Dependencies section below).
  3. Edit config.yaml so that the retriever points to your service.
  4. Run ProcessElastic.py to see the neural retriever in action.

Usage

After updating config.yaml, run ProcessElastic.py to see the neural retriever in action.

Example scripts and notebook

Each script defines an information retriever (SolrRetriever, ESRetriever, or WDRetriever) that takes a query and returns the top matching documents from the respective document store.

Here's a basic example of how you might use the SolrRetriever:

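# Assumes solr_retriever.py is on the import path
from solr_retriever import SolrRetriever

# Point the retriever at a local Solr instance and collection
retriever = SolrRetriever(solr_url='http://localhost:8983/solr', collection_name='my_collection')
# Fetch the top matching documents for the query
results = retriever.retrieve('What is DataOps?')
print(results)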
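
For readers curious about what such a retriever could look like internally, the following is a minimal sketch only, not the repository's actual implementation. It assumes the class wraps pysolr, that the collection URL is the Solr base URL joined with the collection name, and that the number of results is controlled by a hypothetical top_k parameter. The optional client argument is also an assumption, added so the class can be tried with a stub instead of a live Solr server.

```python
from typing import Any, Dict, List, Optional


class SolrRetriever:
    """Hypothetical sketch; the class in solr_retriever.py may differ."""

    def __init__(self, solr_url: str, collection_name: str,
                 top_k: int = 5, client: Optional[Any] = None) -> None:
        # top_k and client are illustrative parameters, not from the original.
        self.top_k = top_k
        if client is None:
            # Imported lazily so the class can be exercised with a stub client.
            import pysolr
            # pysolr expects the full core/collection URL.
            core_url = f"{solr_url.rstrip('/')}/{collection_name}"
            client = pysolr.Solr(core_url, timeout=10)
        self.client = client

    def retrieve(self, query: str) -> List[Dict[str, Any]]:
        # `rows` caps the number of hits Solr returns.
        results = self.client.search(query, rows=self.top_k)
        # Each hit is a mapping of stored fields; copy into plain dicts.
        return [dict(doc) for doc in results]
```

In this sketch the query string is passed to Solr unchanged, so any escaping of query-syntax characters, as well as any neural re-ranking step, would have to be added separately.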

Dependencies

These scripts require Python 3.6 or later. They also require the following Python libraries:

  • pysolr (for solr_retriever.py)
  • elasticsearch (for es_retriever.ipynb)
  • requests (for wd_retriever.py)

You can install these libraries using pip:

pip install pysolr elasticsearch requests