Project_Prajit is a production-ready AI project that allows you to ask questions about your documents using the power of Large Language Models (LLMs), even in scenarios without an Internet connection. 100% private, no data leaves your execution environment at any point.
The project provides an API offering all the primitives required to build private, context-aware AI applications. It follows and extends the OpenAI API standard, and supports both normal and streaming responses.
The API is divided into two logical blocks:
High-level API, which abstracts all the complexity of a RAG (Retrieval Augmented Generation) pipeline implementation:
- Ingestion of documents: internally managing document parsing, splitting, metadata extraction, embedding generation and storage.
- Chat & Completions using context from ingested documents: abstracting the retrieval of context, the prompt engineering and the response generation.
Low-level API, which allows advanced users to implement their own complex pipelines:
- Embeddings generation: based on a piece of text.
- Contextual chunks retrieval: given a query, returns the most relevant chunks of text from the ingested documents.
Generative AI is a game changer for our society, but adoption in companies of all sizes and data-sensitive domains like healthcare or legal is limited by a clear concern: privacy. Not being able to ensure that your data is fully under your control when using third-party AI tools is a risk those industries cannot take.
Project_Prajit is now evolving towards becoming a gateway to generative AI models and primitives, including completions, document ingestion, RAG pipelines and other low-level building blocks. We want to make it easier for any developer to build AI applications and experiences, as well as provide a suitable extensive architecture for the community to keep contributing.
Conceptually, Project_Prajit is an API that wraps a RAG pipeline and exposes its primitives.
- The API is built using FastAPI and follows OpenAI's API scheme.
- The RAG pipeline is based on LlamaIndex.
The design of Project_Prajit allows to easily extend and adapt both the API and the RAG implementation. Some key architectural decisions are:
- Dependency Injection, decoupling the different components and layers.
- Usage of LlamaIndex abstractions such as
LLM
,BaseEmbedding
orVectorStore
, making it immediate to change the actual implementations of those abstractions. - Simplicity, adding as few layers and new abstractions as possible.
- Ready to use, providing a full implementation of the API and RAG pipeline.
Main building blocks:
- APIs are defined in
private_gpt:server:<api>
. Each package contains an<api>_router.py
(FastAPI layer) and an<api>_service.py
(the service implementation). Each Service uses LlamaIndex base abstractions instead of specific implementations, decoupling the actual implementation from its usage. - Components are placed in
private_gpt:components:<component>
. Each Component is in charge of providing actual implementations to the base abstractions used in the Services - for exampleLLMComponent
is in charge of providing an actual implementation of anLLM
(for exampleLlamaCPP
orOpenAI
).
Project_Prajit is built make use of:
- Qdrant, providing the default vector database
- Fern, providing Documentation and SDKs
- LlamaIndex, providing the base RAG framework and abstractions
This project has been strongly influenced and supported by other amazing projects like LangChain, GPT4All, LlamaCpp, Chroma and SentenceTransformers.