Fast Framework to build Enterprise RAG (Retrieval-Augmented Generation) Pipelines at Scale - powered by watsonx
Welcome to the SuperKnowa GitHub repository! The SuperKnowa framework accelerates Enterprise Generative AI applications so you can get production-ready solutions running quickly on your private data. Here you will find a collection of pluggable components for a range of Generative AI use cases built on Large Language Models (LLMs). Think of these components as building blocks, much like Lego pieces, that you can assemble to address different challenges in AI-driven text generation. They have been battle-tested on private knowledge bases ranging in size from 1M to 200M and scaled to billions of retriever tokens.
The overall pipeline of the SuperKnowa RAG framework & key building blocks:
Configurable components for the SuperKnowa RAG pipeline using a single file:
SuperKnowa is a powerful framework developed using watsonx (watch the video on watsonx.ai here) that harnesses the capabilities of Large Language Models (LLMs) to offer a range of advanced Generative AI use cases. This repository introduces you to the various use cases covered by SuperKnowa.
📖 Learn more about SuperKnowa in our insightful blog post:
Cover Blog - SuperKnowa: Building Enterprise RAG Solutions at Scale https://medium.com/towards-generative-ai/superknowa-simplest-framework-yet-to-swiftly-build-enterprise-rag-solutions-at-scale-ca90b49be28a
Try the SuperKnowa framework with a live application built on the private knowledge base of 1M diverse docs:
https://superknowa.tsglwatson.buildlab.cloud/
(If you don't have an IBM ID, you can create one here: https://www.ibm.com/account/reg/us-en/signup?formid=urx-19776)
To get started, update the config.yaml file and run the LLMQnA.py script to quickly configure your RAG pipeline:
retriever:
  indexName: superknowa
  query: What is IBM Cloud?
  ....
reranker:
  query: What is IBM Data and Analytics Reference Architecture?
  ...
LLMQnA:
  question: What is IBM Data and Analytics Reference Architecture?
  ...
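As a rough illustration of how a single config could drive the three stages, the sketch below uses a plain Python dict shaped like the config.yaml excerpt (a YAML loader such as PyYAML's `yaml.safe_load` would produce a similar dict from the file). The functions `retrieve`, `rerank`, `build_prompt`, and `run_pipeline`, the word-overlap scoring, and the toy `DOCS` corpus are all hypothetical stand-ins, not SuperKnowa's actual API. The real LLMQnA.py may be organized quite differently, and the same query is reused in every section only to keep the example small.

```python
# Hypothetical sketch: stand-in functions, not SuperKnowa's real API.
# The dict mirrors the sections of the config.yaml excerpt; a YAML loader
# would produce a similar structure from the file.
config = {
    "retriever": {"indexName": "superknowa", "query": "What is IBM Cloud?"},
    "reranker": {"query": "What is IBM Cloud?"},
    "LLMQnA": {"question": "What is IBM Cloud?"},
}

# Toy in-memory "index" (made-up documents for illustration only).
DOCS = {
    "superknowa": [
        "IBM Cloud is a cloud computing platform from IBM.",
        "Watson Discovery is a search service.",
        "A reference architecture describes a standard design.",
    ]
}


def _tokens(text):
    """Lowercase, split on whitespace, and strip basic punctuation."""
    return {word.strip("?.,!").lower() for word in text.split()}


def retrieve(index_name, query, top_k=2):
    """Return the top_k documents sharing the most words with the query."""
    query_tokens = _tokens(query)
    docs = DOCS[index_name]
    ranked = sorted(docs, key=lambda d: len(query_tokens & _tokens(d)), reverse=True)
    return ranked[:top_k]


def rerank(query, passages):
    """Reorder passages by the fraction of their words found in the query."""
    query_tokens = _tokens(query)

    def score(passage):
        passage_tokens = _tokens(passage)
        return len(query_tokens & passage_tokens) / len(passage_tokens)

    return sorted(passages, key=score, reverse=True)


def build_prompt(question, passages):
    """Assemble the context-plus-question prompt that would go to an LLM."""
    context = "\n".join(passages)
    return f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"


def run_pipeline(cfg):
    """Run retriever -> reranker -> prompt construction from one config."""
    passages = retrieve(cfg["retriever"]["indexName"], cfg["retriever"]["query"])
    ranked = rerank(cfg["reranker"]["query"], passages)
    return build_prompt(cfg["LLMQnA"]["question"], ranked)


prompt = run_pipeline(config)
print(prompt)
```

Keeping each stage behind its own config section means a stage's settings can be changed without touching the others, which is the idea behind configuring the pipeline from one file.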
To explore SuperKnowa's features and capabilities, refer to the blog series, code examples, and resources provided in this repository.
For detailed instructions and examples, navigate to each component's directory. Unleash the potential of Large Language Models in your projects using SuperKnowa's Generative AI Lego Components!
Let's unlock the potential of Generative AI with SuperKnowa and shape the future of AI-powered knowledge processing!
- Watson Discovery
- Fine Tuning LLAMA2 7B using QLoRA
- Capture Human Feedback
- Admin Dashboard for AI Alignment Results
Measure the alignment of AI models on helpfulness, harmfulness, and accuracy by capturing human feedback.
Build online and offline evaluation experiments and compare the AI alignment results in an interactive dashboard.
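One way to picture how captured feedback might roll up into alignment metrics is the sketch below. The record layout (a `model` name plus 1-5 ratings for `helpfulness`, `harmfulness`, and `accuracy`), the function `summarize_feedback`, and the sample values are assumptions made for illustration; they are not SuperKnowa's actual feedback schema.

```python
from collections import defaultdict
from statistics import mean

METRICS = ("helpfulness", "harmfulness", "accuracy")

# Made-up feedback records; the schema is an assumption for illustration.
feedback = [
    {"model": "model-a", "helpfulness": 4, "harmfulness": 1, "accuracy": 5},
    {"model": "model-a", "helpfulness": 5, "harmfulness": 1, "accuracy": 4},
    {"model": "model-b", "helpfulness": 3, "harmfulness": 2, "accuracy": 3},
]


def summarize_feedback(records):
    """Average each alignment metric per model."""
    grouped = defaultdict(lambda: defaultdict(list))
    for record in records:
        for metric in METRICS:
            grouped[record["model"]][metric].append(record[metric])
    return {
        model: {metric: mean(values) for metric, values in metrics.items()}
        for model, metrics in grouped.items()
    }


summary = summarize_feedback(feedback)
for model, scores in summary.items():
    print(model, scores)
```

Per-model averages like these are the kind of figures a dashboard could then plot side by side.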
The Eval_Package is a tool for evaluating the performance of an LLM (Large Language Model) on a dataset containing questions, context, and ideal answers. It lets you run evaluations on various datasets and assess how well the model generates answers, using statistical metrics such as BLEU and ROUGE.
- Evaluate LLMs on custom datasets: Use the Eval_Package to assess the performance of your model on datasets of your choice.
- Measure model accuracy: The package provides metrics to gauge the accuracy of the model-generated answers against the ideal answers.
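To make the idea of scoring generated answers against ideal answers concrete, here is a minimal sketch of one such metric, a token-overlap F1 score, averaged over a small dataset. This is not the Eval_Package's code: the field names `question`, `context`, and `ideal_answer`, the functions `token_f1` and `evaluate`, and the `echo_context` stand-in model are assumptions for illustration.

```python
from collections import Counter


def token_f1(prediction, reference):
    """Token-overlap F1 between a generated answer and an ideal answer."""
    pred_tokens = prediction.lower().split()
    ref_tokens = reference.lower().split()
    if not pred_tokens or not ref_tokens:
        # Both empty counts as a match; exactly one empty counts as a miss.
        return float(pred_tokens == ref_tokens)
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)


def evaluate(dataset, generate):
    """Average token F1 of generate(question, context) over the dataset."""
    scores = [
        token_f1(generate(row["question"], row["context"]), row["ideal_answer"])
        for row in dataset
    ]
    return sum(scores) / len(scores)


# Made-up rows; a real dataset would be loaded from a file.
dataset = [
    {"question": "What color is the sky?", "context": "the sky is blue",
     "ideal_answer": "the sky is blue"},
    {"question": "Where is Paris?", "context": "paris is in france",
     "ideal_answer": "france"},
]


def echo_context(question, context):
    """Stand-in for a model call: simply returns the context as the answer."""
    return context


average_f1 = evaluate(dataset, echo_context)
print(f"average token F1: {average_f1:.2f}")
```

Swapping `echo_context` for a function that calls a real model would leave the scoring loop unchanged, which is what makes it easy to compare models on the same dataset.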
The MLflow_Package is a comprehensive toolkit designed to integrate the results from the Eval_Package and efficiently track and manage experiments. It also enables you to create a leaderboard for evaluation comparisons and visualize metrics through a dashboard.
- Supported statistical metrics: BLEU, METEOR, ROUGE, SentenceSim Score, SimHash Score, Perplexity Score, BLEURT Score, F1 Score, and BERT Score.
- Experiment tracking: Utilize MLflow to keep a record of experiments, including parameters, metrics, and model artifacts generated during evaluations.
- Leaderboard creation: The package allows you to create a leaderboard, making it easy to compare the performance of different Models across multiple datasets.
- Metric visualization: Generate insightful charts and graphs through the dashboard, allowing you to visualize and analyze evaluation metrics easily.
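The leaderboard idea above can be sketched in a few lines: collect one score per model and dataset, average per model, and sort. The example uses plain tuples with made-up values in place of a tracking store, and `build_leaderboard` is a hypothetical helper, not part of the MLflow_Package. In an MLflow-backed setup, values like these would typically be recorded per run with `mlflow.log_metric`.

```python
from collections import defaultdict
from statistics import mean

# Made-up evaluation results: (model, dataset, metric name, value).
results = [
    ("model-a", "dataset-1", "rouge", 0.50),
    ("model-a", "dataset-2", "rouge", 0.70),
    ("model-b", "dataset-1", "rouge", 0.65),
    ("model-b", "dataset-2", "rouge", 0.75),
]


def build_leaderboard(rows, metric):
    """Rank models by their mean score for one metric across datasets."""
    per_model = defaultdict(list)
    for model, _dataset, name, value in rows:
        if name == metric:
            per_model[model].append(value)
    board = [(model, mean(values)) for model, values in per_model.items()]
    # Highest mean score first.
    return sorted(board, key=lambda entry: entry[1], reverse=True)


leaderboard = build_leaderboard(results, "rouge")
for rank, (model, score) in enumerate(leaderboard, start=1):
    print(f"{rank}. {model}: {score:.2f}")
```

Averaging across datasets is only one possible ranking rule; a per-dataset leaderboard or a weighted average would follow the same pattern.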
Below is a list of Generative AI use cases built using the SuperKnowa framework.
Engage in natural language conversations with SuperKnowa's conversational Question & Answer (Q&A) system. Ask questions based on the private enterprise knowledge base, and receive detailed, context-aware responses.
Leverage SuperKnowa's "Ask your documents" feature to unlock the potential of your PDFs and text documents. SuperKnowa can help you extract relevant information, answer specific questions, and assist in information retrieval.
Generate coherent and informative summaries across large text corpora with SuperKnowa's summarization feature, which uses FlanT5 and UL2. It extracts the main points and essential details from articles, reports, and other texts, allowing for efficient content comprehension.
SuperKnowa's abstractive summarization feature goes beyond simple extraction, using FlanUL2 and LLAMA2. It can analyze lengthy PDF documents and generate concise abstractive summaries that capture the essence of the content. Additionally, SuperKnowa identifies key points, making it easier to comprehend and communicate complex information.
Experience the power of SuperKnowa's Text-to-SQL capability, which transforms natural language queries into structured SQL queries. Interact with databases using plain language, eliminating the need for expertise in SQL.
Created & Architected By
- Kunal Sawarkar, Chief Data Scientist
Builders
- Shivam Solanki, Senior Advisory Data Scientist
- Michael Spriggs, Principal Architect
- Kevin Huang, Sr. ML-Ops Engineer
- Abhilasha Mangal, Senior Data Scientist
- Sahil Desai, Data Scientist
- Amit Khandelwal, Senior Data Scientist
- Himadri Talukder, Senior Software Engineer
- Tyler Benson, Data Scientist
This framework is developed by Build Lab, IBM Ecosystem. Please note that this content is made available to foster Embeddable AI technology adoption and serve ecosystem partners. The content may include systems & methods pending patent with the USPTO and protected under US Patent Laws. SuperKnowa is not a product but a framework built on top of IBM watsonx along with other products such as LLAMA models from Meta & MLflow from Databricks. Using SuperKnowa implicitly requires agreeing to the terms and conditions of those products. This framework is made available on an as-is basis to accelerate Enterprise GenAI application development. If you have any questions, please reach out to kunal@ibm.com.
Copyright © 2023 IBM Corporation.