Building a chatbot for your data and pipelines is challenging because they are often too large (e.g., 1,000+ tables) to fit within the LLM context window. Cocoon addresses this by building a retrieval-augmented generation (RAG) layer over your data and pipelines. On top of that RAG layer, Cocoon offers a Cursor-style chatbot for your data tasks.
- 👉 Online Service to clean your uploaded CSV
- 👉 Try this Google Colab Notebook for Data Warehouse RAG
- 👉 Try this Google Colab Notebook for Data Pipeline RAG
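To make the retrieval idea from the introduction concrete, here is a minimal sketch of selecting only the relevant tables before prompting an LLM. It is not Cocoon's implementation: the catalog contents, the function names (`retrieve_tables`, `build_prompt`), and the simple word-overlap scoring are all assumptions chosen for illustration. A real RAG layer would typically use embeddings and richer metadata.

```python
import re
from collections import Counter

# Hypothetical catalog: table name -> short schema description.
# In practice this could hold 1,000+ entries, far too many for one prompt.
CATALOG = {
    "orders": "order id, customer id, order date, total amount",
    "customers": "customer id, name, email, signup date",
    "web_events": "event id, session id, page url, event timestamp",
    "inventory": "product id, warehouse id, quantity on hand",
}


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def retrieve_tables(question, catalog, k=2):
    """Return up to k table names ranked by word overlap with the question."""
    q_tokens = Counter(tokenize(question))
    scored = []
    for name, description in catalog.items():
        doc_tokens = Counter(tokenize(name + " " + description))
        overlap = sum(min(q_tokens[t], doc_tokens[t]) for t in q_tokens)
        scored.append((overlap, name))
    # Highest overlap first; ties broken alphabetically for determinism.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [name for overlap, name in scored[:k] if overlap > 0]


def build_prompt(question, catalog, k=2):
    """Assemble a prompt that contains only the retrieved table schemas."""
    tables = retrieve_tables(question, catalog, k)
    context = "\n".join(f"- {t}: {catalog[t]}" for t in tables)
    return f"Relevant tables:\n{context}\n\nQuestion: {question}"


if __name__ == "__main__":
    print(build_prompt("Which customer has the highest total order amount?", CATALOG))
```

Only the schemas that score above zero reach the prompt, so the context stays small no matter how large the catalog grows.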
Cocoon is available on PyPI. Create a virtual environment, then run:
pip install cocoon_data -U
To get started, you need to connect to:
- LLMs (e.g., GPT-4, Claude-3, Gemini-Ultra, or your local LLMs)
- Data Warehouses (e.g., Snowflake, BigQuery, DuckDB...)
import openai
import snowflake.connector
from cocoon_data import *
# if you use OpenAI GPT-4 (replace the placeholder with your own key)
openai.api_key = 'xycabc'
# if you use Snowflake
con = snowflake.connector.connect(...)
query_widget, cocoon_workflow = create_cocoon_workflow(con)
# a helper widget to query your data warehouse
query_widget.display()
# the main panel to interact with Cocoon
cocoon_workflow.start()
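The snippet above hardcodes the API key for brevity. As a hedged alternative, the key can be read from an environment variable instead. The helper name `load_api_key` and the variable name `OPENAI_API_KEY` are assumptions for this sketch, not part of Cocoon.

```python
import os


def load_api_key(env_var="OPENAI_API_KEY"):
    """Read an API key from the environment, failing early if it is missing."""
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"Set the {env_var} environment variable first.")
    return key


# Usage (assumes the openai package is imported as in the snippet above):
# openai.api_key = load_api_key()
```

This keeps the key out of notebooks that may be shared or committed to version control.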
🎉 You should see the following in a notebook:
We also offer a browser UI, which supports only the chat-over-RAG feature. To launch it:
pip install cocoon_data -U
cocoon_data