A multimodal expert-assistant GPT platform built on RAG and agents. It integrates tools for modalities such as text, images, and audio, and supports local deployment and private database construction.

project_display.mp4

💡 1 RoadMap

1 Basic Function

Single/multi turn chat
Multimodal information display and interaction
Agent
Tools
- Web searching
- Image generation
- Image caption
- Audio-to-text
- Text-to-audio
- Video caption
RAG
- Private database
- Offline deployment

2 Supported Modalities

text
image
audio
video

3 Model Interface API

ChatGPT
Dalle
Google-Search
BLIP

👨‍💻 2 Development

Project technology stack: Python + torch + langchain + gradio

⚡ 2.1 Installation

Create a virtual environment in Anaconda:

conda create -n agent python=3.10

Activate the virtual environment and install the dependencies:

conda activate agent

pip install -r ./requirements.txt

Install the BLIP model locally: open the BLIP website and download all files to Models/BLIP.
Following the prompts, set the key for each API you need in the .env file.
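Before launching, it can help to confirm that the keys in .env are actually set. The following is a minimal sketch that uses only the standard library and assumes .env holds plain KEY=VALUE lines. The key names in the usage note are placeholders, not names confirmed by this project, and the project itself may load .env differently.

```python
import os
from pathlib import Path


def load_env(path=".env"):
    """Parse KEY=VALUE lines from a .env file into a dict and os.environ."""
    values = {}
    env_file = Path(path)
    if not env_file.exists():
        return values
    for raw in env_file.read_text(encoding="utf-8").splitlines():
        line = raw.strip()
        # Skip blank lines, comments, and lines without an assignment.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        key = key.strip()
        # Drop surrounding whitespace and optional quotes around the value.
        value = value.strip().strip("'\"")
        values[key] = value
        # Do not overwrite variables already set in the shell.
        os.environ.setdefault(key, value)
    return values


def missing_keys(required, env=None):
    """Return the names in `required` that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in required if not env.get(name)]
```

For example, `missing_keys(["EXAMPLE_KEY_A", "EXAMPLE_KEY_B"], load_env(".env"))` returns the placeholder names that still need a value. Replace them with the key names the .env file actually asks for.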

💻 2.2 Demo

Multi Agent GPT provides a web UI. Run web.py to launch the agents and start chatting:

python ./web.py

The program prints a local URL: http://XXX. Open it in a local browser to see the UI.

📻 2.3 News

1 Chat_with_Image

With the BLIP model integrated, agents can interpret image content and use it in their replies.
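One common way to let a text-only chat model work with images is to caption each image first and place the captions in the prompt. The sketch below illustrates that pattern only. It is an assumption about the general approach, not the code in Tools/ImageCaption.py. `build_image_prompt`, `fake_caption`, and the example image path are hypothetical, and `fake_caption` stands in for a real BLIP call.

```python
from typing import Callable, List


def build_image_prompt(
    question: str,
    image_paths: List[str],
    caption_fn: Callable[[str], str],
) -> str:
    """Prepend one caption line per image to the user's question."""
    lines = []
    for index, path in enumerate(image_paths, start=1):
        # caption_fn is any callable that maps an image path to a caption.
        lines.append(f"Image {index} ({path}): {caption_fn(path)}")
    lines.append(f"User question: {question}")
    return "\n".join(lines)


def fake_caption(path: str) -> str:
    # Hypothetical stand-in for a BLIP-based captioner; returns a fixed string.
    return "a dog running on a beach"


if __name__ == "__main__":
    # The image path is a made-up example.
    print(build_image_prompt("What animal is this?", ["Imgs/Show/example.jpg"], fake_caption))
```

Passing the captioner in as a function keeps the prompt-building step testable without loading the model.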

🗄️ 3 Structure

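- .env
- Agents/
  - openai_agents.py     # defines the GPT-3.5-based agent
- Database/
- Docs/
- Imgs/
  - Show/                # stores example images
- Models/
  - BLIP/                # large image-understanding model
- Tools/
  - ImageCaption.py      # BLIP-based image-understanding tool
  - ImageGeneration.py   # defines a text-to-image tool based on OpenAI DALL-E
  - search.py            # web search tool based on Google-Search
- Utils/
  - data_io.py
  - stdio.py             # captures the running program's log output, mainly to obtain the agent's verbose information
  - utils_image.py       # helper functions for image processing
  - utils_json.py        # extracts useful fields from existing log output (supports stdio.py)
- python_new_funciton.py # test file used during development
- readme.md
- requirements.txt
- web.py                 # main entry point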
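Utils/stdio.py is described as capturing the program's log output to obtain the agent's verbose information. A minimal sketch of that idea, assuming the verbose trace is printed to standard output, uses `contextlib.redirect_stdout` from the standard library. `run_and_capture` and `noisy_agent` are hypothetical names for illustration, and the actual stdio.py may work differently.

```python
import contextlib
import io


def run_and_capture(func, *args, **kwargs):
    """Call func and return (result, text printed to stdout during the call)."""
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        result = func(*args, **kwargs)
    return result, buffer.getvalue()


def noisy_agent(question):
    # Hypothetical stand-in for an agent running with verbose output enabled.
    print("Thought: I should look this up")
    print("Action: search")
    return f"answer to {question}"


if __name__ == "__main__":
    answer, trace = run_and_capture(noisy_agent, "what is RAG?")
    print(answer)
    print(trace)
```

The captured trace can then be parsed for the fields the UI needs, which is the role the structure list assigns to utils_json.py.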
