pro-api

Introduction

This is a project that centrally manages a large model API and can call multiple backend services through OpenAI's API interface format. The project aims to simplify interactions with different AI models and support the calling of multiple backend services.

Supported backend services

Currently supported backend services include: OpenAI, Anthropic, Gemini, Vertex, Cloudflare, DeepBricks, OpenRouter, etc.

Background functions

The background provides query of request logs and query of usage statistics. If the background is not started, only request forwarding is performed and no information is recorded.

Configuration

Using the api.yaml configuration file, multiple models can be configured, and each model can be configured with multiple backend services to support load balancing. The following is an example of the api.yaml configuration file:

api.yaml

providers:
  - provider: openai # Service provider
    name: ZhiPuQingYan # Service name
    base_url: https://open.bigmodel.cn/api/paas/v4 # Service address
    api_key: Please enter your api_key
    model:
      - glm-4-flash # Model name
      

  - provider: gemini
    name: Gemini
    base_url: https://generativelanguage.googleapis.com/v1beta
    api_key: Please enter your API key
    model:
      - gemini-1.5-pro
      - gemini-1.5-flash
      - gemini-1.5-flash: gpt-4o
    balance: # Configure load balancing. No configuration is default to 1
      - gemini-1.5-pro: 1 #Indicates that the model weight under this name is 1
      - gemini-1.5-flash: 1 #Indicates that the model weight under this name is 2
      - gemini-flash: 1 #Indicates that the model weight under this name is 2


  - provider: openai
    name: doubao
    base_url: https://ark.cn-beijing.volces.com/api/v3
    api_key: Please enter your api_key
    model:
      - ep-20240906033439-zrc2x: doubao-pro-128k # You can simplify the model name to doubao-pro-128k
      - ep-20240613130011-c2zgx: doubao-pro-32k
      - ep-20240729175503-5bbf7: moonshot-v1-128k

  - provider: openai
    name: SiliconFlow
    base_url: https://api.siliconflow.cn/v1
    api_key: Please enter your api_key
    model:
      - Qwen/Qwen2-72B-Instruct: qwen2-72b
      - Qwen/Qwen1.5-110B-Chat: qwen1.5-110b
      - deepseek-ai/DeepSeek-V2-Chat: deepseek-chat
      - deepseek-ai/DeepSeek-Coder-V2-Instruct: deepseek-coder
      - Qwen/Qwen2-7B-Instruct: qwen2-7b
      - Qwen/Qwen2-7B-Instruct: gpt-3.5-turbo
      - Qwen/Qwen2-1.5B-Instruct: qwen2-1.5b
      - Qwen/Qwen1.5-7B-Chat: qwen1.5-7b-chat
      - THUDM/glm-4-9b-chat: glm-4-9b-chat
      - THUDM/chatglm3-6b: chatglm3-6b
      - 01-ai/Yi-1.5-9B-Chat-16K: yi-1.5-9b-chat-16k
      - 01-ai/Yi-1.5-6B-Chat: yi-1.5-6b-chat
      - google/gemma-2-9b-it: gemma-2-9b
      - internlm/internlm2_5-7b-chat: internlm-7b-chat
      - meta-llama/Meta-Llama-3-8B-Instruct: meta-llama-3-8b
      - meta-llama/Meta-Llama-3.1-8B-Instruct: meta-llama-3.1-8b
      - mistralai/Mistral-7B-Instruct-v0.2: mistral-7b


  - provider: openai
    name: deepseek
    base_url: https://api.deepseek.com/v1
    api_key: Please enter your api_key
    model:
      - deepseek-chat
      - deepseek-coder


  - provider: vertexai_claude
    name: vertexai_claude
    PROJECT_ID: Please enter
    CLIENT_ID: Please enter
    CLIENT_SECRET: Please fill in
REFRESH_TOKEN: Please fill in
model:
- claude-3-5-sonnet@20240620
- claude-3-5-sonnet@20240620: claude-3-5-sonnet

  - provider: vertexai_gemini
    name: vertexai_gemini
    PROJECT_ID: Please fill in
    CLIENT_ID: Please fill in
    CLIENT_SECRET: Please fill in
    REFRESH_TOKEN: Please fill in
    model:
      - gemini-1.5-flash-001


  - provider: cohere
    name: cohere
    base_url: https://api.cohere.com/v1
    api_key: Please enter
    model:
      - command-r-plus-08-2024
      - command-r-plus-04-2024: gpt-4
      - command-r-plus
      - command-nightly
      - command-light
      - command-light-nightly

  provider: cloudflare
    name: cloudflare
    api_key: Please enter
    account_id: Please enter
    model:
      - ‘@cf/qwen/qwen1.5-14b-chat-awq’: qwen1.5-14b
      - ‘@hf/thebloke/deepseek-coder-6.7b-instruct-awq’


  - provider: openai
    name: openrouter
    base_url: https://openrouter.ai/api/v1
    api_key: s Please fill in
    model:
      - mattshumer/reflection-70b:free: reflection-70b
      - nousresearch/hermes-3-llama-3.1-405b:free: llama-3.1-405b



tokens:
  - api_key: sk-111111
    model:
      - glm* # wildcard *
      - all # all means all

  - api_key: sk-222222
    model:
      - gpt-3.5-turbo

server:
    default_model: glm-4-flash # If no match is found, this default model is used
    debug: false
    admin_server: false # Whether to enable the background function. If not enabled, only forwarding is performed without any logging
    db_cache: false # Return the last successful response if the content is the same
    save_log_file: false
    db_path: sqlite:///./data/request_log.db
    username: admin # Background user name
    password: admin # Background password
    jwt_secret_key: admin # Fill in whatever you like, it's random

VertexAI parameter acquisition tutorial

Configure load balancing

Models with the same model name can be load balanced

The default weight is 1

  - provider: gemini
    name: Gemini1
    base_url: https://generativelanguage.googleapis.com/v1beta
    api_key: Please fill in
    model:
      - gemini-1.5-pro
      - gemini-1.5-flash
      - gemini-1.5-flash : gemini-flash
    balance: # Load balancing
      - gemini-1.5-pro: 1 #indicates that the model weight under this name is 1
      - gemini-1.5-flash: 1 #indicates that the model weight under this name is 2
      - gemini-flash: 1 #indicates that the model weight under this name is 2

  - provider: gemini
    name: Gemini2
    base_url: https://generativelanguage.googleapis.com/v1beta
    api_key: Please fill in
    model:
      - gemini-1.5-pro
      - gemini-1.5-flash
      - gemini-1.5-flash : gemini-flash
    balance: # Load balancing
      - gemini-1.5-pro: 2 # Indicates that the model weight under this name is 1
      - gemini-1.5-flash: 2 # indicates that the model weight under this name is 2
      - gemini-flash: 3 # indicates that the model weight under this name is 2

Explanation of the above configuration:

For example:

Current weight information

Gemini1‘s gemini-1.5-flash weight 1
Gemini2’s gemini-1.5-flash weight 2

When requesting gemini-1.5-flash

1st time gemini-1.5-flash for Gemini1
2nd time gemini-1.5-flash for Gemini2
3rd time gemini-1.5-flash for Gemini2
4th time gemini-1.5-flash for Gemini1
5th time gemini-1.5-flash for Gemini2

vercel deployment

Click the button below to deploy to Vercel with one click:

The following environment variables need to be set during deployment:

config_url: URL of the remote configuration file
secret_key: key used for encryption (can be left blank if encryption is not required) If encryption is required, use aes-128-ecb encryption. If not encrypted, give the plaintext configuration content

After deployment, access the domain name assigned by Vercel to use the API.

Note: Please ensure that your remote configuration file (config_url) can be accessed publicly, otherwise Vercel will not be able to retrieve the configuration information.

Here I give the simplest configuration content

config_url=https://The address where you can access the download configuration/api.yaml

providers:
  - provider: openai
    name: deepseek
    base_url: https://api.deepseek.com/v1
    api_key: sk-xxxxxxxxxxxxx
    model:
      - deepseek-chat
      - deepseek-coder

tokens:
  - api_key: sk-123456
    model:
      - all

server:
    default_model: deepseek-chat

Docker local deployment

Start the container

Start using the local api.yaml configuration file

docker run -d \
  --name pro-api \
  -p 8001:8000 \
  -v $(pwd)/api.yaml:/app/api.yaml \
  -v $(pwd)/data:/app/data:rw \
  duolabmeng/pro-api:latest

Start with the remote api.yaml configuration file

docker run -d \
  --name pro-api \
  -e config_url=http://你的服务器/api.yaml \
  -e secret_key=123456789 \
  -p 8001:8000 \
  -v $(pwd)/api.yaml:/app/api.yaml
-v $(pwd)/data:/app/data:rw
duolabmeng/pro-api:latest

config_url automatically downloads the remote configuration file secret_key encrypted with aes, ECB, 128 bits, if you want to be safe remember to enable the aes password, if you don't fill it in, you will get the plaintext configuration

If you want to use Docker Compose

services:
  pro-api:
    container_name: pro-api
    image: duolabmeng/pro-api:latest
    environment:
      - config_url=http://file_url/api.yaml
      - secret_key=123456789
    ports:
      - 8001:8000
    volumes:
      - ./api.yaml:/app/api.yaml
      - ./data/:/app/data:rw

For example, if you are not in a position to modify the configuration file on a certain platform, you can upload the configuration file to a hosting service, which can provide a direct link for pro-api to download. config_url is this direct link. If you don't want to restart the container to update the configuration, you can just refresh the configuration by accessing /reload_config.

Restart the Docker image with one click

set -eu
docker pull duolabmeng/pro-api:latest
docker rm -f pro-api
docker run --user root -p 8001:8000 -dit --name pro-api
-v ./api.yaml:/app/api.yaml
duolabmeng/pro-api:latest
docker logs -f pro-api

RESTful curl test

curl -X POST http://127.0.0.1:8000/v1/chat/completions \
-H ‘Content-Type: application/json’ \
-H ‘Authorization: Bearer ${API}’ \
-d ‘{’model‘: “gpt-4o”,’messages‘: [{’role‘: “user”, “content”: “Hello”}],’stream‘: true}’

Help

If you cannot install dependencies on some cloud platforms, you can directly install the dependencies in the running directory and then start

pip install -r requirements.txt --no-user -t ./app

Name		Name	Last commit message	Last commit date
Latest commit History 149 Commits
.github		.github
app		app
assets		assets
docs		docs
.gitignore		.gitignore
Dockerfile		Dockerfile
LICENSE		LICENSE
Pipfile		Pipfile
README_CN.md		README_CN.md
docker-compose.yml		docker-compose.yml
readme.md		readme.md
requirements.txt		requirements.txt
run.sh		run.sh
update.sh		update.sh
vercel.json		vercel.json

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

pro-api

Introduction

Supported backend services

Background functions

Configuration

Configure load balancing

vercel deployment

Docker local deployment

Help

Star History

About

Releases

Packages

Languages

License

duolabmeng6/pro-api

Folders and files

Latest commit

History

Repository files navigation

pro-api

Introduction

Supported backend services

Background functions

Configuration

Configure load balancing

vercel deployment

Docker local deployment

Help

Star History

About

Resources

License

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages