To run the server, start it either through Poetry:

```shell
poetry run start
```

or with uvicorn directly:

```shell
poetry run uvicorn app.app:app --host 0.0.0.0 --port=8000 --reload
```
The API is intentionally simple. It supports text generation, embedding, JSON extraction, and fine-tuning, and it can load different model architectures. Because the server is built with FastAPI, the OpenAPI documentation is available under `/docs`.
Parameters

```
{
    "prompt": "string",      # The prompt to run
    "token_count": 100,      # (Optional) Maximum number of tokens to generate
    "temperature": 0,        # (Optional) Sampling temperature
    "verbose": false,        # (Optional) Log additional debug information
    "stream": false          # (Optional) Stream tokens back as they are generated
}
```
Example

```shell
curl -XPOST http://localhost:8000/api/text_generation -H 'content-type: application/json' -d '{ "prompt": "123"}'
```
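The same request can be issued from Python using only the standard library. This is a minimal sketch that assumes the server is running on `localhost:8000` and returns a JSON body (the exact response schema is not documented here); `build_payload` is a hypothetical helper that keeps unset optional parameters out of the request body:

```python
import json
import urllib.request

# Assumed local server address, matching the curl example above.
API_URL = "http://localhost:8000/api/text_generation"

def build_payload(prompt, token_count=None, temperature=None,
                  verbose=None, stream=None):
    """Build the request body, including only the optional fields that were set."""
    payload = {"prompt": prompt}
    optional = {"token_count": token_count, "temperature": temperature,
                "verbose": verbose, "stream": stream}
    payload.update({k: v for k, v in optional.items() if v is not None})
    return payload

def generate(prompt, **options):
    """POST the prompt to the text generation endpoint and return the parsed JSON."""
    body = json.dumps(build_payload(prompt, **options)).encode("utf-8")
    req = urllib.request.Request(
        API_URL, data=body, headers={"content-type": "application/json"})
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())
```

`generate("123")` mirrors the curl example; options such as `generate("123", token_count=100, temperature=0)` add the optional fields.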
Parameters

```
{
    "prompts": [
        "string"             # A list of text strings to embed
    ]
}
```
Example

```shell
curl -XPOST http://localhost:8000/api/embedding -H 'content-type: application/json' -d '{ "prompts": ["123"]}'
```
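Assuming the endpoint returns one vector per input prompt (an assumption — the response schema is not shown here), the returned embeddings can be compared with cosine similarity. A minimal sketch:

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two embedding vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)
```

Embedding several prompts in one request and comparing the vectors pairwise is a common way to use this endpoint for semantic search or deduplication.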
Parameters

```
{
    "examples": [            # Example texts that the model should be trained on
        "string"
    ],
    "steps": 100,            # Number of training steps
    "base_model": "string",  # LLM base model to fine-tune
    "name": "string"         # Name of the new fine-tuned model
}
```
Example

```shell
curl -XPOST http://localhost:8000/api/finetuning/sft -H 'content-type: application/json' -d '{ "examples": ["123"], "steps": 10, "base_model": "llama2", "name": "finetuned_llama2"}'
```
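Because a fine-tuning run is comparatively expensive, it can help to validate the request body client-side before POSTing it. A minimal sketch; the constraints below are assumptions derived from the parameter list above, not rules enforced by the API:

```python
def validate_sft_request(body):
    """Check an SFT request dict against the documented fields.

    Raises ValueError on the first problem found; returns the body unchanged
    if it looks well-formed.
    """
    if not body.get("examples"):
        raise ValueError("'examples' must be a non-empty list of training texts")
    if not all(isinstance(e, str) for e in body["examples"]):
        raise ValueError("every entry in 'examples' must be a string")
    steps = body.get("steps")
    if not isinstance(steps, int) or steps <= 0:
        raise ValueError("'steps' must be a positive integer")
    for field in ("base_model", "name"):
        if not body.get(field):
            raise ValueError(f"'{field}' must be a non-empty string")
    return body
```

Running this before the HTTP call catches malformed requests (empty example lists, missing model names) without spending server time on a doomed training job.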
- Supports loading models into GPU memory and offloading them when switching models
- Supports loading and caching models from S3
- Supports the most popular open-source models and runtimes, such as llama2, Mistral, vLLM + Llama, and Ollama
- Supports SFT through a simple API and stores the resulting adapter in S3
- Add generic Huggingface transformer interface
- Add more finetuning strategies
- Support Azure
- Support GCP
```shell
# Make sure you have Poetry and the required libraries installed
poetry install
pip3 install flash-attn==2.3.1.post1 --no-build-isolation
pip3 install "transformers[torch]"
```
Create an issue or discussion in this repository.
Or, reach out to our team! @jakob_frick, @__anjor, @maxnajork on X or team@radiantai.com.
Thank you for your interest in contributing to our project! Before you begin writing code, please read these contributing guidelines. Following them will make the contribution process easier and more efficient for everyone involved.