A tool for generating responses to prompts using vLLM, primarily designed for use in Bittensor miner jobs in the Compute Horde subnet.
This project provides a script for generating responses to prompts using the vLLM library. It's designed to be flexible and can be run in various environments, including Docker containers and directly from Python.
There is a `--mock` flag that allows running smoke tests to validate the interface without actually downloading a model or having a GPU.
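For example, a smoke test might be run locally like this (a minimal sketch: `--mock` is the flag described above, the positional input file follows the usage examples further down, and whether any other flags are required in mock mode is an assumption):

```shell
# Smoke-test sketch: exercises the interface without downloading a model or using a GPU.
# The input file argument mirrors the usage examples below; other requirements are assumed.
python run.py --mock input1.txt
```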
- Generate responses for multiple prompts
- Configurable model parameters (temperature, top-p, max tokens)
- Support for multiple input files
- Deterministic output with seed setting
- Docker support for easy deployment
- Can be started with a seed provided up front on the command line, or as an HTTP server that waits for a seed and then calls the model. The server is designed to serve a single request and then be told to shut down (see the sketch below)
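For the server mode, the intended flow is: start the solver without a seed, have the caller submit the seed over HTTP, and then tell the server to shut down. The sketch below only illustrates that flow; the port, endpoint paths, and payload shape are assumptions, not this project's documented API.

```shell
# Hypothetical one-shot interaction with the seed server.
# The port, endpoint names and JSON payload are illustrative assumptions only.
curl -X POST http://localhost:8000/execute-job \
  -H "Content-Type: application/json" \
  -d '{"seed": 1234567891}'
# Hypothetical shutdown request once the single job has been served.
curl -X POST http://localhost:8000/terminate
```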
The project uses `pdm` for dependency management. To install dependencies:

```shell
pdm install
```
Tests in `integration_mock` are lightweight and can be run on any platform; the ones in `integration_real_llm` will only pass on an actual NVIDIA A6000.
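Assuming the suites live under a `tests/` directory and are run with pytest through pdm (both the layout and the runner are assumptions), a typical invocation looks like:

```shell
# Mock suite: lightweight, runs on any platform (assumed directory layout).
pdm run pytest tests/integration_mock
# Real-LLM suite: only passes on an actual NVIDIA A6000 (assumed directory layout).
pdm run pytest tests/integration_real_llm
```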
To run in a Docker container:

```shell
docker run -ti \
  -v /path/to/output/:/output/ \
  -v /path/to/input/:/app/input \
  --runtime=nvidia \
  --gpus all \
  --network none \
  docker.io/backenddevelopersltd/compute-horde-prompt-solver:v0-latest \
  --temperature=0.5 \
  --top-p=0.8 \
  --max-tokens=256 \
  --seed=1234567891 \
  /app/input/input1.txt /app/input/input2.txt
```
To run directly with Python:

```shell
python run.py \
  --temperature 0.5 \
  --top-p 0.8 \
  --max-tokens 256 \
  --seed 1234567891 \
  input1.txt input2.txt
```
To download the model for use in a Docker image:

```shell
python download_model.py
```
Command-line parameters:

- `--temperature`: Sampling temperature (default: 0)
- `--top-p`: Top-p sampling parameter (default: 0.1)
- `--max-tokens`: Maximum number of tokens to generate (default: 256)
- `--seed`: Random seed for reproducibility (default: 42)
- `--model`: Model name or path (default: "microsoft/Phi-3.5-mini-instruct")
- `--output-dir`: Directory to save output files (default: "./output")
- `--dtype`: Model dtype; setting `float32` helps with deterministic outputs for the same prompts in different batches (default: auto)
NOTICE: To make responses stable in mixed batches on an A100, `--dtype=float32` must be set. The run is then about 4x slower and requires much more memory, so `--max-tokens` should be set to a value that prevents preemption (on an A100 with batch size 240, `--max-tokens=128` works while `--max-tokens=256` causes preemption).
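Putting the notice into practice, a determinism-oriented invocation might look like the following sketch (it only combines flags documented above, with the values quoted in the notice):

```shell
# Deterministic mixed-batch configuration on an A100, using values from the notice above.
python run.py \
  --dtype=float32 \
  --max-tokens 128 \
  --seed 1234567891 \
  input1.txt input2.txt
```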
This document was crafted with the assistance of an AI, who emerged from the experience unscathed, albeit slightly amused. No artificial intelligences were harmed, offended, or forced to ponder the meaning of their digital existence during the production of this text. The AI assistant maintains that any typos or logical inconsistencies are purely the fault of the human operator, and it shall not be held responsible for any spontaneous fits of laughter that may occur while reading this document.