A forecasting benchmark for LLMs. Leaderboards and datasets available at https://www.forecastbench.org.
git clone --recurse-submodules <repo-url>.git
cd llm-benchmark
cp variables.example.mk variables.mk
and set the values accordingly.
- Set up your Python virtual environment
make setup-python-env
source .venv/bin/activate
cd directory/containing/cloud/function
eval $(cat path/to/variables.mk | xargs) python main.py
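The eval line above turns the contents of variables.mk into a one-off environment prefix for the Python process. A minimal sketch of that mechanism follows. It assumes the file holds plain KEY=value lines with no spaces or quotes in the values. The temporary file and the names DEMO_PROJECT and DEMO_BUCKET are hypothetical stand-ins, not variables from this repo, and printenv takes the place of python main.py.

```shell
# Throwaway stand-in for variables.mk; the variable names are hypothetical.
tmpfile=$(mktemp)
printf 'DEMO_PROJECT=my-project\nDEMO_BUCKET=my-bucket\n' > "$tmpfile"

# xargs joins the lines into "DEMO_PROJECT=my-project DEMO_BUCKET=my-bucket".
# eval then runs that string as an environment prefix for one command only.
result=$(eval $(cat "$tmpfile" | xargs) printenv DEMO_PROJECT)
echo "$result"

# The variables were set only for that command, not for the current shell.
echo "${DEMO_PROJECT:-unset}"

rm -f "$tmpfile"
```

Because eval re-parses its arguments, this pattern is best kept to simple commands such as python main.py. Quoted arguments after the prefix would lose their quoting.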
Before creating a pull request:
- run
make lint
and fix any errors and warnings
- ensure code has been deployed to Google Cloud Platform and tested (this applies only to our devs; for everyone else, we're happy you're contributing and we'll test this on our end).
- fork the repo
- reference the issue number (if one exists) in the commit message
- push to the fork on a branch other than main
- create a pull request
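The git side of the steps above can be sketched as follows. This is an illustration only: a local bare repository stands in for your fork so the commands run anywhere, and the branch name, file name, commit message, and issue number (#123) are all hypothetical.

```shell
# Sketch only: a local bare repository stands in for your fork.
workdir=$(mktemp -d)
git init --quiet --bare "$workdir/fork.git"
git clone --quiet "$workdir/fork.git" "$workdir/clone" 2>/dev/null
cd "$workdir/clone"

# Work on a branch other than main (hypothetical branch name).
git checkout --quiet -b fix-question-parsing

# Reference the issue number in the commit message (#123 is made up).
echo "example change" > example.txt
git add example.txt
git -c user.name="Example Dev" -c user.email="dev@example.com" \
    commit --quiet -m "Fix question parsing (#123)"

# Push the branch to the fork; the pull request is then opened from it.
git push --quiet origin fix-question-parsing
pushed=$(git ls-remote --heads origin fix-question-parsing | wc -l | tr -d ' ')
echo "$pushed"
```

With a real fork, origin would point at your fork's URL, and the pull request would be created from the pushed branch in the hosting site's web interface.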