
Release 2.5.0

@Sid-Data-Universe Sid-Data-Universe released this 19 Nov 03:22
· 46 commits to main since this release
7f23ad9

This release adds a fourth evaluation task (IfEval) to the current competition, starting at block 4,344,030. From that block on, the task weights will be 85% MMLU, 5% Word Sorting, 5% Fineweb, and 5% IfEval.
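As a rough illustration of how the four weights combine, the sketch below computes a weighted sum of per-task scores. The key names and the `combined_score` helper are hypothetical and not taken from the subnet code; the actual validator may compare models per task or normalize scores differently before applying weights.

```python
# Task weights as stated in these release notes.
# The dictionary keys are illustrative names, not identifiers from the subnet code.
TASK_WEIGHTS = {
    "mmlu": 0.85,
    "word_sorting": 0.05,
    "fineweb": 0.05,
    "if_eval": 0.05,
}


def combined_score(task_scores: dict[str, float]) -> float:
    """Return the weighted sum of per-task scores (hypothetical aggregation)."""
    missing = TASK_WEIGHTS.keys() - task_scores.keys()
    if missing:
        raise ValueError(f"missing task scores: {sorted(missing)}")
    return sum(TASK_WEIGHTS[name] * task_scores[name] for name in TASK_WEIGHTS)
```

With these weights, a model that wins only on MMLU still captures 85% of the combined score, so the three smaller tasks mainly act as tie-breakers between models with similar MMLU results.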

Subnet

  • Added new IfEval (Instruction Following) evaluation task.
  • This evaluation scores models on how well they follow generated rules about their response. Initially, these include rules about casing, comma usage, word count, and sentence count.
  • Includes a check that models are generating reasonable output, meaning they do not return the same response for the same rules when asked different questions.
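The rule types listed above can be sketched as simple predicates over a response string. This is a minimal illustration only: the function names, the specific rule variants (all lowercase, no commas, maximum words, minimum sentences), and the scoring by fraction of rules passed are assumptions, not the subnet's actual implementation.

```python
import re
from typing import Callable


def is_all_lowercase(response: str) -> bool:
    """Casing rule: the response contains no uppercase letters."""
    return response == response.lower()


def has_no_commas(response: str) -> bool:
    """Comma rule: the response contains no commas."""
    return "," not in response


def word_count_at_most(response: str, limit: int) -> bool:
    """Word count rule: whitespace-separated words do not exceed the limit."""
    return len(response.split()) <= limit


def sentence_count_at_least(response: str, minimum: int) -> bool:
    """Sentence count rule, using a naive split on ., ! and ?."""
    sentences = [s for s in re.split(r"[.!?]+", response) if s.strip()]
    return len(sentences) >= minimum


def rule_score(response: str, rules: list[Callable[[str], bool]]) -> float:
    """Fraction of rules the response satisfies (hypothetical scoring)."""
    if not rules:
        return 0.0
    return sum(rule(response) for rule in rules) / len(rules)


def looks_canned(responses: list[str]) -> bool:
    """True if any two responses to different questions are identical."""
    normalized = [r.strip() for r in responses]
    return len(normalized) > 1 and len(set(normalized)) < len(normalized)
```

A canned answer such as a fixed lowercase, comma-free sentence would pass every rule check for any question, which is why a duplicate check like `looks_canned` is needed alongside the rule score.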

Validators

  • The expected time per evaluation cycle has increased due to the new evaluation task.

  • TTLs have been adjusted, and each model is required to complete all evaluation tasks within 12 minutes.

  • Alpha has also been adjusted. Models should first receive weight after 2 cycles (~360 blocks) and will receive all weight after 17 cycles (~3060 blocks) of consecutive wins.

  • Output width is set explicitly to improve readability of pm2 rich tables in logging. Thanks coldint!
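The cycle and block figures in the alpha note imply roughly 180 blocks per evaluation cycle (360 / 2 and 3060 / 17 both give 180). The sketch below reproduces that arithmetic. The 12-second block time used for the wall-clock estimate is an assumption and is not stated in these notes, and the helper names are hypothetical.

```python
# Implied by the release notes: 2 cycles ~ 360 blocks, 17 cycles ~ 3060 blocks.
BLOCKS_PER_CYCLE = 180

# Assumption (not stated in the release notes): one block takes about 12 seconds.
ASSUMED_SECONDS_PER_BLOCK = 12


def blocks_for_cycles(cycles: int) -> int:
    """Approximate number of blocks spanned by the given number of cycles."""
    return cycles * BLOCKS_PER_CYCLE


def approx_hours(blocks: int) -> float:
    """Approximate wall-clock hours for a block count under the assumed block time."""
    return blocks * ASSUMED_SECONDS_PER_BLOCK / 3600


first_weight_blocks = blocks_for_cycles(2)   # 360 blocks
full_weight_blocks = blocks_for_cycles(17)   # 3060 blocks
```

Under the assumed block time, a new winning model would first receive weight after a little over an hour and all weight after roughly ten hours of consecutive wins.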

Miners

This release requires re-running pip install -e . to pick up the latest dependencies.