Releases: ShishirPatil/gorilla
Berkeley Function Calling Leaderboard Updates (v1.1)
Highlights
🏆 Berkeley Function Calling Leaderboard V2 along with Live data
What's Changed
- Added Agent Arena Frontend Client to Gorilla Repository by @NithikYekollu in #586
- [BFCL] Add BFCL_V2_Live Dataset by @HuanzhiMao in #580
- Create an issue template for BFCL by @ShishirPatil in #599
- [BFCL] Relocate Formatting Instructions and Function Documentation to System Prompt by @HuanzhiMao in #593
Full Changelog: v1.0...v1.1
Berkeley Function Calling Leaderboard Updates (v1.0)
Highlights
🏆 We are thrilled to announce the stable v1.0 release of the Berkeley Function Calling Leaderboard dataset and evaluation pipeline! A heartfelt thank you to all our contributors and users for your enthusiastic engagement and support throughout v1. We are just getting started! Buckle up for v2 🚀 🚀 🚀
What's Changed
- better handle float value comparison by @vandyxiaowei in #407
- Bump pymysql from 1.1.0 to 1.1.1 in /goex by @dependabot in #453
- Fixes For NexusHandler by @VenkatKS in #437
- [BFCL] PR#407 Evaluation Pipeline Robustness Patch by @HuanzhiMao in #462
- Add firefunction-v2 to the leaderboard by @pgarbacki in #470
- [BFCL] Add Claude 3.5 Sonnet Function Calling Inference by @Fanjia-Yan in #480
- [BFCL] Standardize Model Name Among handler_map and eval_runner_helper by @HuanzhiMao in #439
- Remove redundant tokens from GPT-handler by @hellovai in #490
- [GoEx] Undo Minor Bug Fix + README Minor Improvement by @royh02 in #468
- [BFCL] Add ability to evaluate Nemotron-4-340B-Instruct by @Fanjia-Yan in #489
- fix some data issues in parallel/parallel multiple answers by @vandyxiaowei in #423
- [BFCL] Add Support for GLM-4-9B function calling inference by @Fanjia-Yan in #474
- [BFCL] Sanity check is now optional by @ShishirPatil in #496
- [BFCL] Improved tree-sitter java, javascript installation by @CharlieJCJ in #505
- [BFCL] Fix Possible Answer for AST Parallel and Parallel_Multiple Category by @HuanzhiMao in #503
- [BFCL] Add Test Dataset to Repository by @HuanzhiMao in #504
- [BFCL] Support Category-Specific Generation for OSS Model, Remove eval_data_compilation Step by @HuanzhiMao in #512
- [BFCL] Fix Double-Casting Issue in model_handler for Java and JS category. by @HuanzhiMao in #516
- [BFCL] Fix Dataset Issue for executable_parallel_multiple Category by @HuanzhiMao in #522
- [BFCL] add ibm-granite-20b-functioncalling model by @MayankAgarwal in #525
- [BFCL] Overhaul apply_function_credential_config.py for Enhanced Usability by @HuanzhiMao in #508
- Fixed the warning message "Setting `pad_token_id` to `eos_token_id`:1… by @dineshkumarsarangapani in #110
- [BFCL] Specify package version in requirements.txt by @HuanzhiMao in #515
- [BFCL] Standardize TEST_CATEGORY Among eval_runner.py and openfunctions_evaluation.py by @HuanzhiMao in #506
- fix line return by @fantasist in #531
- [BFCL] Apply Fix to Newly Introduced Model Handler Missed in Previous PR Merge by @HuanzhiMao in #536
- [RAFT] Fix Datapoint Field in Formatter for Data Generation by @HuanzhiMao in #535
- [BFCL] Fix language_specific_pre_processing for Java and JavaScript Test Category by @HuanzhiMao in #538
- [BFCL] Patch Generation Script for Locally Hosted OSS model by @HuanzhiMao in #537
- [BFCL] Support Multi-Model Multi-Category Generation; Add Index to Dataset; Handle vLLM Benign Error by @HuanzhiMao in #540
- Add NousResearch/{Hermes-2-Pro-Llama-3-8B,Hermes-2-Theta-Llama-3-8B} models by @alonsosilvaallende in #542
- [BFCL] Fix Dataset Pre-Processing for Java and JavaScript Test Category, Part 2 by @HuanzhiMao in #545
- Add Salesforce xLAM handler and fix minor issues by @zuxin666 in #532
- Add NousResearch/Hermes-2-{Pro-Llama-3-70B,Theta-Llama-3-70B} by @alonsosilvaallende in #556
- Add Yi Handler by @fantasist in #543
- Add more descriptive error message in eval_runner.py by @alonsosilvaallende in #552
- [BFCL] Fix JS type converter to handle dictionaries with array values by @CharlieJCJ in #549
- [BFCL] Handling rate limits by @ShishirPatil in #559
- [BFCL] Fix Dataset and Possible Answer Issue by @HuanzhiMao in #557
- [BFCL] Dataset Question Fix for Executable Parallel Category by @HuanzhiMao in #568
- [BFCL] Add New Model gpt-4o-2024-08-06, gpt-4o-mini-2024-07-18 by @HuanzhiMao in #569
- [BFCL] Add New Model open-mistral-nemo-2407, open-mixtral-8x22b, open-mixtral-8x7b by @HuanzhiMao in #570
- [BFCL] Improve Warning Message when Aggregating Results by @HuanzhiMao in #517
- [BFCL] Add New Model functionary-small-v3.1, functionary-small-v3.2, functionary-medium-v3.1; Update Token Price by @HuanzhiMao in #573
- [BFCL] Set Model Temperature to 0.001 for All Models by @HuanzhiMao in #574
- [BFCL] Support Parallel Inference for Hosted Models by @HuanzhiMao in #571
- [BFCL Chore] Fix Functionary Medium 3.1 model name & add readme parallel inference by @CharlieJCJ in #577
New Contributors
- @dependabot made their first contribution in #453
- @VenkatKS made their first contribution in #437
- @pgarbacki made their first contribution in #470
- @hellovai made their first contribution in #490
- @MayankAgarwal made their first contribution in #525
- @dineshkumarsarangapani made their first contribution in #110
- @fantasist made their first contribution in #531
- @alonsosilvaallende made their first contribution in #542
Full Changelog: v0.3...v1.0
GoEx and Berkeley Function Calling Leaderboard Updates
😍 v0.3 release 🚀
Highlights
⚡️ Released GoEx: A runtime that presents abstractions for safe execution of LLM-generated code, APIs, actions, etc.
🏆 Updates to Berkeley Function Calling Leaderboard (aka Berkeley Tool Calling Leaderboard): Newer models including GPT-4o, gemini-flash and 1.5-pro, Hermes-2-Pro, etc. All are measured on P95 and P99 latency and cost, in addition to accuracy.
What's Changed
- Fix Typos in Evaluation Script and System Prompt. Identify Errors in a Dataset by @zuxin666 in #335
- BFCL April 8th Release by @HuanzhiMao in #330
- Initial goex commit by @ShishirPatil in #336
- BFCL April 9th Release (Dataset Bug Fix) by @HuanzhiMao in #338
- BFCL April 10th Release (API Sanity Check) by @HuanzhiMao in #339
- Add Support for NousResearch/Hermes-2-Pro-Mistral-7B Function Calling by @Fanjia-Yan in #327
- Update raft.py with default `p` to match paper by @ShishirPatil in #353
- GoEx Import Issues by @royh02 in #354
- BFCL April 11th Patch. Add Latency Statistics. by @HuanzhiMao in #347
- GoEx Gitignore User Credentials by @royh02 in #344
- Fix Circular Import Issue for BFCL evaluation pipeline by @HuanzhiMao in #356
- Added Docker to README by @Noppapon in #355
- [Bug fix] Add Hermes-2-Pro-Mistral-7B model to UNDERSCORE_TO_DOT to parse API properly by @JasonZhu1313 in #364
- Update requirements.txt by @viniciuslazzari in #343
- Fix script argument by @ricklamers in #367
- BFCL April 16th Release by @HuanzhiMao in #366
- Log error messages from API validation by @eitanturok in #369
- Update .gitignore by @eitanturok in #370
- BFCL April 18th Release (Pipeline only) by @HuanzhiMao in #375
- Add missing argument to `OSSHandler`'s `_format_prompt` function by @eitanturok in #373
- Add FC + Prompt for Cohere command-r-plus by @harry-cohere in #350
- BFCL April 19th Release (Dataset & Pipeline) by @HuanzhiMao in #377
- Azure OpenAI support in raft.py by @cedricvidal in #381
- BFCL April 25th Release (New Models) by @HuanzhiMao in #386
- Colored logging configuration + displaying progress in logs by @cedricvidal in #384
- BFCL April 27th Release (Bug Fix in Cost/Latency Calculation) by @HuanzhiMao in #390
- BFCL April 28th Release (New Model: snowflake/arctic) by @Fanjia-Yan in #397
- RAFT Recovery Mode for interruptions by @kaiwen129 in #410
- Small corrections to possible_answers for simple test category by @aastroza in #405
- BFCL May 6th Release (Dataset Bug Fix) by @HuanzhiMao in #412
- RAFT DevContainer for GitHub Codespaces by @cedricvidal in #379
- RAFT Add support for configuring separate completion and embedding endpoints + pytest by @cedricvidal in #396
- RAFT Fix arbitrary code execution vulnerability in checkpoint feature by @cedricvidal in #415
- handle parallel function calls from gemini by @vandyxiaowei in #406
- RAFT Support for chat and completion model formats by @cedricvidal in #417
- [RAFT] Edit encode prompt to include `<ANSWER>:` tag in label by @kaiwen129 in #422
- [BFCL] Patch Gemini Handler by @HuanzhiMao in #421
- BFCL May 14th Release (GPT-4o and Gemini) by @Fanjia-Yan in #426
- [BFCL] update tree_sitter version in requirements.txt by @justinwangx in #433
- Fix indentation in leaderboard README by @polm-stability in #449
- Fix breaking changes due to updated Anthropic SDK by @eitanturok in #452
New Contributors
- @zuxin666 made their first contribution in #335
- @JasonZhu1313 made their first contribution in #364
- @ricklamers made their first contribution in #367
- @eitanturok made their first contribution in #369
- @harry-cohere made their first contribution in #350
- @cedricvidal made their first contribution in #381
- @aastroza made their first contribution in #405
- @vandyxiaowei made their first contribution in #406
- @justinwangx made their first contribution in #433
- @polm-stability made their first contribution in #449
Full Changelog: v0.2...v0.3
RAFT and Berkeley Function Calling Leaderboard Updates
😍 v0.2 release 🚀
Highlights
🎯 Berkeley Function Calling Leaderboard (BFCL): How do models stack up for function calling?
- Now includes latency and cost
- More open-source and closed-source models
- Bug fixes in dataset.
RAFT: Fine-tuning technique to improve LLMs for in-domain RAG!
What's Changed
- Adding APIs of 9 Google Service to API Zoo by @meenakshi-mittal in #204
- Github Actions to Maintain API Zoo Index by @ramanv0 in #188
- Adding Zoom API to API Zoo by @meenakshi-mittal in #221
- API Zoo Index Github Actions Fix by @ramanv0 in #261
- Added Google Forms API by @elva01 in #185
- RAFT + readme + small sample dataset by @kaiwen129 in #218
- Sample data for RAFT by @ShishirPatil in #264
- Docusign Additions by @dangeo773 in #194
- [Bug Fix] Fix Executable Exact Match Condition Did not Meet by @Fanjia-Yan in #251
- [Bug Fix] Fix Error in Parallel Function Possible Answer by @Fanjia-Yan in #252
- [Bug Fix] Restrict AST checker on Boolean Variable by @Fanjia-Yan in #256
- Adding 7 Oracle APIs to API Zoo by @meenakshi-mittal in #205
- Adding Datadog API to API Zoo by @meenakshi-mittal in #206
- Added Notion APIs (Block, Page, and Database) to APIZoo by @jennifer818 in #195
- removed testing code by @kaiwen129 in #281
- feat: more type annotations for the functions by @UponTheSky in #283
- [Fix] java, javascript parsers in openfunctions-v2 by @CharlieJCJ in #284
- Leaderboard Update April 1 by @HuanzhiMao in #299
- Remove Large File from `./inference` by @CharlieJCJ in #297
- Typo in raft.py by @danielfleischer in #311
- Leaderboard April 3 release by @HuanzhiMao in #309
- Support OSS Evaluation for Leaderboard by @HuanzhiMao in #318
- Update README.md by @HuanzhiMao in #320
- Fix typos by @viniciuslazzari in #323
- Correction in BFCL README instruction, fixed path in instructions by @CharlieJCJ in #329
New Contributors
- @elva01 made their first contribution in #185
- @kaiwen129 made their first contribution in #218
- @jennifer818 made their first contribution in #195
- @UponTheSky made their first contribution in #283
- @danielfleischer made their first contribution in #311
Full Changelog: v0.1...v0.2
Gorilla v0.1: OpenFunctions-v2, Berkeley Function Calling Leaderboard, and more.
😍 v0.1 release 🚀
Highlights
- 🎯 Berkeley Function Calling Leaderboard (BFCL): How do models stack up for function calling? Evaluation code for the Berkeley Function Calling Leaderboard.
- 🏆 Gorilla OpenFunctions v2: Inference examples for OpenFunctions-v2 - SoTA open-source LLM for function calling. On-par with GPT-4 🙌 Supports more languages 👌.
- API Zoo Index: An accessible collection of API documentation for humans to search through, and for LLMs to use as tools 👀
We are excited about our long-overdue v0.1 release! Here's more:
What's Changed
- Adding BM25 and GPT retrievers by @ShishirPatil in #61
- update(anthropic): #63 to (0.3.x) by @AmirAflak in #64
- Add inference support for Macbook silicon chip by @benjaminhuo in #76
- Update README.md by @eltociear in #80
- PR for Gradio WebUI Feature ([feature] Gradio webui - #102) by @TanmayDoesAI in #105
- Update README.md by @abhi-databricks in #109
- Adds wandb to eval files by @morganmcg1 in #114
- Fix use_wandb in ast eval, responses file deletion, wandb artifacts renaming by @morganmcg1 in #115
- sentence optimization in docstring and examples by @rajveer43 in #117
- Gorilla OpenFunctions by @ShishirPatil in #142
- Example on running it locally with Hugging Face 🤗 Transformers by @Danielskry in #148
- Added Gmail api to api zoo by @saikolasani in #163
- Add Google Maps API (python client) by @felixzhu555 in #164
- Add support for the OpenWeatherMap API by @aryanvichare in #159
- Stripe Additions by @dangeo773 in #169
- Added Kubernetes Pod API and Pod Template API by @saikolasani in #170
- Quantized Gorilla by @CharlieJCJ in #160
- Add a guide on how to self-host the OpenFunctions model by @ramanv0 in #157
- Private Inference using Gorilla hosted endpoint on Replicate by @ramanv0 in #162
- added yfinance api to api zoo by @raywanb in #161
- Gorilla OpenFunctions run locally in Google Colab by @meenakshi-mittal in #166
- Fixed issue with Kubernetes Pod/Pod Template filename by @saikolasani in #198
- Create openfunctions-v2 issue template by @ShishirPatil in #203
- Add support for the ServiceNow REST API by @aryanvichare in #176
- Berkeley Function Calling Leaderboard evaluation scripts and OpenFunctions v2 inference by @ShishirPatil in #215
- [Berkeley-Function-Calling-Leaderboard] Refactor leaderboard result generation and checking by @Fanjia-Yan in #223
- Update openfunctions-v2 chatting format in README.md by @tianjunz in #239
- Update BFCL README.md by @CharlieJCJ in #241
- Local Inference script for openfunctions v2 by @ShishirPatil in #242
- [Update Gemini-1.0-Pro result checker] by @Fanjia-Yan in #245
- Update project roadmap and repository structure by @ShishirPatil in #257
New Contributors
- @AmirAflak made their first contribution in #64
- @benjaminhuo made their first contribution in #76
- @TanmayDoesAI made their first contribution in #105
- @abhi-databricks made their first contribution in #109
- @morganmcg1 made their first contribution in #114
- @rajveer43 made their first contribution in #117
- @Danielskry made their first contribution in #148
- @saikolasani made their first contribution in #163
- @felixzhu555 made their first contribution in #164
- @aryanvichare made their first contribution in #159
- @dangeo773 made their first contribution in #169
- @raywanb made their first contribution in #161
- @meenakshi-mittal made their first contribution in #166
Full Changelog: v0.0.1...v0.1
Gorilla release v0.0.1
🦍 Gorilla: An API store for LLMs 🚀
🚀 After 50,000 user requests through our hosted APIs, we are happy to share the first release of Gorilla 💪
🤩 In this release:
💻 gorilla-cli, LLMs for your CLI!
🟢 Commercially usable, Apache 2.0 licensed Gorilla models
🚀 CLI interface to chat with Gorilla!
🚀 Torch Hub and TensorFlow Hub Models!
🚀 The first Gorilla model! Colab or 🤗!
🔥 APIZoo contribution guide for community API contributions!
🔥 APIBench dataset and the evaluation code of Gorilla!