Releases: ianarawjo/ChainForge
v0.1.4: Failure Progress, Inspect popup, Firefox support
This release includes the following features:
Selective Failure on API requests ♨️
ChainForge now has selective failure on PromptNodes: API calls that fail no longer stop the remaining requests, but instead collect as red error bars within the progress bars:
progress-errors.mov
An error message will display all errors once all API requests return (whether successfully or with errors). This saves $$ and time. (As always, ChainForge caches responses the moment it receives them, so re-running a prompt node will not re-call APIs for responses it already has.)
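The behavior described above can be sketched in a few lines of Python. This is only an illustration of the general pattern, not ChainForge's actual implementation: `query_all`, `call_llm`, and `flaky_llm` are hypothetical names, and the cache here is a plain dictionary keyed by prompt.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def query_all(prompts, call_llm, cache=None):
    """Send every request; collect failures instead of aborting on the first one.

    `call_llm` is a hypothetical stand-in for whatever function calls the API.
    """
    cache = {} if cache is None else cache
    errors = {}
    # Prompts that already have a cached response are not sent again.
    pending = [p for p in prompts if p not in cache]
    with ThreadPoolExecutor(max_workers=4) as pool:
        futures = {pool.submit(call_llm, p): p for p in pending}
        for fut in as_completed(futures):
            prompt = futures[fut]
            try:
                # Store each response as soon as it arrives.
                cache[prompt] = fut.result()
            except Exception as exc:
                # A failed call is recorded; the other requests keep running.
                errors[prompt] = str(exc)
    return cache, errors


def flaky_llm(prompt):
    """Fake API call that fails for some prompts (for demonstration only)."""
    if "fail" in prompt:
        raise RuntimeError("simulated API error")
    return prompt.upper()


responses, errors = query_all(["hello", "please fail", "world"], flaky_llm)
```

Passing the returned `responses` dictionary back in as `cache` on a later run means only the prompts that failed (or are new) are sent again.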
Inspector Pop-up 🔍
In addition, we've added an Inspector pop-up which you can access by clicking the response preview box on a PromptNode:
popup-inspector.mov
This makes it much easier to inspect responses without needing to attach a dedicated Inspect Node. We're going to build this out (and add it to the EvaluatorNode) soon, but for now I hope you find this feature useful.
LLM Color Consistency 🌈
Now, each LLM you create has a dedicated color that remains consistent across VisNode plots and Inspector responses.
Firefox Support 🦊
Due to demand for more browsers, we've added support for Firefox. This involved a minor change to how model settings forms work.
Other browsers should now work too (though the formatting isn't exactly right), since we removed a dependency on regex lookaheads/lookbehinds, which was preventing some browsers like Safari from loading the app at all.
Website
As an aside, we've created a website at chainforge.ai. It's not much yet, but it's a start. We will add tutorials in the near future for new users.
Upcoming features
Major priorities right now are:
- Tabular data nodes: Load tabular data and reference columns in EvaluatorNode code
- Ground truth example flows: An example flow that evaluates responses against a 'ground truth' which differs per prompt parameter value
- Azure support: Yes, we heard you! :) I am hoping to get this very soon.
v0.1.3.1: Example Flow Pane
This release makes it very easy to import example flows:
example-flows.mov
Other additions:
- Added a "Compare System Prompts" example, with one-shot versus zero-shot versus "threaten a fictional kitten" examples. (See Twitter thread that inspired this use case: https://twitter.com/ShriramKMurthi/status/1664978520131477505?s=20 )
- Made it possible to switch between GPT3.5 and GPT4 after initially adding one of them (removed separation between these types of models)
- Improved the 'add node' UI to look sleeker
v0.1.3: Model Settings (+ more models)
Proud to announce we now have model settings in ChainForge. 🥳
You can now compare across different versions of the same model, in addition to nicknaming models and choosing more specific models.
To install, do pip install chainforge --upgrade. Full changelog below.
More supported models 🤖
Along with model settings, we now have support for all OpenAI, Anthropic, Google PaLM (chat and text), and Dalai-hosted models. For instance, you can now compare Llama.65B to PaLM text completions, if you were so inclined. For the full list, see models.py.
Here is a comparison of Google PaLM's text-bison and chat-bison on the same prompt:
Customizable model settings (and emojis! 😑)
Once you add a model to a PromptNode, you can tap the 'settings' icon on the PromptNode to bring up a form with all settings for that base model. You can adjust the exact model used (for instance, text-bison-001 in PaLM, or Dalai-hosted llama.30B):
Temperature appears next to model names by default. For ease of reference, temperature is displayed on a sliding color scale from cyan #00ffff (coldest) to violet #ff00ff (lukewarm) to red #ff0000 (hottest). The percentage along the scale respects the min and max temperature settings of each individual model.
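One way such a scale could be computed is sketched below. It assumes a piecewise-linear blend in RGB between the three colors named above, with violet at the midpoint of the range; the function name `temperature_color` and its default range are hypothetical, and the real ChainForge code may compute the color differently.

```python
def temperature_color(temp, min_temp=0.0, max_temp=1.0):
    """Map a temperature to a hex color on a cyan -> violet -> red scale.

    Assumption: linear interpolation in RGB, violet at the halfway point.
    """
    if max_temp <= min_temp:
        raise ValueError("max_temp must be greater than min_temp")
    # Fraction of the way through this model's own temperature range, clamped to [0, 1].
    frac = (temp - min_temp) / (max_temp - min_temp)
    frac = min(1.0, max(0.0, frac))
    if frac <= 0.5:
        # Cyan (0, 255, 255) to violet (255, 0, 255): red rises, green falls.
        t = frac / 0.5
        r, g, b = 255 * t, 255 * (1 - t), 255
    else:
        # Violet (255, 0, 255) to red (255, 0, 0): blue falls.
        t = (frac - 0.5) / 0.5
        r, g, b = 255, 0, 255 * (1 - t)
    return "#{:02x}{:02x}{:02x}".format(round(r), round(g), round(b))


print(temperature_color(0.0))                  # #00ffff
print(temperature_color(1.0, max_temp=2.0))    # #ff00ff
print(temperature_color(1.0))                  # #ff0000
```

Because the fraction is taken relative to `min_temp` and `max_temp`, a temperature of 1.0 is "hottest" for a model whose range is 0 to 1 but only "lukewarm" for a model whose range is 0 to 2.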
You can now also nickname models in PromptNodes. Names must be unique. Each nickname will appear elsewhere in ChainForge (e.g. in plots). You can also set the emoji used. For instance, here is a comparison between two ChatGPT models at different temperatures, which I've renamed hotgpt and coldgpt with the emojis 🔥 and 🥶:
Note about importing previous flows
Unfortunately, this code rewrite involved a breaking change for how flows are imported and exported (.cforge file format). You may still be able to import old flows, but you need to re-populate each model list and re-query LLMs. I had hoped to avoid this, but in this case it was necessary to store model settings information and redo how the backend caches responses.
Note about Dalai-hosted models
Currently, you cannot query multiple Dalai models/settings at once, since a locally run model can only take one request at a time. We're working on fixing this for the next minor release; for now, just choose one model at a time, and if you want more than one, add it to the list and re-query the prompt node (it will use the previously cached responses from the first Dalai model).
Encounter any bugs?
There was a lot to change for this release, and it's likely that at least one thing broke in the process that we haven't detected. If you encounter a bug or problem, open an Issue or respond to the Discussion about this release! 👍
Minor UI fixes to InspectNode variable headers and template variable Badges
This release contains two minor but important changes:
- Template variable hooks in PromptNode and TextFieldNode no longer auto-uppercase the template variable name by default:
- Selected variables in InspectNode now display up to 144 characters of their value (which is much more informative than the previous 12 characters in uppercase):
We are working on custom model settings for the next major release (0.1.3), so you can change the temperature and other settings of individual models. We are structuring the code to be easily extensible so that we can add more (hopefully many more) models in the future, including user-specified models and settings forms (through react-json-schema).