We've made a number of improvements to the inspector UI and beyond.

Side-by-side comparison across LLM responses

Responses now appear side by side for up to five queried LLMs.

Collapsible response groups

You can also collapse LLM responses grouped by their prompt template variable, for easier selective inspection. Just click on a response group header to show or hide the group.

(Video: collapsable-groups.mov)

Accuracy plots by default

Boolean (true/false) evaluation metrics now use accuracy plots by default. For instance, for ChainForge's prompt injection example:

This makes differences across models easy to see for the specified evaluation. Stacked bar charts are still used when a prompt variable is selected. For instance, you can plot a meta-variable, 'Domain', across two LLMs while testing whether the code outputs contain an import statement (another new feature).
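To make the number behind an accuracy plot concrete: for a boolean metric, accuracy per model is the share of responses that evaluated to true. The following is a minimal sketch of that arithmetic only, not ChainForge's plotting code; the function name `accuracy_by_llm` and the sample model names are hypothetical.

```python
from collections import defaultdict


def accuracy_by_llm(results):
    """Return {llm_name: fraction of True scores}.

    `results` is an iterable of (llm_name, passed) pairs, where `passed`
    is the boolean returned by an evaluator. This input shape is an
    assumption made for illustration.
    """
    totals = defaultdict(int)
    passes = defaultdict(int)
    for llm, passed in results:
        totals[llm] += 1
        passes[llm] += int(bool(passed))
    return {llm: passes[llm] / totals[llm] for llm in totals}


# Hypothetical boolean evaluation results for two models.
sample = [
    ("model-a", True),
    ("model-a", False),
    ("model-a", True),
    ("model-a", True),
    ("model-b", False),
    ("model-b", True),
]

scores = accuracy_by_llm(sample)
print(scores)
```

Here `model-a` passes three of four checks and `model-b` one of two, so the bars would sit at 0.75 and 0.5 respectively.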

Added 'Inspect results' footer to both Prompt and Eval nodes

The tiny response previews footer in the Prompt Node has been replaced with an 'Inspect Responses' button that brings up a fullscreen response inspector. In addition, evaluation results can be inspected by clicking 'Inspect results'.

Evaluation scores appear in bold at the top of each response block.

In addition, both Prompt and Eval nodes now load cached results upon initialization. Simply load an example flow and click the respective Inspect button.

Added `asMarkdownAST` to `response` object in Evaluator node

Given how often developers wish to parse Markdown, we've added a function `asMarkdownAST()` to the `ResponseInfo` class. It uses the mistune library to parse Markdown into an abstract syntax tree (AST).

For instance, here's code that detects whether an `import` statement appears anywhere in the code blocks of a chat response:
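A minimal sketch of such an evaluator, under stated assumptions: `asMarkdownAST()` is taken to return a mistune-style AST, that is, a list of dict nodes in which code blocks have `type` equal to `block_code` and carry their source under `text` (mistune 2.x) or `raw` (mistune 3.x). The `FakeResponse` class and the hand-written ASTs are hypothetical stand-ins so the snippet runs outside ChainForge.

```python
def iter_nodes(nodes):
    """Yield every dict node in the AST, descending into 'children' lists."""
    for node in nodes:
        if not isinstance(node, dict):
            continue
        yield node
        children = node.get("children")
        if isinstance(children, list):
            yield from iter_nodes(children)


def code_block_text(node):
    # Assumption: mistune 2.x stores code under 'text', mistune 3.x under 'raw'.
    return node.get("text") or node.get("raw") or ""


def is_import_line(line):
    stripped = line.strip()
    if stripped.startswith("import "):
        return True
    return stripped.startswith("from ") and " import " in stripped


def evaluate(response):
    """Return True if any code block in the response contains an import statement."""
    for node in iter_nodes(response.asMarkdownAST()):
        if node.get("type") == "block_code":
            if any(is_import_line(l) for l in code_block_text(node).splitlines()):
                return True
    return False


class FakeResponse:
    """Hypothetical stand-in for ResponseInfo, for local testing only."""

    def __init__(self, ast):
        self._ast = ast

    def asMarkdownAST(self):
        return self._ast


with_import = FakeResponse([
    {"type": "paragraph", "children": [{"type": "text", "text": "Try this:"}]},
    {"type": "block_code", "text": "import os\nprint(os.getcwd())\n", "info": "python"},
])
without_import = FakeResponse([
    {"type": "block_code", "raw": "print('hello')\n"},
])

print(evaluate(with_import), evaluate(without_import))
```

Inside an Evaluator node only the `evaluate(response)` function would be needed. Because it returns a boolean, the result is shown as an accuracy plot by default.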

v0.1.7: UI improvements to response inspector
