Remove binary state from high-level API and use Jinja templates #3147

Merged · 80 commits into main · Nov 25, 2024

Conversation

cebtenzzre (Member) commented Oct 28, 2024

This PR replaces GPT4All's templates, which use QString.arg to format a single user/assistant pair, with a HuggingFace-style Jinja template that formats the entire conversation.

This allows finer-grained control over how extra data (such as LocalDocs context) is passed to the model (for {# version 1 #} templates).
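
For illustration, here is a minimal Python sketch of the new approach, rendering a HuggingFace-style chat template over the whole conversation with `jinja2` (which the Python bindings use, per the commit list below). The template text and the way LocalDocs excerpts are folded into the user message are assumptions for the example, not GPT4All's shipped template:

```python
# Minimal sketch only: the template text and message fields below are
# illustrative assumptions, not GPT4All's actual chat template.
from jinja2 import Environment

CHAT_TEMPLATE = """\
{%- for message in messages %}
<|{{ message['role'] }}|>
{{ message['content'] }}
{%- endfor %}
<|assistant|>
"""

env = Environment(keep_trailing_newline=True)
template = env.from_string(CHAT_TEMPLATE)

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    # Hypothetical: LocalDocs excerpts spliced directly into the user turn.
    {"role": "user", "content": "Context:\n<excerpt>\n\nQuestion: What changed?"},
]

print(template.render(messages=messages))
```

The key difference from the old QString.arg approach is that the template sees the full message list, so it can decide where the system prompt and any extra context belong.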

cebtenzzre and others added 27 commits November 7, 2024 10:43
- Python bindings use `jinja2`
- server.cpp is not implemented
- chatapi.cpp is not implemented

Signed-off-by: Jared Van Bortel <jared@nomic.ai>
We still need this for models that don't include bos_token in their chat
template. Llama 3.1 8B Instruct sets this to false.

Signed-off-by: Jared Van Bortel <jared@nomic.ai>
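
As a sketch of the idea behind keeping this setting (the names here are hypothetical, not GPT4All internals): if the model's chat template does not emit bos_token itself, the BOS token can be prepended after rendering.

```python
# Hypothetical helper, illustrating why a separate "add BOS" setting is
# still needed when the chat template itself does not emit bos_token.
def finalize_prompt(rendered: str, bos_token: str, template_emits_bos: bool) -> str:
    if template_emits_bos or rendered.startswith(bos_token):
        return rendered
    return bos_token + rendered


print(finalize_prompt("<|user|>\nHi\n<|assistant|>\n", "<s>", template_emits_bos=False))
```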
Importantly, the non-chat completions endpoint (`/v1/completions`) no
longer uses a system prompt or LocalDocs, as those are not applicable.

Signed-off-by: Jared Van Bortel <jared@nomic.ai>
Before this PR, GPT4All inserted a system message into non-chat
completions, and it attempted to use LocalDocs with them. It no longer
does either of these things because they do not make sense here.

This changes the output slightly, so the test needs to be updated.

Signed-off-by: Jared Van Bortel <jared@nomic.ai>
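
To illustrate the distinction against the OpenAI-compatible local server (the base URL, port, and model name below are assumptions, since the server configuration is not shown in this PR):

```python
# Illustrative only: base URL, port, and model name are assumptions.
import requests

BASE = "http://localhost:4891/v1"

# Chat completions: messages go through the chat template, so a system
# prompt (and LocalDocs context, if enabled) can apply here.
chat = requests.post(f"{BASE}/chat/completions", json={
    "model": "Llama 3.2 3B Instruct",
    "messages": [{"role": "user", "content": "Hello"}],
})

# Plain completions: after this PR the prompt is used as-is, with no
# system prompt or LocalDocs context injected.
raw = requests.post(f"{BASE}/completions", json={
    "model": "Llama 3.2 3B Instruct",
    "prompt": "Hello",
})

print(chat.json(), raw.json(), sep="\n")
```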
Alpaca is obsolete. We would be much better off defaulting to the chat
template that is shipped with the model.

The default system prompt existed to demonstrate how a system prompt should be templated for Alpaca, but that is no longer applicable now that this formatting is part of the Jinja template.

Signed-off-by: Jared Van Bortel <jared@nomic.ai>
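
A sketch of what "defaulting to the chat template shipped with the model" can look like; reading `chat_template` from a Hugging Face tokenizer_config.json is used here for illustration, and the fallback string is a placeholder, not GPT4All's actual lookup:

```python
# Illustrative sketch: path handling and fallback are assumptions, not
# GPT4All's actual model-card / GGUF metadata lookup.
import json

FALLBACK_TEMPLATE = "{% for m in messages %}{{ m['content'] }}\n{% endfor %}"


def model_chat_template(tokenizer_config_path: str) -> str:
    """Prefer the template the model ships with; otherwise fall back."""
    with open(tokenizer_config_path, encoding="utf-8") as f:
        config = json.load(f)
    template = config.get("chat_template")
    return template if isinstance(template, str) else FALLBACK_TEMPLATE
```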
As a bonus, system prompts can now be used with remote models.

Signed-off-by: Jared Van Bortel <jared@nomic.ai>
The new tool calling PR shows that we can achieve the desired UI with the same adjacent prompt/response pairs we have now, by using child items instead of inserting items in between them.

This assumption allows us to simplify the on-disk state by always
computing the peer index lazily.

Signed-off-by: Jared Van Bortel <jared@nomic.ai>
manyoso (Collaborator) commented Nov 23, 2024

[screenshot]

This is what I see when installing Llama 3.2 models.

And when I click on the documentation, I see:

[screenshot]

Also, it doesn't build for me by default because you're using `cbegin` and `cend` on the `std::span` in chatllm.cpp.

Review comment on gpt4all-chat/src/chatllm.cpp (outdated, resolved)
This subrange is already const, so C++23 `cbegin`/`cend` isn't necessary.

Signed-off-by: Jared Van Bortel <jared@nomic.ai>
The `set tools = none` line is not recognized by Jinja2Cpp.

Signed-off-by: Jared Van Bortel <jared@nomic.ai>
cebtenzzre (Member, Author) commented Nov 23, 2024

> And when I click on the documentation I see [image: broken page]

You won't see the new documentation page until this PR is merged. If you run the doc site locally with `mkdocs serve`, you will see that the Chat Templates page is at the expected URL.

cebtenzzre requested a review from manyoso on November 23, 2024 at 19:46
manyoso merged commit 225bf6b into main on Nov 25, 2024
3 of 18 checks passed
cebtenzzre mentioned this pull request on Dec 2, 2024
cebtenzzre added a commit that referenced this pull request Dec 5, 2024
cebtenzzre added a commit that referenced this pull request Dec 6, 2024
manyoso pushed a commit that referenced this pull request Dec 6, 2024