
Add inverse chat templating #33321

Open · wants to merge 16 commits into main
Conversation

Rocketknight1
Member

@Rocketknight1 Rocketknight1 commented Sep 5, 2024

Very experimental PR for now! This PR adds inverse chat templating, where we convert a formatted chat back into message dicts in a universal format. This has been requested several times, and it's critical to allow seamless tool use in pipelines, without requiring users to manually write parsers for each model.

Inverse templating requires the ability to extract text from a large input string. After testing several templates, I found that we can generally handle this by exposing two functions in the inverse templating environment: a slightly modified re.finditer() and a totally unmodified json.loads().
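The exact helpers aren't reproduced in this thread, so here is a minimal stdlib-only sketch of what a tag-aware finditer() exposed to the template environment might look like; the `add_tag` keyword mirrors the example template excerpt later in this thread, but the (position, text, tag) return shape and the merge step are my assumptions, not the PR's actual API:

```python
import re

def finditer(pattern, string, flags=0, add_tag=None):
    # Hypothetical sketch of the "slightly modified re.finditer()": it returns
    # (start, text, tag) tuples rather than Match objects, so hits from several
    # patterns can be merged and re-sorted back into chat order.
    return [(m.start(), m.group(1), add_tag) for m in re.finditer(pattern, string, flags)]

# json.loads() is exposed unmodified, e.g. for decoding [TOOL_CALLS] payloads.
chat = "[INST] What is 2+2? [/INST] The answer is 4.</s>"
user = finditer(r"\[INST\] (.+?) ?\[\/INST\]", chat, flags=re.DOTALL, add_tag="user")
asst = finditer(r"\[\/INST\] (.+?)<\/s>", chat, flags=re.DOTALL, add_tag="assistant")
messages = [{"role": tag, "content": text} for _, text, tag in sorted(user + asst)]
```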

TODO:

  • Make some model PRs to test with this, but don't merge yet! (PRs not open, but templates written)
  • Add tests for loading/saving of inverse templates
  • Add tests for inverse template function
  • Test recovery of tools as well as messages
  • Make sure I don't need any extra functions
  • Refactor chat template tests out of tokenization_common so they're not run for every model
  • Add chat template tests to CircleCI
  • Make sure extraction works correctly with generation, and add some tests that use (static!) generation outputs
  • Add tool use to pipelines (will put this in a separate PR)

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@Rocketknight1 Rocketknight1 marked this pull request as ready for review September 16, 2024 14:59
@LysandreJik
Member

Thanks @Rocketknight1! I'll review ASAP

@Rocketknight1
Member Author

@LysandreJik Don't worry, it's not quite ready! I'm still working on the little details of how this interacts with generation / incomplete inputs, etc.

@Rocketknight1
Member Author

Update @LysandreJik, this is now ready for review! The last issue I was testing was how the inverse templates would handle incomplete prompts - for example, if the user called model.generate(), but only passed the new tokens to the inverse template, instead of the entire prompt + output.

My conclusion was that this is just very messy and impossible to handle correctly - in particular, the prompt often contains the assistant header, so it's extremely hard for the inverse template to figure out what's going on if you only pass the generation output. Therefore, I think we should only support applying inverse templates to entire chats, and not single generation outputs, and so there's no need to test it with generation outputs only.
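To illustrate the anchoring problem with a toy Mistral-style example (the pattern is modelled on the template excerpt later in this thread, not on the PR's real code):

```python
import re

# In a full chat, the assistant header ("[/INST] " in this toy example) comes
# from the prompt, so the inverse template can anchor its regex on it:
full_chat = "<s>[INST] Hi [/INST] Hello!</s>"
generation_only = "Hello!</s>"  # decoding only the newly generated tokens

pattern = r"\[\/INST\] (.+?)<\/s>"
assert re.search(pattern, full_chat).group(1) == "Hello!"
# With only the generation output, the anchor is missing and extraction fails:
assert re.search(pattern, generation_only) is None
```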

@LysandreJik
Member

Awesome, will take a look ASAP

@CISC
Contributor

CISC commented Sep 25, 2024

Shouldn't this support multiple (tool_use/rag) templates, just like chat templates?

In cases where it doesn't make sense to have a unified chat template, I'd imagine a unified inverse template won't make sense either...

@Rocketknight1
Member Author

We might consider adding that in future! In general, we're trying to move away from having multiple chat templates, in favour of a single template that just uses conditions like {% if tools %} to switch its behaviour, although we'll continue supporting multiple (non-inverse) templates indefinitely.
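For illustration, a hypothetical fragment of the single-template style described above, switching on `{% if tools %}` (Mistral-style tags chosen for the example, not taken from any specific repo):

```jinja2
{%- if tools %}
    {{- "[AVAILABLE_TOOLS] " + (tools | tojson) + "[/AVAILABLE_TOOLS]" }}
{%- endif %}
{%- for message in messages %}
    {%- if message["role"] == "user" %}
        {{- "[INST] " + message["content"] + "[/INST]" }}
    {%- else %}
        {{- " " + message["content"] + "</s>" }}
    {%- endif %}
{%- endfor %}
```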

@LysandreJik
Member

Looks like a simple enough PR to me; would you like to check with others that are using chat templates extensively, like @xenova or @Narsil, to get a review?

@Rocketknight1
Member Author

Sure, I'll ping them! cc @lewtun @xenova @Narsil and @zucchini-nlp, since they've been working with templates (or have requested this feature).

@zucchini-nlp
Member

From the VLM side I don't think this feature will be very useful, as I don't see an easy way to implement it if the input text has already been processed (processor.__call__()) and then detokenized back. I say this because I assume it will be used to format generated text back into message dicts.

But I think it's okay, as VLMs are one step behind and don't support the full potential of chat templates yet.

"apply_chat_template requires jinja2>=3.1.0 to be installed. Your version is " f"{jinja2.__version__}."
)

jinja_env = SandboxedEnvironment(trim_blocks=True, lstrip_blocks=True, extensions=[jinja2.ext.loopcontrols])
Contributor


I may be missing something here, but do you really need a mutable sandbox? Your testing template doesn't look like it does...

Comment on lines +786 to +791
{%- set tools = finditer("\[AVAILABLE_TOOLS\] (.*?)\[\/AVAILABLE_TOOLS\]", chat, flags=16) %}
{%- set user_messages = finditer('(?:\[INST\] )(.+?)\[\/INST\]', chat, flags=16, add_tag="user") %}
{%- set asst_messages = finditer('(?:\[\/INST\]|\[\/TOOL_RESULTS\]) (.+?)<\/s>', chat, flags=16, add_tag="assistant") %}
{%- set available_tools = finditer('\[AVAILABLE_TOOLS\] (.*?)\[\/AVAILABLE_TOOLS\]', chat, flags=16, add_tag="available_tools") %}
{%- set tool_calls = finditer('\[TOOL_CALLS\] (.+?\])<\/s>', chat, flags=16, add_tag="tool_calls") %}
{%- set tool_results = finditer('\[TOOL_RESULTS\] (.+?)\[\/TOOL_RESULTS\]', chat, flags=16, add_tag="tool") %}
Contributor


Since you're exposing regex flags to the template anyway, shouldn't all of these use flags=DOTALL instead of a magic number?
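For reference, 16 is indeed the integer value of re.DOTALL in Python's re module, which is what lets "." cross newlines so multi-line message contents are captured; a quick demonstration:

```python
import re

assert int(re.DOTALL) == 16  # the magic number is just re.DOTALL (alias re.S)

chat = "[INST] line one\nline two [/INST]"
# Without DOTALL, "." refuses to cross the newline and the match fails:
assert re.search(r"\[INST\] (.+?) \[\/INST\]", chat) is None
# With flags=16 (i.e. re.DOTALL), the multi-line content is captured:
assert re.search(r"\[INST\] (.+?) \[\/INST\]", chat, flags=16).group(1) == "line one\nline two"
```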

@CISC
Contributor

CISC commented Oct 9, 2024

@Rocketknight1 BTW, really looking forward to this feature, it will be super useful.

I'm working on a chat template editor (a Gradio space) which will make it super easy to view, modify, test and create PRs for chat templates on HF. I'm hoping to add support for inverse templates as well, but that will require the HF API to return the inverse template in the config.tokenizer_config entry in ModelInfo. Who can ensure the API is updated once this feature is added?

@Rocketknight1
Member Author

Rocketknight1 commented Oct 9, 2024

Hi @CISC - in that case, I'd pay attention to this PR too! #33957

We're planning a medium-term refactor because of how much chat templates have blown up, with the goal of eventually moving them to a single file in most model repos.

(I may delay this PR until we settle on a spec in #33957, to avoid putting inverse templates somewhere and then immediately having to move them again)

@CISC
Contributor

CISC commented Oct 9, 2024

(I may delay this PR until we settle on a spec in #33957, to avoid putting inverse templates somewhere and then immediately having to move them again)

That PR seems like it needs to be phased in very gradually, though; it will impact a lot of projects that need to catch up. Does it really make sense to delay this one until that is done?

@Rocketknight1
Member Author

@CISC good point, but we'll probably merge that PR relatively soon so that Transformers starts to support the new format, even though we won't actually transition repos until later!
