Validation error using functionary #1314

Closed
4 tasks done
heisters opened this issue Mar 29, 2024 · 3 comments

Comments

@heisters

Prerequisites

Please answer the following questions for yourself before submitting an issue.

  • I am running the latest code. Development is very rapid so there are no tagged versions as of now.
  • I carefully followed the README.md.
  • I searched using keywords relevant to my issue to make sure that I am creating a new issue that is not already open (or closed).
  • I reviewed the Discussions, and have a new bug or useful enhancement to share.

Expected Behavior

When I submit a chat request with a tool defined in the body, but the message does not lead the model to call a tool, I expect the server to return a successful response with an empty tool_calls array.
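
For illustration, assuming the OpenAI-style response schema the server mirrors, the choices[0].message I expect when no tool is chosen looks roughly like this (hypothetical example, not actual server output):

# Expected shape: tool_calls is an empty list (or omitted entirely),
# never an explicit None.
expected_message = {
    "role": "assistant",
    "content": "Hello! How can I assist you today?",
    "tool_calls": [],
}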

Current Behavior

Instead, the server raises three validation errors and returns a 500 Internal Server Error to the client (see below).

Environment and Context

macOS 14.3.1, Apple M1 Max with 64 GB of memory.

$ python3 --version
Python 3.11.0

Xcode 15.3

llama-cpp-python installed with:

CMAKE_ARGS="-DLLAMA_METAL_EMBED_LIBRARY=ON -DLLAMA_METAL=on" pip install git+https://github.com/abetlen/llama-cpp-python.git --no-cache-dir --force-reinstall

Failure Information (for bugs)

Exception: 3 validation errors:
  {'type': 'list_type', 'loc': ('response', 'typed-dict', 'choices', 0, 'message', 'tool_calls'), 'msg': 'Input should be a valid list', 'input': None, 'url': 'https://errors.pydantic.dev/2.6/v/list_type'}
  {'type': 'dict_type', 'loc': ('response', 'typed-dict', 'choices', 0, 'message', 'function_call'), 'msg': 'Input should be a valid dictionary', 'input': None, 'url': 'https://errors.pydantic.dev/2.6/v/dict_type'}
  {'type': 'string_type', 'loc': ('response', 'str'), 'msg': 'Input should be a valid string', 'input': {'id': 'chatcmpl-a56acfc6-30fc-4624-9bbb-e32bcf931207', 'object': 'chat.completion', 'created': 1711744793, 'model': 'meetkai/functionary-medium-v2.2', 'choices': [{'index': 0, 'message': {'role': 'assistant', 'content': 'Hello! How can I assist you today?', 'function_call': None, 'tool_calls': None}, 'finish_reason': 'stop'}], 'usage': {'prompt_tokens': 156, 'completion_tokens': 10, 'total_tokens': 166}}, 'url': 'https://errors.pydantic.dev/2.6/v/string_type'}

Steps to Reproduce

  1. Start up the server with:
python3 -m llama_cpp.server \
  --model models/functionary-medium-v2_2-q4_0/functionary-medium-v2.2.q4_0.gguf \
  --chat_format functionary-v2 \
  --hf_pretrained_model_name_or_path models/functionary-medium-v2_2-q4_0/ \
  --n_gpu_layers 1
  2. POST the following request body to /v1/chat/completions:
{"model":"meetkai/functionary-medium-v2.2","messages":[{"role":"system","content":"You are a helpful assistant."},{"role":"user","content":"say hello"}],"temperature":0.7,"tools":[{"type":"function","function":{"name":"GetDirections","description":"Provides on screen directions.","parameters":{}}}],"tool_choice":"auto"}

If I POST a similar request without the tools parameter, the server responds successfully.
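
For reference, a minimal Python sketch of the failing request, assuming the server is listening on the default http://localhost:8000 and using the third-party requests library (any HTTP client works):

import requests

# Reproduction sketch: POST the failing request body to the local server.
payload = {
    "model": "meetkai/functionary-medium-v2.2",
    "messages": [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "say hello"},
    ],
    "temperature": 0.7,
    "tools": [
        {
            "type": "function",
            "function": {
                "name": "GetDirections",
                "description": "Provides on screen directions.",
                "parameters": {},
            },
        }
    ],
    "tool_choice": "auto",
}

resp = requests.post("http://localhost:8000/v1/chat/completions", json=payload)
print(resp.status_code)  # 500 while "tools" is present; 200 when "tools" is removed
print(resp.text)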

Failure Logs

llama_print_timings:        load time =   11453.19 ms
llama_print_timings:      sample time =       0.29 ms /     3 runs   (    0.10 ms per token, 10380.62 tokens per second)
llama_print_timings: prompt eval time =   11452.99 ms /   153 tokens (   74.86 ms per token,    13.36 tokens per second)
llama_print_timings:        eval time =     219.31 ms /     2 runs   (  109.66 ms per token,     9.12 tokens per second)
llama_print_timings:       total time =   11879.75 ms /   155 tokens
Llama.generate: prefix-match hit

llama_print_timings:        load time =   11453.19 ms
llama_print_timings:      sample time =       0.89 ms /    10 runs   (    0.09 ms per token, 11248.59 tokens per second)
llama_print_timings: prompt eval time =       0.00 ms /     1 tokens (    0.00 ms per token,      inf tokens per second)
llama_print_timings:        eval time =    1011.59 ms /    10 runs   (  101.16 ms per token,     9.89 tokens per second)
llama_print_timings:       total time =    1035.41 ms /    11 tokens
Exception: 3 validation errors:
  {'type': 'list_type', 'loc': ('response', 'typed-dict', 'choices', 0, 'message', 'tool_calls'), 'msg': 'Input should be a valid list', 'input': None, 'url': 'https://errors.pydantic.dev/2.6/v/list_type'}
  {'type': 'dict_type', 'loc': ('response', 'typed-dict', 'choices', 0, 'message', 'function_call'), 'msg': 'Input should be a valid dictionary', 'input': None, 'url': 'https://errors.pydantic.dev/2.6/v/dict_type'}
  {'type': 'string_type', 'loc': ('response', 'str'), 'msg': 'Input should be a valid string', 'input': {'id': 'chatcmpl-a56acfc6-30fc-4624-9bbb-e32bcf931207', 'object': 'chat.completion', 'created': 1711744793, 'model': 'meetkai/functionary-medium-v2.2', 'choices': [{'index': 0, 'message': {'role': 'assistant', 'content': 'Hello! How can I assist you today?', 'function_call': None, 'tool_calls': None}, 'finish_reason': 'stop'}], 'usage': {'prompt_tokens': 156, 'completion_tokens': 10, 'total_tokens': 166}}, 'url': 'https://errors.pydantic.dev/2.6/v/string_type'}

Traceback (most recent call last):
  File "/opt/homebrew/anaconda3/envs/llama/lib/python3.11/site-packages/llama_cpp/server/errors.py", line 171, in custom_route_handler
    response = await original_route_handler(request)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/homebrew/anaconda3/envs/llama/lib/python3.11/contextlib.py", line 222, in __aexit__
    await self.gen.athrow(typ, value, traceback)
  File "/opt/homebrew/anaconda3/envs/llama/lib/python3.11/site-packages/fastapi/concurrency.py", line 35, in contextmanager_in_threadpool
    raise e
fastapi.exceptions.ResponseValidationError: 3 validation errors:
  {'type': 'list_type', 'loc': ('response', 'typed-dict', 'choices', 0, 'message', 'tool_calls'), 'msg': 'Input should be a valid list', 'input': None, 'url': 'https://errors.pydantic.dev/2.6/v/list_type'}
  {'type': 'dict_type', 'loc': ('response', 'typed-dict', 'choices', 0, 'message', 'function_call'), 'msg': 'Input should be a valid dictionary', 'input': None, 'url': 'https://errors.pydantic.dev/2.6/v/dict_type'}
  {'type': 'string_type', 'loc': ('response', 'str'), 'msg': 'Input should be a valid string', 'input': {'id': 'chatcmpl-a56acfc6-30fc-4624-9bbb-e32bcf931207', 'object': 'chat.completion', 'created': 1711744793, 'model': 'meetkai/functionary-medium-v2.2', 'choices': [{'index': 0, 'message': {'role': 'assistant', 'content': 'Hello! How can I assist you today?', 'function_call': None, 'tool_calls': None}, 'finish_reason': 'stop'}], 'usage': {'prompt_tokens': 156, 'completion_tokens': 10, 'total_tokens': 166}}, 'url': 'https://errors.pydantic.dev/2.6/v/string_type'}

INFO:     ::1:56668 - "POST /v1/chat/completions HTTP/1.1" 500 Internal Server Error

Git HEAD at time of install: 1e60dba

Environment info:

llama-cpp-python$ python3 --version
Python 3.11.0

llama-cpp-python$ pip list | egrep "uvicorn|fastapi|sse-starlette|numpy"
fastapi            0.110.0
numpy              1.26.4
sse-starlette      2.0.0
uvicorn            0.29.0
@FutureProofHomes

Nailed it. Thank you for opening this bug. I have temporarily worked around it by hacking the ChatCompletionResponseMessage class in llama_types.py to be:

class ChatCompletionResponseMessage(TypedDict):
    content: Optional[str]
    tool_calls: NotRequired[Optional["ChatCompletionMessageToolCalls"]]
    role: Literal["assistant", "function"]  # NOTE: "function" may be incorrect here
    function_call: NotRequired[Optional[ChatCompletionResponseFunctionCall]]  # DEPRECATED

I don't think this is the right solution, and it has some downstream negative effects with tools that require strict adherence to the OpenAI schema.

I've opened a related ticket in the meetkai/Functionary project, but I believe the bug belongs here.

@heisters
Author

heisters commented Apr 1, 2024

Confirmed, that change works around the issue for me. Thanks!

abetlen added a commit that referenced this issue on Apr 5, 2024
abetlen closed this as completed in 1ae3abb on Apr 5, 2024
@abetlen
Owner

abetlen commented Apr 5, 2024

I've updated the functionary chat formats to fix the response; the fix should be in the next release.
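
For anyone reading along, a minimal sketch of the general idea, assuming the chat handler builds a plain dict response (illustration only, with a hypothetical helper, not the actual commit): the validation errors come from the optional tool_calls / function_call keys being present with an explicit None, so dropping them (or returning an empty list for tool_calls) keeps the response inside the typed schema.

def clean_message(message: dict) -> dict:
    # Drop optional keys that are explicitly None so the response validates
    # against the TypedDict schema.
    for key in ("tool_calls", "function_call"):
        if message.get(key) is None:
            message.pop(key, None)
    return message

message = {"role": "assistant", "content": "Hello!", "tool_calls": None, "function_call": None}
print(clean_message(message))  # {'role': 'assistant', 'content': 'Hello!'}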

xhedit pushed a commit to xhedit/llama-cpp-conv that referenced this issue Apr 6, 2024
xhedit added a commit to xhedit/llama-cpp-conv that referenced this issue Apr 6, 2024
* feat: add support for KV cache quantization options (abetlen#1307)

* add KV cache quantization options

abetlen#1220
abetlen#1305

* Add ggml_type

* Use ggml_type instead of string for quantization

* Add server support

---------

Co-authored-by: Andrei Betlen <abetlen@gmail.com>

* fix: Changed local API doc references to hosted (abetlen#1317)

* chore: Bump version

* fix: last tokens passing to sample_repetition_penalties function (abetlen#1295)

Co-authored-by: ymikhaylov <ymikhaylov@x5.ru>
Co-authored-by: Andrei <abetlen@gmail.com>

* feat: Update llama.cpp

* fix: segfault when logits_all=False. Closes abetlen#1319

* feat: Binary wheels for CPU, CUDA (12.1 - 12.3), Metal (abetlen#1247)

* Generate binary wheel index on release

* Add total release downloads badge

* Update download label

* Use official cibuildwheel action

* Add workflows to build CUDA and Metal wheels

* Update generate index workflow

* Update workflow name

* feat: Update llama.cpp

* chore: Bump version

* fix(ci): use correct script name

* docs: LLAMA_CUBLAS -> LLAMA_CUDA

* docs: Add docs explaining how to install pre-built wheels.

* docs: Rename cuBLAS section to CUDA

* fix(docs): incorrect tool_choice example (abetlen#1330)

* feat: Update llama.cpp

* fix: missing logprobs in response, incorrect response type for functionary, minor type issues. Closes abetlen#1328 abetlen#1314

* fix: missing logprobs in response, incorrect response type for functionary, minor type issues. Closes abetlen#1328 Closes abetlen#1314

* feat: Update llama.cpp

* fix: Always embed metal library. Closes abetlen#1332

* feat: Update llama.cpp

* chore: Bump version

---------

Co-authored-by: Limour <93720049+Limour-dev@users.noreply.github.com>
Co-authored-by: Andrei Betlen <abetlen@gmail.com>
Co-authored-by: lawfordp2017 <lawfordp@gmail.com>
Co-authored-by: Yuri Mikhailov <bitsharp@gmail.com>
Co-authored-by: ymikhaylov <ymikhaylov@x5.ru>
Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>