[FEATURE] avoid second round trip for function call #656
Conversation
Force-pushed from d57914f to 78e24fd.
Thank you @Grogdunn,

I believe we are getting closer, and this will be a very useful function-calling feature. Still, I believe that the CompleteRoundTripBox and the runtime detection of needCompleteRoundTrip are unnecessary. Since the FunctionCallbackContext has already detected Void-response (and Consumer) functions, it should be possible to add a new boolean noReturnFunction() method to the FunctionCallback interface. Then, in AbstractFunctionCallSupport#handleFunctionCallOrReturn, you should be able to use this flag to decide how to handle the return. This way you can implement everything inside the abstract support classes without changing the Model implementations.

Maybe I'm missing some detail? Let me know if you would be interested in exploring the suggested approach, or whether I should give it a try.

Thank you very much for the contribution!
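For illustration only, a minimal sketch of what the suggestion could look like; the names follow the comment above, but the signatures are simplified assumptions, not the actual Spring AI sources:

```java
import java.util.function.Consumer;

// Hypothetical sketch of the suggestion above; simplified, not the actual
// Spring AI sources.
interface FunctionCallback {

	String call(String functionArguments);

	// Proposed addition: FunctionCallbackContext already detects Void-returning
	// and Consumer functions, so it can set this flag when building the callback.
	default boolean noReturnFunction() {
		return false;
	}
}

class FunctionCallSupportSketch {

	// Simplified stand-in for AbstractFunctionCallSupport#handleFunctionCallOrReturn:
	// the flag decides whether a second round trip to the model is needed at all.
	String handleFunctionCallOrReturn(FunctionCallback callback, String arguments) {
		String result = callback.call(arguments);
		if (callback.noReturnFunction()) {
			return null; // nothing to send back; skip the second round trip
		}
		return result; // normal path: the result goes back to the model
	}

	// A Consumer-backed callback would report noReturnFunction() == true.
	static FunctionCallback fromConsumer(Consumer<String> consumer) {
		return new FunctionCallback() {
			public String call(String arguments) {
				consumer.accept(arguments);
				return "";
			}
			public boolean noReturnFunction() {
				return true;
			}
		};
	}
}
```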
Review comments on ...ai-azure-openai/src/main/java/org/springframework/ai/azure/openai/AzureOpenAiChatClient.java (resolved; one thread outdated).
@Grogdunn I'm afraid you've committed many changes unrelated to this PR. To prevent such situations, please don't reformat the project code, especially code not related to the PR.
Force-pushed from 5948013 to 87308e2.
@tzolov I removed the commit with all the import * fixes and re-applied the changes only to the touched subproject. The conflicts are gone! 🎉
Thanks @Grogdunn.
Sure! Today I'll try to reduce the PR changeset.
Force-pushed from 87308e2 to 5edafc9.
OK: squashed, rebased onto main, and cleaned up.
Hi @Grogdunn, first, I noticed that the PR doesn't support streaming function calling, and while reasoning about ways to provide such support I realised that we are on the wrong path. Basically, we cannot abruptly stop the function-calling message exchange and still leave the conversation in a safe state: we simply don't know what other actions the LLM might need to perform before it can safely answer the user prompt. So IMO the only way this can ever work is if the LLM function-calling API provides a protocol to send … Please let me know what you think.
🤔 I think that is possible, because before calling the local function we wait for the streaming-stop signal (I don't remember the correct term); then the stream stops and we call the local functions. I'll try to spike that today.
@Grogdunn, the problem is not limited to streaming; it applies to the sync calls as well. If you do not return a value, you break the conversation and prevent the LLM from doing follow-up function calls.
Well, in terms of conversation you are right. But you can use an LLM to build something other than a "chatbot". Some of my customers need to grab structured data from unstructured data: name, surname, and other information from emails and attachments, information extracted from datasheets, and so on. A simple prompt can be, for instance:
The function is described as usual. No other interaction with this data will happen in the future.
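To make the use case concrete, a rough sketch of how such a side-effect-only function could be registered as a Spring bean; the record, repository, and @Description text are illustrative assumptions, not code from this PR:

```java
import java.util.function.Consumer;

import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.context.annotation.Description;

// Illustrative types; PersonData and PersonRepository are hypothetical.
record PersonData(String name, String surname, String email) {}

interface PersonRepository {
	void save(PersonData data);
}

@Configuration
class ExtractionConfig {

	@Bean
	@Description("Stores person data extracted from the provided document")
	Consumer<PersonData> storePersonData(PersonRepository repository) {
		// The model invokes this "function" with the structured fields it extracted.
		// The caller only cares about this side effect, so the second round trip
		// (sending a function result back to the model) adds cost without value.
		return repository::save;
	}
}
```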
I'd add that, in my case, the second round trip is useless and doubles the cost of every interaction.
I understand your use case and the goal of reducing cost. At the same time, I've reworked our PR to return "DONE" in the case of a void function definition, which is still a valid use case, and modified the test like this: … This seems to work (most of the time) for sync and stream calls.
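The test change referenced above didn't make it into this transcript. As a rough illustration of the idea (the adapter name is an assumption), a side-effect-only Consumer can be wrapped so that the framework always has a well-formed result to send back to the model:

```java
import java.util.function.Consumer;
import java.util.function.Function;

final class VoidFunctionAdapter {

	private VoidFunctionAdapter() {
	}

	// Adapts a side-effect-only Consumer into a Function that returns a fixed
	// "DONE" marker, so the tool-result message sent back to the model is still
	// well formed and the conversation protocol stays intact.
	static <T> Function<T, String> withDoneMarker(Consumer<T> consumer) {
		return input -> {
			consumer.accept(input);
			return "DONE";
		};
	}
}
```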
The "done" message is exactly how we have addressed the "second round trip" at the moment. I think, if I've understood well, the Structured Output Converter is not enough because when you extract data and call the functions the LLM better respects the data structure, while the other way around (parse the text in output and grab the JSON provided) is more prone to hallucinations (I've used some time ago before functions/tools era). So the better way is to think an extension point to leave the hack external to te spring-ai. |
While doing triage for the M1 release, we will move this issue to the M2 release. There is a PR that @tzolov will submit that takes this discussion further. Thanks for your patience.
I've done some experiments with structured output, and with GPT-4o it simply works. But it remains hallucination-prone. Maybe that feature is not necessary anymore?
I'm afraid I have to bump this again, apologies.
@Grogdunn I think that 5017749 might help address this issue?
@tzolov Thanks for the hint! I see the intent of this commit, and if I understand it correctly, it's exactly what I need! I'll try to use it as soon as possible.
Going to close this issue. Please open a new one should the current feature set not work out. Thanks @Grogdunn
Refer to issue #652.
For some use cases, after the local function is called there is no need to pass the results back to the LLM (e.g., data extraction from documents).