When API response reaches token limit detect that and render a “Continue” button on the front end #403

Open
krschacht opened this issue Jun 11, 2024 · 0 comments

krschacht commented Jun 11, 2024

It's interesting that you notice a difference between ChatGPT and HostedGPT in this regard, but it's plausible that the algorithm for managing history is different. I actually did something really naive and intended to go back and optimize it at some point but I never did. It's right here: https://github.com/AllYourBot/hostedgpt/blob/main/app/services/ai_backend/open_ai.rb#L69

First, the max_tokens should really be:

max_length_of_response_for_good_user_experience = 3000 # hard-coded value we can tweak
[input_tokens + max_length_of_response_for_good_user_experience, context_limit_of_model].min
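A minimal sketch of that calculation as plain Ruby, for illustration only. The method names `total_token_limit` and `response_token_budget` are hypothetical and do not exist in the project. The second method assumes the API's `max_tokens` parameter counts only generated tokens, in which case the value to send would be the total limit minus the input tokens.

```ruby
# Hypothetical helpers; names are illustrative, not part of HostedGPT.
MAX_RESPONSE_TOKENS = 3000 # hard-coded value we can tweak

# Total tokens (input + response) the request may use, capped at the
# model's context limit. This mirrors the pseudo-code above.
def total_token_limit(input_tokens:, context_limit:, max_response: MAX_RESPONSE_TOKENS)
  [input_tokens + max_response, context_limit].min
end

# Tokens left for the response alone. Assumption: the API's max_tokens
# parameter limits generated tokens only, so the input is subtracted.
# Returns 0 when the input already fills the context window.
def response_token_budget(input_tokens:, context_limit:, max_response: MAX_RESPONSE_TOKENS)
  total = total_token_limit(
    input_tokens: input_tokens,
    context_limit: context_limit,
    max_response: max_response
  )
  [total - input_tokens, 0].max
end
```

With a short history the response budget stays at the full 3000 tokens; it only shrinks once the input gets within 3000 tokens of the context limit.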

I even added the Tiktoken gem to the project to prepare for accurate token counting, but I haven't addressed this yet. I also never got around to truncating history: it looks like preceding_messages always returns all preceding messages. If I'm reading the code correctly, the preceding messages will eventually exceed the model's context length and requests will start erroring out. This needs to be fixed at some point. The method to get preceding messages should be:

preceding_messages_up_to_max_tokens_of(max_input_tokens_allowed)

(I'm naively naming these things just for pseudo-code purposes)
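One way such a method could work, sketched under assumptions: messages are hashes with a `:content` key, ordered oldest to newest, and the token counter is injectable. The default counter below is a rough stand-in (about four characters per token), not a real tokenizer; a Tiktoken-based counter could be passed in its place. None of this reflects the project's actual models.

```ruby
# Rough stand-in for a real tokenizer: roughly four characters per token.
# Assumption for illustration only; swap in an accurate counter as needed.
APPROX_TOKEN_COUNTER = ->(text) { (text.length / 4.0).ceil }

# Hypothetical sketch: returns the most recent messages whose combined
# token count fits within max_input_tokens_allowed, in original order.
# `messages` is assumed to be an array of hashes ordered oldest to newest.
def preceding_messages_up_to_max_tokens_of(messages, max_input_tokens_allowed,
                                           counter: APPROX_TOKEN_COUNTER)
  kept = []
  used = 0

  # Walk from newest to oldest so the most recent context is kept first.
  messages.reverse_each do |message|
    cost = counter.call(message[:content].to_s)
    # Stop at the first message that does not fit, so the kept history
    # stays contiguous rather than skipping messages in the middle.
    break if used + cost > max_input_tokens_allowed

    kept.unshift(message)
    used += cost
  end

  kept
end
```

Stopping at the first message that does not fit is a design choice: it drops older history wholesale instead of leaving gaps in the conversation.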

@krschacht krschacht modified the milestones: 0.7, 0.8 Jun 11, 2024