
Clarification on Input/Output Length Parameters for gpt-4-1106-preview and gpt-4-0125-preview Models #533

Closed
MBaltz opened this issue Feb 8, 2024 · 2 comments · Fixed by #535

Comments


MBaltz commented Feb 8, 2024

I'm not sure the documentation and the actual code match up, especially regarding how many tokens the gpt-4-1106-preview and gpt-4-0125-preview models can handle. The docs say both models accept the same number of tokens, but the code assigns them different values:

'gpt-4-1106-preview': 128000,
'gpt-4-0125-preview': 4096,


From the OpenAI model documentation (version, description, context window):

gpt-4-0125-preview
Description: The latest GPT-4 model intended to reduce cases of “laziness” where the model doesn’t complete a task. Returns a maximum of 4,096 output tokens.
Context window: 128,000 tokens

gpt-4-1106-preview
Description: GPT-4 Turbo model featuring improved instruction following, JSON mode, reproducible outputs, parallel function calling, and more. Returns a maximum of 4,096 output tokens. This preview model is not yet suited for production traffic.
Context window: 128,000 tokens

Reference:
https://platform.openai.com/docs/models/gpt-4-and-gpt-4-turbo
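
As a side note, the documentation quoted above distinguishes two limits that a single per-model number conflates: the context window (prompt plus completion) and the maximum output tokens. Here is a minimal TypeScript sketch of that distinction; the ModelLimits shape and modelLimits name are illustrative, not from this codebase, while the numbers come from the docs:

// Illustrative only: the interface and map names are assumptions;
// the numbers are taken from the OpenAI docs quoted above.
interface ModelLimits {
  contextWindow: number;   // total tokens (prompt + completion) the model accepts
  maxOutputTokens: number; // cap on tokens the model will generate per reply
}

const modelLimits: Record<string, ModelLimits> = {
  'gpt-4-1106-preview': { contextWindow: 128000, maxOutputTokens: 4096 },
  'gpt-4-0125-preview': { contextWindow: 128000, maxOutputTokens: 4096 },
};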


MBaltz commented Feb 8, 2024

This PR addresses that:
#525


kaz-on commented Feb 9, 2024

I'm interested in this issue.

This value seems to have been discussed in #521.
I also encountered the problem described in @almagest21's comment (#521 (comment)) regarding "Max Token" when using gpt-4-0125-preview.

It appears the confusion stems from a discrepancy between how "Max Token" is described and how it's actually used.

"Max Token" is described in this application as "The maximum number of tokens to generate in the chat completion."

"label": "Max Token",
"description": "The maximum number of tokens to generate in the chat completion. The total length of input tokens and generated tokens is limited by the model's context length."

However, in practice, the max_tokens parameter in the API calls is set to undefined here:

max_tokens: undefined,

and here:
max_tokens: undefined,

Instead, "Max Token" is utilized as a parameter for limitMessageTokens function, which is meant to limit the number of input tokens.

const messages = limitMessageTokens(
chats[currentChatIndex].messages,
chats[currentChatIndex].config.max_tokens,
chats[currentChatIndex].config.model
);
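
For context, here is a rough sketch of what a limitMessageTokens-style function does. This is not the project's actual implementation: the real function takes a model name to select a tokenizer, and the countTokens callback here stands in for that.

interface Message { role: string; content: string; }

// Walk the history from newest to oldest, keeping messages until the
// token budget is spent; older messages are silently dropped.
const limitMessageTokens = (
  messages: Message[],
  limit: number,
  countTokens: (m: Message) => number // stand-in for the model-aware tokenizer
): Message[] => {
  const kept: Message[] = [];
  let total = 0;
  for (let i = messages.length - 1; i >= 0; i--) {
    const cost = countTokens(messages[i]);
    if (total + cost > limit) break;
    kept.unshift(messages[i]);
    total += cost;
  }
  return kept;
};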

So modelMaxToken needs to be set to the context-window value, which would be 128000 for gpt-4-0125-preview.
I think the proper approach is to make the description and the actual behavior match, but it is not clear to me which one is correct: the description or the behavior.
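
To make that concrete, here is a hedged sketch of one way the two could be reconciled; the variable names are illustrative, not from the codebase, and the numbers come from the OpenAI docs:

// Illustrative values from the OpenAI docs; the names are assumptions.
const contextWindow = 128000;   // gpt-4-0125-preview total context
const maxOutputTokens = 4096;   // documented output cap

// Budget for the prompt once room for the reply is reserved:
const inputBudget = contextWindow - maxOutputTokens; // 123904 tokens

// The call site quoted above would then pass inputBudget to
// limitMessageTokens, and the API call would set
// max_tokens: maxOutputTokens instead of undefined.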
