feat: use dynamic batch size for embedding #826

Merged
merged 1 commit into main from feat on Sep 3, 2024

Conversation

@sigoden (Owner) commented on Sep 3, 2024

This PR adds a new field, max_tokens_per_chunk, to the embedding model. It indicates the maximum number of tokens allowed per text chunk, while max_input_tokens indicates the maximum total number of tokens in each request.

The batch_size can now be calculated dynamically as max_input_tokens / rag_chunk_size, and is capped at max_batch_size.

This resolves the issue mentioned in #825.
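
For illustration, here is a minimal Rust sketch of that calculation (the project is written in Rust). The type, field, and function names below are hypothetical stand-ins, not the PR's actual identifiers:

```rust
/// Hypothetical embedding-model limits, mirroring the fields described above.
struct EmbeddingModel {
    /// Maximum number of tokens allowed per text chunk (new in this PR).
    max_tokens_per_chunk: usize,
    /// Maximum total number of tokens allowed in a single request.
    max_input_tokens: usize,
    /// Hard upper bound on the number of chunks per request.
    max_batch_size: usize,
}

impl EmbeddingModel {
    /// Dynamically compute the batch size: how many chunks of
    /// `rag_chunk_size` tokens fit into one request, capped at
    /// `max_batch_size`.
    fn dynamic_batch_size(&self, rag_chunk_size: usize) -> usize {
        // A chunk can never exceed the model's per-chunk limit, and must be
        // at least 1 token to avoid division by zero.
        let chunk_size = rag_chunk_size.min(self.max_tokens_per_chunk).max(1);
        (self.max_input_tokens / chunk_size).clamp(1, self.max_batch_size)
    }
}

fn main() {
    let model = EmbeddingModel {
        max_tokens_per_chunk: 2048,
        max_input_tokens: 8192,
        max_batch_size: 100,
    };
    // 8192 / 1000 = 8 chunks fit in one request.
    assert_eq!(model.dynamic_batch_size(1000), 8);
    // With tiny chunks the token math would allow 819, but max_batch_size caps it.
    assert_eq!(model.dynamic_batch_size(10), 100);
}
```

Under these assumptions, each request stays within both the per-request token limit and the provider's batch-size limit, rather than using one fixed batch size for every embedding model.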

@sigoden merged commit 476d29c into main on Sep 3, 2024
3 checks passed
@sigoden deleted the feat branch on September 3, 2024 at 04:22