Refinement: setting batch_size for different models #212
Conversation
Will format the code shortly.
Please reformat your code to resolve the build failure: https://github.com/CambioML/uniflow-llm-based-pdf-extraction-text-cleaning-data-clustering/actions/runs/8164445774/job/22319716558?pr=212
Could you please also check all the notebooks to see whether they need corresponding changes, as well as the cookbook repo? Get @jojortz to review the cookbook repo.
Noted.
if not batch_size:
    batch_size = self._config.model_config.get(
        "num_thread", 1
    )  # pylint: disable=no-member
Please add a comment explaining why we are handling it this way, especially the logic we discussed regarding locally hosted models versus proprietary models.
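For illustration, here is a hedged sketch of the fallback the reviewer wants documented, pulled out of the class so it runs on its own. The function name resolve_batch_size and the plain dict standing in for self._config.model_config are assumptions, and the comment wording is a suggestion rather than the author's final text.

def resolve_batch_size(batch_size, model_config: dict) -> int:
    # Sketch only, not uniflow's actual code.
    if not batch_size:
        # Proprietary API models (OpenAI, Azure OpenAI, Google) do not support
        # real batch inference here, so their configs omit batch_size; fall
        # back to num_thread, which only groups inputs for the thread pool
        # executor ("fake batching"). Locally hosted models pass batch_size in.
        batch_size = model_config.get("num_thread", 1)
    return batch_size

print(resolve_batch_size(None, {"num_thread": 4}))  # proprietary model -> 4
print(resolve_batch_size(8, {"num_thread": 4}))     # explicit batch_size -> 8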
uniflow/op/model/model_config.py (outdated diff)
@@ -26,7 +26,7 @@ class GoogleModelConfig(ModelConfig):
     candidate_count: int = 1
     num_thread: int = 1
     # this is not real batch inference, but size to group for thread pool executor.
-    batch_size: int = 1
+    # batch_size: int = 1
You should just remove them instead of commenting them out, and do the same for all the ones below. Then, check again to make sure all batch_size fields are removed for the locally hosted model.
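For reference, a minimal sketch of what GoogleModelConfig could look like once batch_size is dropped entirely instead of commented out. ModelConfig lives in the same module in the real repo; the placeholder base class below (and its model_name default) is an assumption so the snippet stands alone.

from dataclasses import dataclass


@dataclass
class ModelConfig:
    # Placeholder for the real base class in uniflow/op/model/model_config.py.
    model_name: str = "gemini-pro"


@dataclass
class GoogleModelConfig(ModelConfig):
    candidate_count: int = 1
    # Not real batch inference: num_thread is only the group size handed to
    # the thread pool executor ("fake batching") for this API-based model.
    num_thread: int = 1


print(GoogleModelConfig())  # GoogleModelConfig(model_name='gemini-pro', candidate_count=1, num_thread=1)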
…y 2. If batch_size is not found, use num_thread instead
Since Google's, OpenAI's, and Azure OpenAI's models don't support batching locally, we will instead use num_thread to "fake batch" the data.