Describe the feature or improvement you're requesting
It would be nice to be able to score multiple sample completions using ModelBasedClassify. Even if n>1 is passed into a completion function and multiple samples are returned, only the first is graded because of this line:
https://github.com/openai/evals/blob/main/evals/elsuite/utils.py#L193
Additional context
I would like to be able to raise the temperature, ask a model to produce N completions, and have each completion graded separately using a rubric. This appears to work fine for non-model-based scoring.
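To make the request concrete, here is a minimal sketch of the desired behavior: every returned sample is passed through the grader, instead of only the first one. It does not use the evals codebase. `grade_each_completion`, `summarize_grades`, and `toy_grader` are hypothetical names for illustration, and `toy_grader` is a stand-in for the model-based rubric call.

```python
from collections import Counter
from typing import Callable, Dict, List, Sequence


def grade_each_completion(
    completions: Sequence[str],
    grade_fn: Callable[[str], str],
) -> List[str]:
    """Grade every sampled completion separately (hypothetical helper)."""
    return [grade_fn(text) for text in completions]


def summarize_grades(grades: Sequence[str]) -> Dict[str, float]:
    """Return the fraction of samples that received each grade label."""
    total = len(grades)
    if total == 0:
        return {}
    counts = Counter(grades)
    return {label: count / total for label, count in counts.items()}


def toy_grader(text: str) -> str:
    """Stand-in for a model-based rubric; a real grader would call a model."""
    return "Y" if "paris" in text.lower() else "N"


# Pretend these are the N samples returned for one prompt with n=3.
samples = ["Paris", "I think it is Lyon.", "The capital is Paris."]
grades = grade_each_completion(samples, toy_grader)
fractions = summarize_grades(grades)
```

Keeping the per-sample grades as a list, and aggregating only afterwards, would let an eval report either individual results or a pass rate across the N samples.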