fix: Added sizes to the metadata #276
This allows for automatic metadata generation.
Co-authored-by: Niklas Muennighoff <n.muennighoff@gmail.com>
Excellent work, Kenneth! We end up with a lot of empty fields, but at least we document it clearly, and can handle it if we need the data 👍
mteb/abstasks/TaskMetadata.py
@@ -144,3 +146,6 @@ class TaskMetadata(BaseModel):

    text_creation: TEXT_CREATION_METHOD | None
    bibtex_citation: str | None

    n_samples: dict[str, int]
Maybe we should specify these as dict[SPLIT_TYPE, int]? This can be something as simple as a type alias, but it makes the field semantically clearer.
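A minimal sketch of what such an alias could look like, assuming split names are plain strings. SPLIT_TYPE, N_SAMPLES and total_samples are hypothetical names used for illustration only and are not taken from the mteb codebase:

```python
# Hypothetical alias: split names such as "test" or "validation" are plain strings.
SPLIT_TYPE = str

# Hypothetical alias for the n_samples field: number of samples per split.
N_SAMPLES = dict[SPLIT_TYPE, int]


def total_samples(n_samples: N_SAMPLES) -> int:
    """Sum the sample counts over all splits."""
    return sum(n_samples.values())
```

The alias does not change runtime behaviour; it only documents that the keys are split names rather than arbitrary strings.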
Might also be nice to provide a little utility that gets this info, and document the utility here.
It would be possible to check it using the eval_splits attribute.
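One way such a check might look, as a rough sketch only. It assumes eval_splits is a list of split names and n_samples maps split names to counts; missing_splits is a hypothetical helper, not an existing mteb function:

```python
def missing_splits(eval_splits: list[str], n_samples: dict[str, int]) -> list[str]:
    """Return the evaluation splits that have no entry in n_samples.

    Hypothetical helper: the argument shapes are assumptions based on the
    n_samples field in the diff and the eval_splits attribute mentioned here.
    """
    return [split for split in eval_splits if split not in n_samples]
```

An empty result would mean every evaluated split has a documented size.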
…nto add-size-to-meta
@MartinBernstorff it seems like we generally agree on this PR. I have left one comment unresolved as I am not entirely sure what you mean, but we can add it in a future PR.
This allows for automatic metadata table generation.