Add BEIR-PL datasets to MTEB #121
Amazing work!
I would standardize the names so they all follow one pattern, e.g. DBPediaPL. Currently, it seems there's HotpotQAPL, but also DBPedia-pl and ArguAna-PL.
Do you want to merge this first and then later add the missing datasets (CQA, Touche, etc.) in a separate PR?
I have updated the names. We can merge it, and I will open another PR for all the CQADupstack and Touche datasets.
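For illustration, the naming convention discussed above (a language tag appended without a separator, as in DBPediaPL) can be sketched as a small helper. The function name `standardize_task_name` and its `lang_suffix` parameter are hypothetical and not part of MTEB; this only shows the convention, not how the PR implemented the rename.

```python
import re


def standardize_task_name(name: str, lang_suffix: str = "PL") -> str:
    """Normalize a trailing language tag, e.g. 'DBPedia-pl' -> 'DBPediaPL'.

    Hypothetical helper for illustration only; it is not an MTEB function.
    """
    # Match an optional '-' or '_' separator followed by the suffix,
    # in any letter case, at the very end of the name.
    pattern = re.compile(rf"[-_]?{re.escape(lang_suffix)}$", re.IGNORECASE)
    base = pattern.sub("", name)
    # Re-attach the suffix in upper case with no separator.
    return f"{base}{lang_suffix.upper()}"


# The three names mentioned in the review comment.
names = ["HotpotQAPL", "DBPedia-pl", "ArguAna-PL"]
print([standardize_task_name(n) for n in names])
```

Note that a name with no language tag at all simply gets the suffix appended, so the helper is only meant for names that already belong to the Polish set.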
Looks good to me!
Can you run all tasks for at least 1 model and provide the result files here? I will then update the leaderboard to have a tab for BEIR-PL
Hi, I have evaluated "distiluse-base-multilingual-cased-v2" from SentenceTransformers. Attaching the results. I also made some minor changes that were required to run the evaluation.
from .BeIRTask import *
from .CrosslingualTask import *
from .MultilingualTask import *
from .BeIRPLTask import *
Suggested change:
- from .BeIRTask import *
- from .CrosslingualTask import *
- from .MultilingualTask import *
- from .BeIRPLTask import *
+ from .BeIRPLTask import *
+ from .BeIRTask import *
+ from .CrosslingualTask import *
+ from .MultilingualTask import *
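The reordering in the suggestion above looks like a plain alphabetical sort of the import lines. A minimal sketch to check that, assuming the rule is ordinary case-sensitive string ordering (the suggestion itself does not state the rule):

```python
# The four import lines in the order they appear in the PR diff.
imports = [
    "from .BeIRTask import *",
    "from .CrosslingualTask import *",
    "from .MultilingualTask import *",
    "from .BeIRPLTask import *",
]

# Plain string sort: 'BeIRPLTask' precedes 'BeIRTask' because 'P' < 'T'.
sorted_imports = sorted(imports)

for line in sorted_imports:
    print(line)
```

Under that assumption the sorted list matches the order proposed in the suggestion.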
Looks great, amazing work!
I've added the results in a new leaderboard tab for Polish under Retrieval: https://huggingface.co/spaces/mteb/leaderboard
Feel free to add more models by either sending the result files or adding the results to the model card of the models.
Will merge this if fine with you!
Add tasks and update the README for the datasets in BEIR-PL (the BEIR benchmark in Polish).