Add Chinese tasks (C-MTEB) #134

staoxiao · 2023-08-08T18:25:33Z

No description provided.

Muennighoff

Looks great, amazing job!

Sent you a few more comments via mail~

scripts/run_mteb_chinese.py

README.md

NouamaneTazi

Very clean PR, thanks a lot! 🙌
Left some small comments before merging this!

mteb/evaluation/evaluators/RerankingEvaluator.py

mteb/tasks/Retrieval/CMTEBRetrieval.py

README.md

Muennighoff · 2023-08-26T18:10:46Z

I think everything is resolved - merging this! Feel free to still make changes later 😇

NouamaneTazi

LGTM! Thanks 🚀

@staoxiao

This includes 29 datasets (38 points) and 6x2 bonus points (12 points) for the 6 task × language combinations that were not previously included. All the points are attributed to @staoxiao, though we can split them if needed. We also added points for review.

@staoxiao

* docs: Added missing points for #214
  Added 6x2 points for guenthermi for datasets and 1 point to Muennighoff for review. I have not accounted for bonus points as I am not sure what was available at the time.
* docs: added point for #197
  Added 2 points for rasdani and 2 bonus points for the first German retrieval (I believe). Added one point for each of the reviewers.
* docs: added points for #116
  This includes 6 points for 3 datasets to slvnwhrl, +2 for the first German clustering task. Also added points for reviews.
* Added points for #134 cmteb
  This includes 29 datasets (38 points) and 6x2 bonus points (12 points) for the 6 task × language combinations that were not previously included. All the points are attributed to @staoxiao, though we can split them if needed. We also added points for review.
* docs: Added points for #137 polish
  This includes points for 12 datasets (24) across 4 tasks (8). These points are given to rafalposwiata, plus one point for review.
* docs: Added points for #27 (spanish)
  These include 9 datasets (18 points) across 4 new tasks (8) for Spanish. Points are given to violenil as the contributor, and one point for reviewers. Points can be split up if needed.
* docs: Added points for #224
  Added 2 points for the dataset. I could imagine that I might have missed some bonus points as well. Also added one point for review.
* docs: Added points for #210 (korean)
  This includes 3 datasets (6 points) across 1 new task (+2 bonus) for Korean. Also added 1 point for reviewers.
* Add contributor

---------

Co-authored-by: Niklas Muennighoff <n.muennighoff@gmail.com>
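The arithmetic behind several of these entries can be sketched as follows. This is only an illustration: the rates of 2 points per dataset and 2 bonus points per new task are inferred from the Polish, Spanish, and Korean entries above, and the function name `contribution_points` is hypothetical, not part of the repository.

```python
# Assumed rates, inferred from the entries above (not an official rule).
POINTS_PER_DATASET = 2
BONUS_PER_NEW_TASK = 2


def contribution_points(datasets: int, new_tasks: int = 0) -> dict:
    """Return dataset points, bonus points, and their total for one contribution."""
    dataset_points = datasets * POINTS_PER_DATASET
    bonus_points = new_tasks * BONUS_PER_NEW_TASK
    return {
        "datasets": dataset_points,
        "bonus": bonus_points,
        "total": dataset_points + bonus_points,
    }


# Entries quoted in the commit message above.
polish = contribution_points(datasets=12, new_tasks=4)   # #137
spanish = contribution_points(datasets=9, new_tasks=4)   # #27
korean = contribution_points(datasets=3, new_tasks=1)    # #210

for name, points in [("polish", polish), ("spanish", spanish), ("korean", korean)]:
    print(name, points)
```

The C-MTEB entry (29 datasets, 38 points) does not follow the flat two-per-dataset rate, so this sketch reproduces only its 6x2 = 12 bonus points, not its dataset total.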

shitao added 2 commits August 9, 2023 02:19

add C_MTEB

c88cdf7

add C_MTEB

9f75aeb

Muennighoff approved these changes Aug 8, 2023

scripts/run_mteb_chinese.py

README.md (outdated)

shitao added 2 commits August 9, 2023 22:11

rename MMarcoReranking

3cd902e

rename MMarcoReranking

5289c16

NouamaneTazi requested changes Aug 12, 2023

mteb/evaluation/evaluators/RerankingEvaluator.py

mteb/tasks/Retrieval/CMTEBRetrieval.py (outdated)

README.md (outdated)

NouamaneTazi added 2 commits August 12, 2023 20:34

Update mteb/tasks/Retrieval/CMTEBRetrieval.py

a948616

Update README.md

4ffddff

Muennighoff mentioned this pull request Aug 21, 2023

Could I fine tune this model for Chinese datasets? Muennighoff/sgpt#41

Open

Allow custom encode functions

fcfc41b

Muennighoff requested a review from NouamaneTazi August 26, 2023 18:11

Merge branch 'main' into main

b825b71

NouamaneTazi approved these changes Aug 26, 2023

NouamaneTazi merged commit 071974a into embeddings-benchmark:main Aug 26, 2023

Muennighoff mentioned this pull request Apr 4, 2024

Adding French team contribution points #302

Merged

Muennighoff mentioned this pull request May 14, 2024

Reallocate CMTEB credits #711

Merged
