01 Apr 12:08

KennethEnevoldsen

a4f77de

1.4.0

1.4.0 (2024-04-01)

Feature

feat: Added windows support by replacing pytrec-eval with pytrec-eval-terrier (#292)
ci: Added windows to test suite
feat: Changed to pytrec-eval-terrier to add support for windows installs (fc0e105)
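
To see which of the two distributions named above is present in a given environment, a minimal sketch using only the standard library could look like this. The helper name `installed_version` is hypothetical and not part of MTEB; the check only reads installed package metadata and does not import either package.

```python
from importlib import metadata


def installed_version(dist_name: str):
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


# Both distribution names are taken from the release notes above.
for name in ("pytrec-eval", "pytrec-eval-terrier"):
    print(f"{name}: {installed_version(name) or 'not installed'}")
```

After upgrading on Windows, one would expect only `pytrec-eval-terrier` to report a version, though this depends on how the environment was set up.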

01 Apr 08:51

KennethEnevoldsen

1.3.4

39d005b

1.3.4

1.3.4 (2024-04-01)

Fix

fix: Update MindSmallReranking.py to have the correct hf reference (#303) (102e24e)

31 Mar 15:23

KennethEnevoldsen

1.3.3

a8b3584

1.3.3

1.3.3 (2024-03-31)

Documentation

docs: Added information related to the automatic release (#290)
docs: added information related to the automatic release
docs: removed test-parallel from docs
docs: minor additions to contributing guidelines
ci: removed changelog

As it is already present in the git releases

Apply suggestions from code review

Co-authored-by: Niklas Muennighoff <n.muennighoff@gmail.com> (6821d23)

Fix

fix: fixed bug introduced in TatoebaBitextMining causing it to use a different dataset (#297) (d0549a3)
fix: Fixed misspecified rev. id for datasets (#298)
fix: fixed wrong rev. id for ToxicConversationsClassification
fix: fixed wrong rev. id with RedditClusteringP2P (e1ae0d3)

29 Mar 13:09

KennethEnevoldsen

1.3.2

3746f6e

1.3.2

1.3.2 (2024-03-29)

Documentation

docs: Update links in README.md (#296) (76056b5)

Fix

fix: Added tasks from SEB (#287)
Added tasks from SEB
docs: fix link
fix: ran linting
fix typing for 3.8
fixed annotation for v3.8 (39cff49)

26 Mar 20:07

KennethEnevoldsen

1.3.1

bf69588

1.3.1

1.3.1 (2024-03-26)

Fix

fix: updated version in transition to semantic release ci (238ab82)

26 Mar 12:40

KennethEnevoldsen

v0.10.0

f2587fd

v0.10.0

v0.10.0 (2024-03-26)

Ci

ci: renamed test job and workflow (#282)

ci: Added tests (6675bb8)

Documentation

docs: typos in readme (#268) (aa9234c)
docs: add dataset schemas (#255)
docs: update AbsTaskClassification.py document schema for classification task
update AbsTaskBitextMining.py
update BornholmskBitextMining.py
update AbsTaskClustering.py and BlurbsClusteringP2P.py
update 8 files
update 9 files
update AbsTaskReranking.py
update BlurbsClusteringP2P.py
update CMTEBPairClassification.py
update GerDaLIRRetrieval.py
update 7 files
update AbsTaskBitextMining.py
update AbsTaskClassification.py (c3ce1ac)
docs: Add development installation instructions (#246)
docs: Add development installation instructions
removed unused requirements file

I don't believe this is necessary with the setup.py specifying the same dependencies

docs: Updated make file with new dependencies
ci: Update ci to use make commands

This ensures that the user runs exactly what the CI expects

ci: Avoid specifying tests folder as it causes issues with tests
ci: removed unnecessary args for test ci
Added dev install (0048878)

Feature

feat: update revision id of wikicitiesclustering task (fb90c02)

Fix

fix: dead link in readme (ecbb776)
fix: Added sizes to the metadata (#276)
restructuring the readme
added mmteb
removed unnecessary method
Added docstring to metadata
Updated outdated examples
formatting documents
fix: Updated form to be parsed correctly
fix: Added sizes to the metadata

this allows for automatic metadata generation

Updated based on feedback
Apply suggestions from code review

Co-authored-by: Niklas Muennighoff <n.muennighoff@gmail.com>

updated based on feedback
Added suggestion from review
added correction based on review
reformatted empty fields to None

Co-authored-by: Niklas Muennighoff <n.muennighoff@gmail.com> (cd4a012)

fix: remove debugging print statement (d292d93)
fix: pass parallel_retrieval kwarg to use DenseRetrievalParallelExactSearch (19b8f66)
fix: msmarco-v2 uses dev.tsv, not dev1.tsv (6908d21)
fix: add missing task-langs attribute (#152) (bc22909)

Refactor

refactor: add metadata basemodel (#260)
refactor: rename description to metadata dict
refactor: add TaskMetadata and first example
update 9 files
update TaskMetadata.py
update TaskMetadata.py
update TaskMetadata.py
update LICENSE, TaskMetadata.py and requirements.dev.txt
update 151 files
update 150 files
update 43 files and delete 1 file
update 106 files
update 45 files
update 6 files
update 14 files
Added model results to repo and updated CLI to create consistent folder structure. (#254)
Added model results to repo and updated CLI to create consistent folder structure.
ci: updated ci to use make install
Added missing pytest dependencies
Update README.md

Co-authored-by: Niklas Muennighoff <n.muennighoff@gmail.com>

Restructuring the readme (#262)
restructuring the readme
removed double specification of versions and moved all setup to pyproject.toml
correctly use flat-layout for the package
build(deps): update TaskMetadata.py and pyproject.toml
update 221 files
build(deps): update pyproject.toml
build(deps): update pyproject.toml
build(deps): update pyproject.toml

Co-authored-by: Kenneth Enevoldsen <kennethcenevoldsen@gmail.com>
Co-authored-by: Niklas Muennighoff <n.muennighoff@gmail.com> (dd5d617)

Unknown

Ci-fix (#289)
added release pipeline
v1.3.0
ci: moved release to the correct folder (7f56c1a)
v1.3.0
added release pipeline
v1.3.0 (5e4d10e)
tests: speed up tests (#283)

update Makefile and test_all_abstasks.py (2155bf6)

update TaskMetadata.py (#281) (acfd7d4)
Merge branch 'main' of https://github.com/embeddings-benchmark/mteb (c9d1a03)
Enable ruff ci (#279)
restructuring the readme
added mmteb
removed unnecessary method
Added docstring to metadata
Updated outdated examples
formatting documents
fix: Updated form to be parsed correctly
fix: Added sizes to the metadata

this allows for automatic metadata generation

Updated based on feedback
Apply suggestions from code review

Co-authored-by: Niklas Muennighoff <n.muennighoff@gmail.com>

updated based on feedback
Added suggestion from review
added correction based on review
reformatted empty fields to None
CI: Enable linter

Co-authored-by: Niklas Muennighoff <n.muennighoff@gmail.com> (a16eb07)

Added MMTEB (#275)
restructuring the readme
added mmteb
removed unnecessary method
Added docstring to metadata
Updated outdated examples
formatting documents
fix: Updated form to be parsed correctly
Updated based on feedback
Apply suggestions from code review

Co-authored-by: Niklas Muennighoff <n.muennighoff@gmail.com>

updated based on feedback
Added suggestion from review
added correction based on review

Co-authored-by: Niklas Muennighoff <n.muennighoff@gmail.com> (c0dc49a)

dev: add ruff as suggested extension (#274) (b08913f)
dev: add isort (#271)
dev: add isort
dev: add isort (845099d)
dev: run tests on pull request towards any branch (13f759a)
Merge branch 'main' of https://github.com/embeddings-benchmark/mteb (b42abe4)
replaced linter with ruff (#265)
restructuring the readme
removed double specification of versions and moved all setup to pyproject.toml
correctly use flat-layout for the package
replaced linter with ruff
rerun tests
ci: Added in newer workflow

some of them are disabled as they require other issues to be solved

Update Makefile

Co-authored-by: Niklas Muennighoff <n.muennighoff@gmail.com>

Co-authored-by: Niklas Muennighoff <n.muennighoff@gmail.com> (023e881)

Restructuring the readme (#262)
restructuring the readme
removed double specification of versions and moved all setup to pyproject.toml
correctly use flat-layout for the package (769157b)
restructuring the readme (364be7f)
Added model results to repo and updated CLI to create consistent folder structure. (#254)
Added model results to repo and updated CLI to create consistent folder structure.
ci: updated ci to use make install
Added missing pytest dependencies
Update README.md

Co-authored-by: Niklas Muennighoff <n.muennighoff@gmail.com>

Co-authored-by: Niklas Muennighoff <n.muennighoff@gmail.com> (8a758bc)

dev: add workspace defaults in VSCode (#253)
dev: add black as default formatter in vscode
Update .vscode/settings.json

Co-authored-by: Kenneth Enevoldsen <kennethcenevoldsen@gmail.com> (30e5b9e)

Add Danish Discourse dataset (#247)
misc.
update dd...

06 Mar 19:20

Muennighoff

1.2.0

9e9dca8

1.2.0 Spanish & French, Simpler Retrieval

Updates

🇪🇸 New Spanish datasets thanks to @violenil & team 🚀
🇫🇷 New French datasets thanks to @GabrielSequeira & team + there's a new French Overall leaderboard tab thanks to their massive benchmarking 🥇
Retrieval has become much simpler and is now standardized to align with other tasks. You can inspect all Retrieval datasets on the hub, adding new Retrieval datasets is much easier, and there are fewer dependencies, which makes installing MTEB easier 😊 While this change is backward-compatible, it represents a significant change in how MTEB works, so we decided to increment the minor version for this release (1.1.2 -> 1.2.0).

What's Changed

Add tasks for Spanish Embedding Evaluation by @violenil in #227
Extend MTEB with French datasets by @GabrielSequeira in #218
Remove HAGRID from french benchmark by @MathieuCiancone in #235
Fixed missing revision error on Norwegian Bitext Mining by @x-tabdeveloping in #221
Simplify retrieval by @Muennighoff in #233

New Contributors

@GabrielSequeira made their first contribution in #218
@MathieuCiancone made their first contribution in #235

Full Changelog: 1.1.2...1.2.0

Contributors

x-tabdeveloping, GabrielSequeira, and 3 other contributors

16 Feb 07:56

Muennighoff

1.1.2

def3c91

1.1.2 New English, German, Korean datasets & bug fixes

What's Changed

fix RerankingEvaluator's compute_metrics_individual by @novak2000 in #165
Add Long Document Evaluation Datasets by @violenil in #166
Fix medrxiv mislinkage by @zhimin-z in #187
Fix Dalaj linkage by @zhimin-z in #195
Fix SummEval linkage by @zhimin-z in #194
Fix SweFAQ linkage by @zhimin-z in #193
Added Norwegian Bokmål-Nynorsk bitext mining task by @x-tabdeveloping in #202
Add support for cache results by @hongjin-su in #207
Retrieval benchmark based on GermanQuAD by @rasdani in #197
Refer to other works by @Muennighoff in #212
Fix selection of DRES/DRPES by @Markus28 in #179
Add tasks for German Embedding Evaluation by @guenthermi in #214
only save top-k by @hongjin-su in #209
Add MultiLongDocRetrieval task to MTEB. by @hanhainebula in #224
Add Korean Text Search Tasks to MTEB by @taeminlee in #210
Update BeIRPLTask.py by @kwojtasi in #225
Add task list by @Muennighoff in #228

New Contributors

@novak2000 made their first contribution in #165
@violenil made their first contribution in #166
@zhimin-z made their first contribution in #187
@x-tabdeveloping made their first contribution in #202
@hongjin-su made their first contribution in #207
@rasdani made their first contribution in #197
@Markus28 made their first contribution in #179
@hanhainebula made their first contribution in #224
@taeminlee made their first contribution in #210

Full Changelog: 1.1.1...1.1.2

Contributors

taeminlee, guenthermi, and 10 other contributors

20 Sep 15:31

Muennighoff

1.1.1

d3aaf4f

1.1.1 C-MTEB, PL-MTEB, Multi-GPU

Updates

🇨🇳 C-MTEB was released and integrated thanks to @staoxiao. Check out the paper here. Together with C-MTEB, the team also released other great embedding resources such as new SoTA models on MTEB & C-MTEB called BGE, as well as datasets and source code 🚀
🇵🇱 PL-MTEB & BEIR-PL were released and integrated thanks to @rafalposwiata & @kwojtasi. Check out the new leaderboard tab for PL-MTEB: https://huggingface.co/spaces/mteb/leaderboard. Some BEIR-PL datasets are still missing and will be added soon cc @kwojtasi 😇
💻 Clarifications on multi-GPU: Native multi-GPU support for Retrieval thanks to @NouamaneTazi. We also added a clarification in the README on how any task can be run in a multi-GPU setup without requiring any changes in MTEB. MTEB abstracts the way the encodings are produced. Whether users use multiple or a single GPU in the encode function is completely flexible 😊
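
The point about the encode function can be sketched as follows. This is an illustrative toy, not MTEB code: the class name `ShardedEncoder`, the `num_workers` parameter, and the placeholder embedding are all assumptions, and worker threads stand in for GPUs. The only thing taken from the notes is that the benchmark calls an `encode` method on a list of strings and does not care how the work is distributed inside it.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import List


class ShardedEncoder:
    """Toy model exposing encode(); the sharding stands in for multi-GPU work."""

    def __init__(self, num_workers: int = 2):
        self.num_workers = num_workers  # hypothetical: one worker per device

    def _encode_shard(self, sentences: List[str]) -> List[List[float]]:
        # Placeholder embedding: [character count, word count].
        return [[float(len(s)), float(len(s.split()))] for s in sentences]

    def encode(self, sentences: List[str], **kwargs) -> List[List[float]]:
        # Split the input into contiguous shards, at most one per worker.
        n = max(1, self.num_workers)
        size = max(1, -(-len(sentences) // n))  # ceiling division
        shards = [sentences[i:i + size] for i in range(0, len(sentences), size)]
        with ThreadPoolExecutor(max_workers=n) as pool:
            encoded = list(pool.map(self._encode_shard, shards))  # keeps shard order
        # Flatten back into one embedding per input sentence, in input order.
        return [vec for shard in encoded for vec in shard]


model = ShardedEncoder(num_workers=3)
embeddings = model.encode(["a b", "hello", "x y z", "q"])
print(len(embeddings))
```

Because the caller only sees `encode`, swapping the thread pool for real per-device workers would not require any change on the evaluation side.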

What's Changed

Code cleanup by @NouamaneTazi in #131
Replaced prints with logging by @KennethEnevoldsen in #133
Add BEIR-PL datasets to MTEB by @kwojtasi in #121
Add Polish tasks (PL-MTEB) by @rafalposwiata in #137
Add Chinese tasks (C-MTEB) by @staoxiao in #134
Support Multi-node Evaluation by @NouamaneTazi in #132
Add multi gpu eval to readme by @NouamaneTazi in #140
Default to false by @Muennighoff in #143
Rely on standard encode kwargs only by @Muennighoff in #145
Fix splits by @Muennighoff in #149
fix: add missing task-langs attribute by @guenthermi in #152
Clarify multi-gpu usage by @Muennighoff in #153
Simplify code snippets by @Muennighoff in #154
fix: msmarco-v2 uses dev.tsv, not dev1.tsv by @garrett361 in #155
Fix eval langs by @Muennighoff in #157

New Contributors

@kwojtasi made their first contribution in #121
@rafalposwiata made their first contribution in #137
@staoxiao made their first contribution in #134
@guenthermi made their first contribution in #152
@garrett361 made their first contribution in #155

Full Changelog: 1.1.0...1.1.1

Contributors

guenthermi, rafalposwiata, and 6 other contributors

31 Jul 09:21

Muennighoff

1.1.0

80d0344

1.1.0 New languages, default cluster setting & default error raising

Updates

🇩🇰🇳🇴🇸🇪 New Danish, Norwegian and Swedish BitextMining & Classification tasks AngryTweetsClassification, BornholmBitextMining, DKHateClassification, DalajClassification, LccSentimentClassification, NordicLangClassification, NorwegianParliament, ScalaDaClassification, ScalaNbClassification & ScalaSvClassification thanks to @KennethEnevoldsen
🇩🇪 New German Clustering tasks BlurbsClusteringP2P, BlurbsClusteringS2S, TenKGnadClusteringP2P & TenKGnadClusteringS2S thanks to @slvnwhrl
❉ Change in cluster initialization from 3 to the sklearn recommended default of auto. This leads to tiny changes in clustering scores going forward and hence makes this release not backwards-compatible. See here for a discussion. Thanks to @stephantul for this change.
❌ Errors are now directly raised by default. This behavior can be deactivated by passing a kwarg at evaluation. Previously, they were just written to a .txt file. Thanks to @KennethEnevoldsen for introducing this change.
💻 Code cleanups thanks to @stephantul @izhx @permutohedra
📈 The leaderboard has also improved a lot with new task-based rankings, better caching and many new models
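
The change in error handling described above can be pictured with a small sketch. It is not the MTEB implementation: the function `run_tasks`, the keyword name `raise_error`, and the log file name are assumptions chosen for illustration. It shows the two behaviors from the notes: raising immediately by default, or, when the keyword is turned off, writing the error to a text file and continuing.

```python
from pathlib import Path
from typing import Callable, Dict


def run_tasks(
    tasks: Dict[str, Callable[[], float]],
    raise_error: bool = True,            # hypothetical kwarg name
    error_log: str = "error_logs.txt",   # hypothetical log location
) -> Dict[str, float]:
    """Run each task; either re-raise failures or log them and continue."""
    results: Dict[str, float] = {}
    for name, task in tasks.items():
        try:
            results[name] = task()
        except Exception as exc:
            if raise_error:
                raise  # new default: surface the failure immediately
            # Old behavior: record the failure in a text file and move on.
            with Path(error_log).open("a", encoding="utf-8") as fh:
                fh.write(f"{name}: {exc!r}\n")
    return results
```

Raising by default makes a broken task visible right away, while the opt-out keeps long benchmark runs from stopping on a single failing dataset.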

What's Changed

Fix kNN Multiclass by @Muennighoff in #92
Fix SemmEval description by @ahoho in #97
Make inputs always List[str] & call in one by @Muennighoff in #99
Fix clustering warning by @stephantul in #104
Fix the extending of language pairs in MTEB by @izhx in #106
Add @property annotation to description method of AbsTask by @permutohedra in #111
Add German clustering datasets by @slvnwhrl in #116
Added support for Scandinavian Languages by @KennethEnevoldsen in #124
Bump version ID and update PyPI by @KennethEnevoldsen in #128

New Contributors

@ahoho made their first contribution in #97
@stephantul made their first contribution in #104
@izhx made their first contribution in #106
@permutohedra made their first contribution in #111
@slvnwhrl made their first contribution in #116
@KennethEnevoldsen made their first contribution in #124

Full Changelog: 1.0.1...1.1.0

Contributors

permutohedra, stephantul, and 6 other contributors

Releases: embeddings-benchmark/mteb