Releases: embeddings-benchmark/mteb
1.20.0 (2024-11-21)
Feature
- feat: add CUREv1 retrieval dataset (#1459) (1cc6c9e); usage sketch below
  - feat: add CUREv1 dataset
  - feat: add missing domains to medical tasks
  - feat: modify benchmark tasks
  - chore: benchmark naming
  Co-authored-by: nadshe <nadia.sheikh@clinia.com>
  Co-authored-by: olivierr42 <olivier.rousseau@clinia.com>
  Co-authored-by: Daniel Buades Marcos <daniel@buad.es>
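A minimal sketch of running the new CUREv1 tasks through mteb's standard API. The task name follows the PR title; the model below is only an illustrative choice, not one tied to this release:

```python
import mteb

# Load the CUREv1 retrieval tasks added in this release (name per #1459).
tasks = mteb.get_tasks(tasks=["CUREv1"])

# Any embedding model works here; this one is just an example.
model = mteb.get_model("sentence-transformers/all-MiniLM-L6-v2")

evaluation = mteb.MTEB(tasks=tasks)
evaluation.run(model, output_folder="results")
```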
Unknown

- Update tasks table (4408717)
1.19.10
1.19.9
1.19.8 (2024-11-15)
1.19.7
1.19.6
1.19.5 (2024-11-14)
Fix
- fix: update task metadata to allow for null (#1448) (04ac3f2); see the sketch after this list
- fix: Count unique texts, data leaks in calculate metrics (#1438) (dd5d226)
  - add more stat
  - add more stat
  - update statistics
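As a rough illustration of the "allow for null" fix: permitting null in a pydantic-style metadata model means declaring fields Optional so unknown values can stay None. The class and field names below are illustrative stand-ins, not the exact ones touched by #1448:

```python
from typing import Optional
from pydantic import BaseModel

class TaskMetadataSketch(BaseModel):
    """Illustrative stand-in for a task metadata model; the real field set differs."""
    name: str
    license: Optional[str] = None        # an unknown license may stay null
    domains: Optional[list[str]] = None  # instead of requiring a placeholder

meta = TaskMetadataSketch(name="ExampleTask")  # validates with nulls left unset
```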
Unknown
- Update tasks table (f6a49fe)
- Leaderboard: Fixed code benchmarks (#1441) (3a1a470); formatting sketch below
  - fixed code benchmarks
  - fix: Made n_parameters formatting smarter and more robust
  - fix: changed jina-embeddings-v3 number of parameters from 572K to 572M
  - fix: Fixed use_instuctions typo in model overview
  - fix: Fixed sentence-transformer compatibility switch
  - Ran linting
  - Added all languages, tasks, types and domains to options
  - Removed resetting options when a new benchmark is selected
  - All results now get displayed, but models that haven't been run on everything get NaN values in the table
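To make the n_parameters item concrete, a hedged sketch of what "smarter" count formatting can look like. This is not the leaderboard's actual helper, just an illustration of rendering raw counts the way the corrected jina-embeddings-v3 entry reads (572M rather than 572K):

```python
from typing import Optional

def format_n_parameters(n: Optional[int]) -> str:
    """Render a raw parameter count compactly, e.g. 572_000_000 -> '572M'."""
    if n is None:
        return "unknown"
    for threshold, suffix in ((10**9, "B"), (10**6, "M"), (10**3, "K")):
        if n >= threshold:
            return f"{n / threshold:.0f}{suffix}"
    return str(n)

assert format_n_parameters(572_000_000) == "572M"
```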
- Leaderboard 2.0: added performance x n_parameters plot + more benchmark info (#1437) (76c2112)
  - Added elementary speed/performance plot
  - Refactored table formatting code
  - Bumped Gradio version
  - Added more general info to benchmark description markdown block
  - Adjusted margin and range on plot
  - Made hover information easier to read on plot
  - Made range scaling dynamic in plot
  - Moved citation next to benchmark description
  - Made titles in benchmark info bold
1.19.4 (2024-11-11)
Fix
- fix: Add missing benchmarks in benchmarks.py (#1431)
- fix: Add Korean AutoRAGRetrieval (#1388) (f79d9ba); run sketch below
  - feat: add AutoRAG Korean embedding retrieval benchmark
  - fix: run linters (🧹 `ruff format .`: 716 files left unchanged; `ruff check . --fix`: all checks passed)
  - fix: add metadata for AutoRAGRetrieval
  - change link for markers_bm
  - add AutoRAGRetrieval to init.py and update metadata
  - add precise metadata
  - update metadata: description and license
  - delete descriptive_stats in AutoRAGRetrieval.py and run calculate_matadata_metrics.py
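A minimal sketch of running the new Korean retrieval task, with the task name taken from the PR title; the model is an illustrative multilingual pick, not one prescribed by the benchmark:

```python
import mteb

tasks = mteb.get_tasks(tasks=["AutoRAGRetrieval"])
model = mteb.get_model("intfloat/multilingual-e5-small")  # illustrative choice
mteb.MTEB(tasks=tasks).run(model, output_folder="results")
```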
- fix: make samples_per_label a task attribute (#1419) (7f1a1d3); see the sketch below
  - make samples_per_label a task attr
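What "a task attribute" buys, sketched under the assumption that the attribute is exposed on task objects as in mteb's classification tasks; the default value and override shown are illustrative:

```python
import mteb

# samples_per_label now lives on the task itself, so it can be inspected
# or overridden per task instead of being fixed inside the evaluator.
task = mteb.get_tasks(tasks=["Banking77Classification"])[0]
print(task.samples_per_label)  # classification tasks default to a small count, e.g. 8
task.samples_per_label = 16    # hypothetical per-experiment override
```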
Unknown
- Update tasks table (d069aba)