
Multi-GPU for more tasks #27

Closed
Muennighoff opened this issue Jul 23, 2022 · 2 comments

@Muennighoff
Contributor

It'd be great if we could figure out how to use multiple GPUs on tasks other than BEIR.
E.g. RedditClusteringP2P takes >20h for a 5.8B model with embeddings of 4096 dimensions.

@NouamaneTazi added the help wanted label Aug 3, 2022
@dylan-hum

I wrote a wrapper for the model like this; it seems to be working so far.

from sentence_transformers import SentenceTransformer


class MultiGPUModel:
    def __init__(self, sentence_transformer: SentenceTransformer):
        self.transformer = sentence_transformer
        # Spawn one worker process per available CUDA device.
        self.gpu_pool = self.transformer.start_multi_process_pool()

    def encode(self, sentences, **kwargs):
        # Chunk the sentences and distribute them across the worker pool.
        return self.transformer.encode_multi_process(sentences, self.gpu_pool, **kwargs)
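Note that start_multi_process_pool spawns a worker for every visible CUDA device by default. If you only want to use a subset of the GPUs, it also accepts a target_devices list; a minimal sketch (the device list here is just an example):

# Sketch: restrict the pool to two specific GPUs instead of all visible ones.
self.gpu_pool = self.transformer.start_multi_process_pool(
    target_devices=["cuda:0", "cuda:1"]
)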

Then you can pass the wrapper to MTEB like this:

from mteb import MTEB
from sentence_transformers import SentenceTransformer

# Define the sentence-transformers model name
model_name = "average_word_embeddings_komninos"

model = SentenceTransformer(model_name)
multigpu_model = MultiGPUModel(model)
evaluation = MTEB(tasks=["Banking77Classification"])
results = evaluation.run(multigpu_model, output_folder=f"results/{model_name}")
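One thing the wrapper doesn't do is clean up after itself: the worker processes keep running until they are stopped explicitly. A minimal sketch of the teardown, using sentence-transformers' stop_multi_process_pool once the run is finished:

# Sketch: release the worker processes after the evaluation completes.
SentenceTransformer.stop_multi_process_pool(multigpu_model.gpu_pool)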

@Muennighoff
Contributor Author

> I wrote a wrapper for the model like this; it seems to be working so far. […]

As highlighted in this wrapper, I think this can be done much more easily on the user side and thus does not need to be in MTEB; also see #233 :)

KennethEnevoldsen added a commit that referenced this issue Apr 11, 2024
These include 9 datasets (18 points) across 4 new tasks (8 bonus points) for Spanish.

Points are given to violenil as the contributor, and one point for reviewers. Points can be split up if needed.
KennethEnevoldsen added a commit that referenced this issue Apr 11, 2024
* docs: Added missing points for #214

Added 6x2 points for guenthermi for datasets and 1 point to Muennighoff for review.

I have not accounted for bonus points as I am not sure what was available at the time.

* docs: added point for #197

Added 2 points for rasdani and 2 bonus points for the first German retrieval task (I believe). Added one point for each of the reviewers.

* docs: added points for #116

This includes 6 points for 3 datasets to slvnwhrl, +2 for the first German clustering task. Also added points for reviews.

* Added points for #134 cmteb

This includes 29 datasets (38 points) and 6x2 bonus points (12 points) for the 6 task×language combinations that were not previously included.

All the points are attributed to @staoxiao, though we can split them if needed.

We also added points for review.

* docs: Added points for #137 (Polish)

This includes points for 12 datasets (24 points) across 4 tasks (8 bonus points). These points are given to rafalposwiata, plus one point for review.

* docs: Added points for #27 (Spanish)

These include 9 datasets (18 points) across 4 new tasks (8 bonus points) for Spanish.

Points are given to violenil as the contributor, and one point for reviewers. Points can be split up if needed.

* docs: Added points for #224

Added 2 points for the dataset. I could imagine that I might have missed some bonus points as well. Also added one point for review.

* docs: Added points for #210 (Korean)

This includes 3 datasets (6 points) across 1 new task (+2 bonus) for Korean. Also added 1 point for reviewers.

* Add contributor

---------

Co-authored-by: Niklas Muennighoff <n.muennighoff@gmail.com>