
Fix selection of DRES/DRPES #179

Merged

Conversation

Markus28
Contributor

This is a small fix for #177. Previously, DRPES was selected based on the presence of the RANK environment variable, which caused issues for us during training with PyTorch Lightning.

In this PR, we allow passing a keyword argument parallel_retrieval to MTEB.run, which is then processed by AbsRetrieval.evaluate and BeIRTask.load_data to toggle the selection of DRES/DRPES. As a consequence, we also need to pass all kwargs of MTEB.run through to task.load_data; I am not sure whether this could cause problems in other parts of MTEB.
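A minimal sketch of what the change amounts to. The helper functions below are illustrative, not MTEB's actual code; only the parallel_retrieval keyword, the RANK environment variable, and the DRES/DRPES class names come from this PR:

```python
import os

def select_search_class(parallel_retrieval: bool = False, **kwargs) -> str:
    """New behaviour (sketch): the caller's keyword argument, not the
    environment, decides between DRES and DRPES."""
    return "DRPES" if parallel_retrieval else "DRES"

def select_search_class_old() -> str:
    """Old behaviour (sketch) that caused #177: the mere presence of RANK,
    which launchers like PyTorch Lightning set, silently switched to DRPES."""
    return "DRPES" if "RANK" in os.environ else "DRES"
```

With the old logic, any process launched under a distributed runner got DRPES whether or not parallel retrieval was wanted; with the new logic the default is DRES unless the user explicitly opts in.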

@Muennighoff
Contributor

I think we can even remove DRPES entirely. Do you ever use it? I'm not sure people are actually using it, and parallel retrieval is much easier to implement on the model side (i.e. in the user's encode function).
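To illustrate the "model side" suggestion: the encode callable that MTEB invokes can itself shard a batch across workers, so the benchmark never needs a parallel search class. This is a hypothetical sketch (the ParallelEncoder class and its names are not from MTEB); with real GPUs each worker would pin a different device rather than share threads:

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List

class ParallelEncoder:
    """Wraps any single-device encode function and fans batches out
    across workers, hiding all parallelism from the caller."""

    def __init__(self, base_encode: Callable[[List[str]], list], num_workers: int = 2):
        self.base_encode = base_encode
        self.num_workers = num_workers

    def encode(self, sentences: List[str], **kwargs) -> list:
        # Deal sentences round-robin into one shard per worker.
        shards = [sentences[i::self.num_workers] for i in range(self.num_workers)]
        with ThreadPoolExecutor(self.num_workers) as pool:
            shard_outputs = list(pool.map(self.base_encode, shards))
        # Re-interleave so outputs line up with the original order.
        out = [None] * len(sentences)
        for w, shard_out in enumerate(shard_outputs):
            for j, emb in enumerate(shard_out):
                out[w + j * self.num_workers] = emb
        return out
```

Because the wrapper preserves input order, it can stand in for a plain encode function anywhere MTEB expects one.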

@Markus28
Contributor Author

Markus28 commented Dec 6, 2023

Our team has indeed not used DRPES, since we decided to run different tasks on individual GPUs. However, if I understand the idea of DRPES correctly, I can imagine it making sense in some cases. Embedding on the model side and retrieval on the MTEB side are mostly separate, so my understanding is that DRPES could be used in combination with parallelism on the model side, leading to some additional speedups (?).

@jingedawang

jingedawang commented Jan 24, 2024

I encountered the same problem when using a single GPU. This solution looks good. Why has it not been merged yet?

@Muennighoff Muennighoff merged commit d292d93 into embeddings-benchmark:main Jan 24, 2024
3 checks passed
@Muennighoff
Contributor

Merged, sorry! I was hoping @NouamaneTazi would take a look, since he wrote the code this affects.
We will refactor the BEIR integration soon to make it more usable!
