Question about the reranker model: do I need to retrain the xgboost reranker for a different dataset? #10

Open
forestemperor opened this issue May 31, 2024 · 1 comment

@forestemperor

As the title says.

@forestemperor forestemperor added the question Further information is requested label May 31, 2024
@jotyy jotyy assigned jotyy and zhiheng-huang and unassigned jotyy May 31, 2024
@zhiheng-huang
Contributor

zhiheng-huang commented May 31, 2024

  1. In https://retriever.denser.ai/docs/experiments/mteb_retrieval, we stated: "For each dataset in MTEB, we trained an xgboost models on the training dataset and tested on the test dataset." So yes, you need to use a different re-ranker model for each dataset to replicate the results for the 15 datasets reported at that URL.
  2. As MSMARCO is a large dataset, you can try it first to see whether it fits your use case and data.
  3. We may later introduce a global model trained on all datasets combined.
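
To make the selection logic in this comment concrete, here is a minimal sketch of picking a per-dataset reranker with MSMARCO as the fallback. Everything in it is an assumption for illustration: the registry, the file paths, and the `pick_reranker_model` helper are hypothetical and are not part of the project's API or its shipped files.

```python
from typing import Dict

# Hypothetical registry: dataset name -> path of an xgboost reranker trained on
# that dataset's training split. The paths are placeholders, not real files.
RERANKER_MODELS: Dict[str, str] = {
    "msmarco": "models/msmarco_xgb.json",
    "nfcorpus": "models/nfcorpus_xgb.json",
}

# MSMARCO is the suggested first thing to try when no dataset-specific model exists.
FALLBACK_DATASET = "msmarco"


def pick_reranker_model(dataset: str, registry: Dict[str, str] = RERANKER_MODELS) -> str:
    """Return the model path for `dataset`, falling back to the MSMARCO entry."""
    key = dataset.strip().lower()
    return registry.get(key, registry[FALLBACK_DATASET])


if __name__ == "__main__":
    print(pick_reranker_model("NFCorpus"))           # dataset with its own model
    print(pick_reranker_model("my_private_corpus"))  # unknown dataset, uses fallback
```

If the fallback model does not rank your own data well, the per-dataset approach described above would mean training a new model on your data and adding it to the registry.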
