From f026b871e0e581743fcb09d1eb309e9698767a8d Mon Sep 17 00:00:00 2001
From: Jimmy Lin
Date: Wed, 8 Sep 2021 16:08:48 -0400
Subject: [PATCH] Add BPR repro entry and doc tweaks (#753)

---
 README.md               | 37 +++++++++++++++++++------------------
 docs/experiments-bpr.md | 13 ++++++++-----
 2 files changed, 27 insertions(+), 23 deletions(-)

diff --git a/README.md b/README.md
index 4c92fa624..225cc43cd 100644
--- a/README.md
+++ b/README.md
@@ -385,27 +385,28 @@ With Pyserini, it's easy to [reproduce](docs/reproducibility.md) runs on a numbe
 
 ### Sparse Retrieval
 
-+ [Reproducing runs directly from the Python package](docs/pypi-reproduction.md)
-+ [Reproducing Robust04 baselines for ad hoc retrieval](docs/experiments-robust04.md)
-+ [Reproducing the BM25 baseline for MS MARCO (V1) Passage Ranking](docs/experiments-msmarco-passage.md)
-+ [Reproducing the BM25 baseline for MS MARCO (V1) Document Ranking](docs/experiments-msmarco-doc.md)
-+ [Reproducing the multi-field BM25 baseline for MS MARCO (V1) Document Ranking from Elasticsearch](docs/experiments-elastic.md)
-+ [Reproducing BM25 baselines on the MS MARCO (V2) Collections](docs/experiments-msmarco-v2.md)
-+ [Reproducing DeepImpact experiments for MS MARCO (V1) Passage Ranking](docs/experiments-deepimpact.md)
-+ [Reproducing uniCOIL experiments for MS MARCO (V1) Passage Ranking](docs/experiments-unicoil.md)
-+ [Reproducing uniCOIL experiments with TILDE document expansion for MS MARCO (V1) Passage Ranking](docs/experiments-unicoil-tilde-expansion.md)
-+ [Reproducing uniCOIL experiments on the MS MARCO (V2) Collections](docs/experiments-msmarco-v2-unicoil.md)
++ Reproducing [runs directly from the Python package](docs/pypi-reproduction.md)
++ Reproducing [Robust04 baselines for ad hoc retrieval](docs/experiments-robust04.md)
++ Reproducing the [BM25 baseline for MS MARCO (V1) Passage Ranking](docs/experiments-msmarco-passage.md)
++ Reproducing the [BM25 baseline for MS MARCO (V1) Document Ranking](docs/experiments-msmarco-doc.md)
++ Reproducing the [multi-field BM25 baseline for MS MARCO (V1) Document Ranking from Elasticsearch](docs/experiments-elastic.md)
++ Reproducing [BM25 baselines on the MS MARCO (V2) Collections](docs/experiments-msmarco-v2.md)
++ Reproducing [DeepImpact experiments for MS MARCO (V1) Passage Ranking](docs/experiments-deepimpact.md)
++ Reproducing [uniCOIL experiments with doc2query-T5 expansions for MS MARCO (V1) Passage Ranking](docs/experiments-unicoil.md)
++ Reproducing [uniCOIL experiments with TILDE document expansion for MS MARCO (V1) Passage Ranking](docs/experiments-unicoil-tilde-expansion.md)
++ Reproducing [uniCOIL experiments on the MS MARCO (V2) Collections](docs/experiments-msmarco-v2-unicoil.md)
 
 ### Dense Retrieval
 
-+ [Reproducing TCT-ColBERTv1 experiments on the MS MARCO (V1) Collections](docs/experiments-tct_colbert.md)
-+ [Reproducing TCT-ColBERTv2 experiments on the MS MARCO (V1) Collections](docs/experiments-tct_colbert-v2.md)
-+ [Reproducing TCT-ColBERTv2 experiments on the MS MARCO (V2) Collections](docs/experiments-msmarco-v2-tct_colbert-v2.md)
-+ [Reproducing DPR experiments](docs/experiments-dpr.md)
-+ [Reproducing ANCE experiments](docs/experiments-ance.md)
-+ [Reproducing DistilBERT KD experiments](docs/experiments-distilbert_kd.md)
-+ [Reproducing DistilBERT Balanced Topic Aware Sampling experiments](docs/experiments-distilbert_tasb.md)
-+ [Reproducing SBERT dense retrieval experiments](docs/experiments-sbert.md)
++ Reproducing [TCT-ColBERTv1 experiments on the MS MARCO (V1) Collections](docs/experiments-tct_colbert.md)
++ Reproducing [TCT-ColBERTv2 experiments on the MS MARCO (V1) Collections](docs/experiments-tct_colbert-v2.md)
++ Reproducing [TCT-ColBERTv2 experiments on the MS MARCO (V2) Collections](docs/experiments-msmarco-v2-tct_colbert-v2.md)
++ Reproducing [DPR experiments](docs/experiments-dpr.md)
++ Reproducing [BPR experiments](docs/experiments-bpr.md)
++ Reproducing [ANCE experiments](docs/experiments-ance.md)
++ Reproducing [DistilBERT KD experiments](docs/experiments-distilbert_kd.md)
++ Reproducing [DistilBERT Balanced Topic Aware Sampling experiments](docs/experiments-distilbert_tasb.md)
++ Reproducing [SBERT dense retrieval experiments](docs/experiments-sbert.md)
 
 ## Baselines
 
diff --git a/docs/experiments-bpr.md b/docs/experiments-bpr.md
index 2e848578b..e0dba523f 100644
--- a/docs/experiments-bpr.md
+++ b/docs/experiments-bpr.md
@@ -1,16 +1,17 @@
 # Pyserini: Reproducing BPR Results
 
-[Binary passage retriever](https://arxiv.org/abs/2106.00882) (BPR) is a two-stage ranking approach that represents the passages in both binary codes and dense vectors for memory efficiency and effectiveness.
+Binary passage retriever (BPR) is a two-stage ranking approach that represents passages as both binary codes and dense vectors for memory efficiency and effectiveness.
 
-We have replicated BPR's results and incorporated the technique into Pyserini.
+> Ikuya Yamada, Akari Asai, Hannaneh Hajishirzi. [Efficient Passage Retrieval with Hashing for Open-domain Question Answering.](https://aclanthology.org/2021.acl-short.123/) _Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 2: Short Papers)_, pages 979-986, 2021.
+
+We have replicated BPR's results and incorporated the model into Pyserini.
 To be clear, we started with model checkpoint and index releases in the official [BPR repo](https://github.com/studio-ousia/bpr) and did _not_ train the query and passage encoders from scratch.
 This guide provides instructions to reproduce the BPR's results.
-We cover only retrieval here; for end-to-end answer extraction, please see [this guide](https://github.com/castorini/pygaggle/blob/master/docs/experiments-dpr-reader.md) in our PyGaggle neural text ranking library.
 For more instructions, please see our [dense retrieval replication guide](https://github.com/castorini/pyserini/blob/master/docs/experiments-dpr.md).
 
 ## Summary
 
-Here's how our results stack up against results reported in the paper using the BPR model (index 2.3GB + model 0.4GB):
+Here's how our results stack up against results reported in the paper using the BPR model (index 2.3 GB + model 0.4 GB):
 
 | Dataset     | Method        | Top-20 (orig) | Top-20 (us)| Top-100 (orig) | Top-100 (us)|
 |:------------|:--------------|--------------:|-----------:|---------------:|------------:|
@@ -19,7 +20,7 @@ Here's how our results stack up against results reported in the paper using the
 
 ## Natural Questions (NQ) with BPR
 
-**DPR retrieval** with brute-force index:
+BPR with brute-force index:
 
 ```bash
 $ python -m pyserini.dsearch --topics dpr-nq-test \
@@ -48,3 +49,5 @@ Top100 accuracy: 0.857
 ```
 
 ## Reproduction Log[*](reproducibility.md)
+
++ Results reproduced by [@lintool](https://github.com/lintool) on 2021-09-08 (commit [`d7a7be`](https://github.com/castorini/pyserini/commit/d7a7bededc650dfa87eb89ba92907fd97a10310b))
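
---

Reviewer note: for anyone unfamiliar with BPR, the two-stage design the doc refers to works as follows. Stage one generates candidates by Hamming distance between a binarized query and the binary passage codes; stage two reranks those candidates by the inner product between the *dense* query vector and the same codes. The sketch below is a minimal NumPy illustration of that idea; the function name `bpr_search`, the unpacked ±1 codes, and the candidate-pool size are assumptions made for exposition, not Pyserini's actual implementation (real systems pack the codes into bits and compute Hamming distance with popcount, typically through a binary FAISS index).

```python
# Toy BPR-style two-stage search. Illustrative only, not Pyserini's code.
import numpy as np

def bpr_search(q_dense, p_codes, num_candidates=1000, k=100):
    """Return (ids, scores) of the top-k passages for one query.

    q_dense -- (d,) float32 dense query embedding
    p_codes -- (n, d) int8 passage codes in {-1, +1}, i.e. sign of embedding
    """
    num_candidates = min(num_candidates, p_codes.shape[0])

    # Stage 1: candidate generation. For +/-1 codes, the dot product orders
    # passages identically to negative Hamming distance (dot = d - 2 * hamming).
    q_code = np.where(q_dense >= 0, 1, -1).astype(np.int32)
    coarse = p_codes.astype(np.int32) @ q_code
    cand = np.argpartition(-coarse, num_candidates - 1)[:num_candidates]

    # Stage 2: rerank the candidate pool with the dense query vector against
    # the binary passage codes, approximating the full inner product.
    fine = p_codes[cand].astype(np.float32) @ q_dense
    top = np.argsort(-fine)[: min(k, num_candidates)]
    return cand[top], fine[top]

# Smoke test on random data (shapes chosen to mirror 768-d DPR embeddings).
rng = np.random.default_rng(0)
emb = rng.standard_normal((100_000, 768)).astype(np.float32)
codes = np.where(emb >= 0, 1, -1).astype(np.int8)
ids, scores = bpr_search(rng.standard_normal(768).astype(np.float32), codes)
```

Keeping the query dense in stage two is what recovers most of the effectiveness lost to binarization while the index stays binary: assuming BPR's 768-bit codes (96 bytes per passage) over the roughly 21M-passage Wikipedia corpus, the codes alone come to about 2 GB, consistent with the ~2.3 GB index size quoted in the doc.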