From f026b871e0e581743fcb09d1eb309e9698767a8d Mon Sep 17 00:00:00 2001
From: Jimmy Lin
Date: Wed, 8 Sep 2021 16:08:48 -0400
Subject: [PATCH] Add BPR repro entry and doc tweaks (#753)

---
 README.md               | 37 +++++++++++++++++++------------------
 docs/experiments-bpr.md | 13 ++++++++-----
 2 files changed, 27 insertions(+), 23 deletions(-)

diff --git a/README.md b/README.md
index 4c92fa624..225cc43cd 100644
--- a/README.md
+++ b/README.md
@@ -385,27 +385,28 @@ With Pyserini, it's easy to [reproduce](docs/reproducibility.md) runs on a numbe
 
 ### Sparse Retrieval
 
-+ [Reproducing runs directly from the Python package](docs/pypi-reproduction.md)
-+ [Reproducing Robust04 baselines for ad hoc retrieval](docs/experiments-robust04.md)
-+ [Reproducing the BM25 baseline for MS MARCO (V1) Passage Ranking](docs/experiments-msmarco-passage.md)
-+ [Reproducing the BM25 baseline for MS MARCO (V1) Document Ranking](docs/experiments-msmarco-doc.md)
-+ [Reproducing the multi-field BM25 baseline for MS MARCO (V1) Document Ranking from Elasticsearch](docs/experiments-elastic.md)
-+ [Reproducing BM25 baselines on the MS MARCO (V2) Collections](docs/experiments-msmarco-v2.md)
-+ [Reproducing DeepImpact experiments for MS MARCO (V1) Passage Ranking](docs/experiments-deepimpact.md)
-+ [Reproducing uniCOIL experiments for MS MARCO (V1) Passage Ranking](docs/experiments-unicoil.md)
-+ [Reproducing uniCOIL experiments with TILDE document expansion for MS MARCO (V1) Passage Ranking](docs/experiments-unicoil-tilde-expansion.md)
-+ [Reproducing uniCOIL experiments on the MS MARCO (V2) Collections](docs/experiments-msmarco-v2-unicoil.md)
++ Reproducing [runs directly from the Python package](docs/pypi-reproduction.md)
++ Reproducing [Robust04 baselines for ad hoc retrieval](docs/experiments-robust04.md)
++ Reproducing the [BM25 baseline for MS MARCO (V1) Passage Ranking](docs/experiments-msmarco-passage.md)
++ Reproducing the [BM25 baseline for MS MARCO (V1) Document Ranking](docs/experiments-msmarco-doc.md)
++ Reproducing the [multi-field BM25 baseline for MS MARCO (V1) Document Ranking from Elasticsearch](docs/experiments-elastic.md)
++ Reproducing [BM25 baselines on the MS MARCO (V2) Collections](docs/experiments-msmarco-v2.md)
++ Reproducing [DeepImpact experiments for MS MARCO (V1) Passage Ranking](docs/experiments-deepimpact.md)
++ Reproducing [uniCOIL experiments with doc2query-T5 expansions for MS MARCO (V1) Passage Ranking](docs/experiments-unicoil.md)
++ Reproducing [uniCOIL experiments with TILDE document expansion for MS MARCO (V1) Passage Ranking](docs/experiments-unicoil-tilde-expansion.md)
++ Reproducing [uniCOIL experiments on the MS MARCO (V2) Collections](docs/experiments-msmarco-v2-unicoil.md)
 
 ### Dense Retrieval
 
-+ [Reproducing TCT-ColBERTv1 experiments on the MS MARCO (V1) Collections](docs/experiments-tct_colbert.md)
-+ [Reproducing TCT-ColBERTv2 experiments on the MS MARCO (V1) Collections](docs/experiments-tct_colbert-v2.md)
-+ [Reproducing TCT-ColBERTv2 experiments on the MS MARCO (V2) Collections](docs/experiments-msmarco-v2-tct_colbert-v2.md)
-+ [Reproducing DPR experiments](docs/experiments-dpr.md)
-+ [Reproducing ANCE experiments](docs/experiments-ance.md)
-+ [Reproducing DistilBERT KD experiments](docs/experiments-distilbert_kd.md)
-+ [Reproducing DistilBERT Balanced Topic Aware Sampling experiments](docs/experiments-distilbert_tasb.md)
-+ [Reproducing SBERT dense retrieval experiments](docs/experiments-sbert.md)
++ Reproducing [TCT-ColBERTv1 experiments on the MS MARCO (V1) Collections](docs/experiments-tct_colbert.md)
++ Reproducing [TCT-ColBERTv2 experiments on the MS MARCO (V1) Collections](docs/experiments-tct_colbert-v2.md)
++ Reproducing [TCT-ColBERTv2 experiments on the MS MARCO (V2) Collections](docs/experiments-msmarco-v2-tct_colbert-v2.md)
++ Reproducing [DPR experiments](docs/experiments-dpr.md)
++ Reproducing [BPR experiments](docs/experiments-bpr.md)
++ Reproducing [ANCE experiments](docs/experiments-ance.md)
++ Reproducing [DistilBERT KD experiments](docs/experiments-distilbert_kd.md)
++ Reproducing [DistilBERT Balanced Topic Aware Sampling experiments](docs/experiments-distilbert_tasb.md)
++ Reproducing [SBERT dense retrieval experiments](docs/experiments-sbert.md)
 
 ## Baselines
 
diff --git a/docs/experiments-bpr.md b/docs/experiments-bpr.md
index 2e848578b..e0dba523f 100644
--- a/docs/experiments-bpr.md
+++ b/docs/experiments-bpr.md
@@ -1,16 +1,17 @@
 # Pyserini: Reproducing BPR Results
 
-[Binary passage retriever](https://arxiv.org/abs/2106.00882) (BPR) is a two-stage ranking approach that represents the passages in both binary codes and dense vectors for memory efficiency and effectiveness.
+Binary passage retriever (BPR) is a two-stage ranking approach that represents passages as both binary codes and dense vectors for memory efficiency and effectiveness.
 
-We have replicated BPR's results and incorporated the technique into Pyserini.
+> Ikuya Yamada, Akari Asai, Hannaneh Hajishirzi. [Efficient Passage Retrieval with Hashing for Open-domain Question Answering.](https://aclanthology.org/2021.acl-short.123/) _Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 2: Short Papers)_, pages 979-986, 2021.
+
+We have replicated BPR's results and incorporated the model into Pyserini.
 To be clear, we started with model checkpoint and index releases in the official [BPR repo](https://github.com/studio-ousia/bpr) and did _not_ train the query and passage encoders from scratch.
 This guide provides instructions to reproduce the BPR's results.
-We cover only retrieval here; for end-to-end answer extraction, please see [this guide](https://github.com/castorini/pygaggle/blob/master/docs/experiments-dpr-reader.md) in our PyGaggle neural text ranking library.
 For more instructions, please see our [dense retrieval replication guide](https://github.com/castorini/pyserini/blob/master/docs/experiments-dpr.md).
 
 ## Summary
 
-Here's how our results stack up against results reported in the paper using the BPR model (index 2.3GB + model 0.4GB):
+Here's how our results stack up against results reported in the paper using the BPR model (index 2.3 GB + model 0.4 GB):
 
 | Dataset     | Method        | Top-20 (orig) | Top-20 (us)| Top-100 (orig) | Top-100 (us)|
 |:------------|:--------------|--------------:|-----------:|---------------:|------------:|
@@ -19,7 +20,7 @@ Here's how our results stack up against results reported in the paper using the
 
 ## Natural Questions (NQ) with BPR
 
-**DPR retrieval** with brute-force index:
+BPR with brute-force index:
 
 ```bash
 $ python -m pyserini.dsearch --topics dpr-nq-test \
@@ -48,3 +49,5 @@ Top100 accuracy: 0.857
 ```
 
 ## Reproduction Log[*](reproducibility.md)
+
++ Results reproduced by [@lintool](https://github.com/lintool) on 2021-09-08 (commit [`d7a7be`](https://github.com/castorini/pyserini/commit/d7a7bededc650dfa87eb89ba92907fd97a10310b))
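
---

Reviewer note: for anyone unfamiliar with BPR, the two-stage design the doc refers to works as follows. Stage one generates candidates by Hamming distance between a binarized query and the binary passage codes; stage two reranks those candidates by the inner product between the *dense* query vector and the same codes. The sketch below is a minimal NumPy illustration of that idea; the function name `bpr_search`, the unpacked ±1 codes, and the candidate-pool size are assumptions made for exposition, not Pyserini's actual implementation (real systems pack the codes into bits and compute Hamming distance with popcount, typically through a binary FAISS index).

```python
# Toy BPR-style two-stage search. Illustrative only, not Pyserini's code.
import numpy as np

def bpr_search(q_dense, p_codes, num_candidates=1000, k=100):
    """Return (ids, scores) of the top-k passages for one query.

    q_dense -- (d,) float32 dense query embedding
    p_codes -- (n, d) int8 passage codes in {-1, +1}, i.e. sign of embedding
    """
    num_candidates = min(num_candidates, p_codes.shape[0])

    # Stage 1: candidate generation. For +/-1 codes, the dot product orders
    # passages identically to negative Hamming distance (dot = d - 2 * hamming).
    q_code = np.where(q_dense >= 0, 1, -1).astype(np.int32)
    coarse = p_codes.astype(np.int32) @ q_code
    cand = np.argpartition(-coarse, num_candidates - 1)[:num_candidates]

    # Stage 2: rerank the candidate pool with the dense query vector against
    # the binary passage codes, approximating the full inner product.
    fine = p_codes[cand].astype(np.float32) @ q_dense
    top = np.argsort(-fine)[: min(k, num_candidates)]
    return cand[top], fine[top]

# Smoke test on random data (shapes chosen to mirror 768-d DPR embeddings).
rng = np.random.default_rng(0)
emb = rng.standard_normal((100_000, 768)).astype(np.float32)
codes = np.where(emb >= 0, 1, -1).astype(np.int8)
ids, scores = bpr_search(rng.standard_normal(768).astype(np.float32), codes)
```

Keeping the query dense in stage two is what recovers most of the effectiveness lost to binarization while the index stays binary: assuming BPR's 768-bit codes (96 bytes per passage) over the roughly 21M-passage Wikipedia corpus, the codes alone come to about 2 GB, consistent with the ~2.3 GB index size quoted in the doc.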