Fix broken ES experiment on MS MARCO doc (#1208)
The configuration for field indexing recently changed in Anserini; this commit propagates that change to Pyserini.
lintool authored Jun 15, 2022
1 parent 2625274 commit ce5cf6c
Showing 1 changed file with 3 additions and 2 deletions.
5 changes: 3 additions & 2 deletions docs/experiments-elastic.md
@@ -22,7 +22,7 @@ from Lucene, plus some additional words targeted at question-style queries.
We're going to use the repository's root directory as the working directory.
First, we need to download and extract the MS MARCO document dataset:

-```
+```bash
mkdir collections/msmarco-doc
wget https://msmarco.blob.core.windows.net/msmarcoranking/msmarco-docs.tsv.gz -P collections/msmarco-doc

@@ -52,6 +52,7 @@ python -m pyserini.index \
--generator DefaultLuceneDocumentGenerator \
--index indexes/msmarco-doc/lucene-index-msmarco \
--threads 4 \
+--fields title url \
--storeRaw \
--stopwords docs/elastic-msmarco-stopwords.txt
```
@@ -67,7 +68,7 @@ attention to: the official metric is MRR@100, so we want to only return the top
format.

```bash
-python -m pyserini.search \
+python -m pyserini.search.lucene \
--topics msmarco-doc-dev \
--index indexes/msmarco-doc/lucene-index-msmarco/ \
--output runs/run.msmarco-doc.leaderboard-dev.elastic.txt \

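The context of the last hunk mentions that the official metric is MRR@100. As a rough illustration only (not part of this commit, and not Pyserini's or the official evaluation code), here is a minimal Python sketch of mean reciprocal rank with a rank cutoff. The function name `mrr_at_k`, the in-memory `run` and `qrels` dictionaries, and the toy document ids are all hypothetical, and averaging over the queries present in `run` is an assumption that may differ from the official script.

```python
from typing import Dict, List, Set


def mrr_at_k(run: Dict[str, List[str]], qrels: Dict[str, Set[str]], k: int = 100) -> float:
    """Mean reciprocal rank of the first relevant document within the top k.

    run:   query id -> document ids ordered from best to worst (hypothetical format)
    qrels: query id -> set of relevant document ids (hypothetical format)
    """
    if not run:
        return 0.0
    total = 0.0
    for qid, ranked_docids in run.items():
        relevant = qrels.get(qid, set())
        # Only the top k results are inspected; anything deeper contributes nothing.
        for rank, docid in enumerate(ranked_docids[:k], start=1):
            if docid in relevant:
                total += 1.0 / rank
                break  # only the first relevant hit counts
    # Assumption: average over the queries that appear in the run.
    return total / len(run)


# Toy data: q1 has its relevant document at rank 2, q2 has no relevant hit.
run = {"q1": ["D3", "D1", "D2"], "q2": ["D9", "D8"]}
qrels = {"q1": {"D1"}, "q2": {"D7"}}
print(mrr_at_k(run, qrels))  # 0.25
```

Because results below rank `k` never affect the score, returning more than the top 100 hits per query would not change MRR@100.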