Move all auto-generated regression documentation into docs/regressions (
lintool authored Sep 3, 2023
1 parent c17a5dc commit 211e74f
Showing 732 changed files with 2,263 additions and 2,266 deletions.
222 changes: 111 additions & 111 deletions README.md

Large diffs are not rendered by default.

38 changes: 19 additions & 19 deletions docs/elastirini.md
@@ -38,10 +38,10 @@ If you want to install Kibana, it's just another distribution to unpack and a si
## Indexing and Retrieval: Robust04

Once we have a local instance of Elasticsearch up and running, we can index using Elasticsearch through Elastirini.
In this example, we reproduce experiments on [Robust04](regressions-disk45.md).
In this example, we reproduce experiments on Robust04.

First, let's create the index in Elasticsearch.
We define the schema and the ranking function (BM25) using [this config](../src/main/resources/elasticsearch/index-config.robust04.json):
We define the schema and the ranking function (BM25) using the config at `src/main/resources/elasticsearch/index-config.robust04.json`:

```bash
cat src/main/resources/elasticsearch/index-config.robust04.json \
@@ -87,8 +87,8 @@ P_30 all 0.3102

## Indexing and Retrieval: Core18

We can reproduce the [TREC Washington Post Corpus](regressions-core18.md) results in a similar way.
First, set up the proper schema using [this config](../src/main/resources/elasticsearch/index-config.core18.json):
We can reproduce the TREC Washington Post Corpus results in a similar way.
First, set up the proper schema using the config at `src/main/resources/elasticsearch/index-config.core18.json`:

```bash
cat src/main/resources/elasticsearch/index-config.core18.json \
@@ -133,8 +133,8 @@ P_30 all 0.3573

## Indexing and Retrieval: MS MARCO Passage

We can reproduce the [BM25 Baselines on MS MARCO (Passage)](experiments-msmarco-passage.md) results in a similar way.
First, set up the proper schema using [this config](../src/main/resources/elasticsearch/index-config.msmarco-passage.json):
We can reproduce the BM25 Baselines on MS MARCO (Passage) results in a similar way.
First, set up the proper schema using the config at `src/main/resources/elasticsearch/index-config.msmarco-passage.json`:

```bash
cat src/main/resources/elasticsearch/index-config.msmarco-passage.json \
@@ -179,8 +179,8 @@ recall_1000 all 0.8573

## Indexing and Retrieval: MS MARCO Document

We can reproduce the [BM25 Baselines on MS MARCO (Doc)](experiments-msmarco-doc.md) results in a similar way.
First, set up the proper schema using [this config](../src/main/resources/elasticsearch/index-config.msmarco-doc.json):
We can reproduce the BM25 Baselines on MS MARCO (Doc) results in a similar way.
First, set up the proper schema using the config at `src/main/resources/elasticsearch/index-config.msmarco-doc.json`:

```bash
cat src/main/resources/elasticsearch/index-config.msmarco-doc.json \
@@ -227,7 +227,7 @@ recall_1000 all 0.8856

## Elasticsearch Integration Test

We have an end-to-end integration testing script `run_es_regression.py` for [Robust04](regressions-disk45.md), [Core18](regressions-core18.md), [MS MARCO passage](regressions-msmarco-passage.md) and [MS MARCO document](regressions-msmarco-doc.md):
We have an end-to-end integration testing script `run_es_regression.py` for Robust04, Core18, MS MARCO passage and MS MARCO document:

```bash
# Check if Elasticsearch server is on
@@ -256,16 +256,16 @@ For the `collection` meta-parameter, use `robust04`, `core18`, `msmarco-passage`

## Reproduction Log[*](reproducibility.md)

+ Results reproduced by [@nikhilro](https://github.com/nikhilro) on 2020-01-26 (commit [`d5ee069`](https://github.com/castorini/anserini/commit/d5ee069399e6a306d7685bda756c1f19db721156)) for both [MS MARCO Passage](experiments-msmarco-passage.md) and [Robust04](regressions-disk45.md)
+ Results reproduced by [@edwinzhng](https://github.com/edwinzhng) on 2020-01-26 (commit [`7b76dfb`](https://github.com/castorini/anserini/commit/7b76dfbea7e0c01a3a5dc13e74f54852c780ec9b)) for both [MS MARCO Passage](experiments-msmarco-passage.md) and [Robust04](regressions-disk45.md)
+ Results reproduced by [@HangCui0510](https://github.com/HangCui0510) on 2020-04-29 (commit [`07a9b05`](https://github.com/castorini/anserini/commit/07a9b053173637e15be79de4e7fce4d5a93d04fe)) for [MS MARCO Passage](regressions-msmarco-passage.md), [Robust04](regressions-disk45.md) and [Core18](regressions-core18.md) using end-to-end [`run_es_regression`](../src/main/python/run_es_regression.py)
+ Results reproduced by [@shaneding](https://github.com/shaneding) on 2020-05-25 (commit [`1de3274`](https://github.com/castorini/anserini/commit/1de3274b057a63382534c5277ffcd772c3fc0d43)) for [MS MARCO Passage](regressions-msmarco-passage.md)
+ Results reproduced by [@adamyy](https://github.com/adamyy) on 2020-05-29 (commit [`94893f1`](https://github.com/castorini/anserini/commit/94893f170e047d77c3ef5b8b995d7fbdd13f4298)) for [MS MARCO Passage](regressions-msmarco-passage.md), [MS MARCO Document](experiments-msmarco-doc.md)
+ Results reproduced by [@YimingDou](https://github.com/YimingDou) on 2020-05-29 (commit [`2947a16`](https://github.com/castorini/anserini/commit/2947a1622efae35637b83e321aba8e6fccd43489)) for [MS MARCO Passage](regressions-msmarco-passage.md)
+ Results reproduced by [@yxzhu16](https://github.com/yxzhu16) on 2020-07-17 (commit [`fad12be`](https://github.com/castorini/anserini/commit/fad12be2e37a075100707c3a674eb67bc0aa57ef)) for [Robust04](regressions-disk45.md), [Core18](regressions-core18.md), and [MS MARCO Passage](regressions-msmarco-passage.md)
+ Results reproduced by [@nikhilro](https://github.com/nikhilro) on 2020-01-26 (commit [`d5ee069`](https://github.com/castorini/anserini/commit/d5ee069399e6a306d7685bda756c1f19db721156)) for both MS MARCO Passage and Robust04
+ Results reproduced by [@edwinzhng](https://github.com/edwinzhng) on 2020-01-26 (commit [`7b76dfb`](https://github.com/castorini/anserini/commit/7b76dfbea7e0c01a3a5dc13e74f54852c780ec9b)) for both MS MARCO Passage and Robust04
+ Results reproduced by [@HangCui0510](https://github.com/HangCui0510) on 2020-04-29 (commit [`07a9b05`](https://github.com/castorini/anserini/commit/07a9b053173637e15be79de4e7fce4d5a93d04fe)) for MS MARCO Passage, Robust04 and Core18 using end-to-end `run_es_regression`
+ Results reproduced by [@shaneding](https://github.com/shaneding) on 2020-05-25 (commit [`1de3274`](https://github.com/castorini/anserini/commit/1de3274b057a63382534c5277ffcd772c3fc0d43)) for MS MARCO Passage
+ Results reproduced by [@adamyy](https://github.com/adamyy) on 2020-05-29 (commit [`94893f1`](https://github.com/castorini/anserini/commit/94893f170e047d77c3ef5b8b995d7fbdd13f4298)) for MS MARCO Passage, MS MARCO Document
+ Results reproduced by [@YimingDou](https://github.com/YimingDou) on 2020-05-29 (commit [`2947a16`](https://github.com/castorini/anserini/commit/2947a1622efae35637b83e321aba8e6fccd43489)) for MS MARCO Passage
+ Results reproduced by [@yxzhu16](https://github.com/yxzhu16) on 2020-07-17 (commit [`fad12be`](https://github.com/castorini/anserini/commit/fad12be2e37a075100707c3a674eb67bc0aa57ef)) for Robust04, Core18, and MS MARCO Passage
+ Results reproduced by [@lintool](https://github.com/lintool) on 2020-11-10 (commit [`e19755b`](https://github.com/castorini/anserini/commit/e19755b5fa976127830597bc9fbca203b9f5ad24)), all commands and end-to-end regression script for all four collections
+ Results reproduced by [@jrzhang12](https://github.com/jrzhang12) on 2021-01-02 (commit [`be4e44d`](https://github.com/castorini/anserini/commit/02c52ee606ba0ebe32c130af1e26d24d8f10566a)) for [MS MARCO Passage](regressions-msmarco-passage.md)
+ Results reproduced by [@tyao-t](https://github.com/tyao-t) on 2022-01-13 (commit [`06fb4f9`](https://github.com/castorini/anserini/commit/06fb4f9947ff2167c276d8893287453af7680786)) for [MS MARCO Passage](regressions-msmarco-passage.md) and [MS MARCO Document](regressions-msmarco-doc.md)
+ Results reproduced by [@d1shs0ap](https://github.com/d1shs0ap) on 2022-01-21 (commit [`a81299e`](https://github.com/castorini/anserini/commit/a81299e59eff24512d635e0d49fba6e373286469)) for [MS MARCO Document](regressions-msmarco-doc.md) using end-to-end [`run_es_regression`](../src/main/python/run_es_regression.py)
+ Results reproduced by [@jrzhang12](https://github.com/jrzhang12) on 2021-01-02 (commit [`be4e44d`](https://github.com/castorini/anserini/commit/02c52ee606ba0ebe32c130af1e26d24d8f10566a)) for MS MARCO Passage
+ Results reproduced by [@tyao-t](https://github.com/tyao-t) on 2022-01-13 (commit [`06fb4f9`](https://github.com/castorini/anserini/commit/06fb4f9947ff2167c276d8893287453af7680786)) for MS MARCO Passage and MS MARCO Document
+ Results reproduced by [@d1shs0ap](https://github.com/d1shs0ap) on 2022-01-21 (commit [`a81299e`](https://github.com/castorini/anserini/commit/a81299e59eff24512d635e0d49fba6e373286469)) for MS MARCO Document using end-to-end `run_es_regression`
+ Results reproduced by [@lintool](https://github.com/lintool) on 2022-03-21 (commit [`3d1fc34`](https://github.com/castorini/anserini/commit/3d1fc3457b993832b4682c0482b26d8271d02ec6)) for all collections
+ Results reproduced by [@lintool](https://github.com/lintool) on 2022-07-31 (commit [`2a0cb16`](https://github.com/castorini/anserini/commit/2a0cb16829b347e38801b9972b349de498dadf03)) (v0.14.4) for all collections
@@ -3,8 +3,8 @@
**Models**: various bag-of-words approaches

This page describes regressions for the background linking task in the [TREC 2018 News Track](http://trec-news.org/).
The exact configurations for these regressions are stored in [this YAML file](../src/main/resources/regression/backgroundlinking18.yaml).
Note that this page is automatically generated from [this template](../src/main/resources/docgen/templates/backgroundlinking18.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead.
The exact configurations for these regressions are stored in [this YAML file](../../src/main/resources/regression/backgroundlinking18.yaml).
Note that this page is automatically generated from [this template](../../src/main/resources/docgen/templates/backgroundlinking18.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead.

From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end:

@@ -29,7 +29,7 @@ target/appassembler/bin/IndexCollection \
The directory `/path/to/core18/` should be the root directory of the [TREC Washington Post Corpus](https://trec.nist.gov/data/wapost/), i.e., `ls /path/to/core18/`
should bring up a single JSON file.

For additional details, see explanation of [common indexing options](common-indexing-options.md).
For additional details, see explanation of [common indexing options](../../docs/common-indexing-options.md).

## Retrieval

@@ -3,8 +3,8 @@
**Models**: various bag-of-words approaches

This page describes regressions for the background linking task in the [TREC 2019 News Track](http://trec-news.org/).
The exact configurations for these regressions are stored in [this YAML file](../src/main/resources/regression/backgroundlinking19.yaml).
Note that this page is automatically generated from [this template](../src/main/resources/docgen/templates/backgroundlinking19.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead.
The exact configurations for these regressions are stored in [this YAML file](../../src/main/resources/regression/backgroundlinking19.yaml).
Note that this page is automatically generated from [this template](../../src/main/resources/docgen/templates/backgroundlinking19.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead.

From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end:

@@ -29,7 +29,7 @@ target/appassembler/bin/IndexCollection \
The directory `/path/to/core18/` should be the root directory of the [TREC Washington Post Corpus](https://trec.nist.gov/data/wapost/), i.e., `ls /path/to/core18/`
should bring up a single JSON file.

For additional details, see explanation of [common indexing options](common-indexing-options.md).
For additional details, see explanation of [common indexing options](../../docs/common-indexing-options.md).

## Retrieval

@@ -3,8 +3,8 @@
**Models**: various bag-of-words approaches

This page describes regressions for the background linking task in the [TREC 2020 News Track](http://trec-news.org/).
The exact configurations for these regressions are stored in [this YAML file](../src/main/resources/regression/backgroundlinking20.yaml).
Note that this page is automatically generated from [this template](../src/main/resources/docgen/templates/backgroundlinking20.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead.
The exact configurations for these regressions are stored in [this YAML file](../../src/main/resources/regression/backgroundlinking20.yaml).
Note that this page is automatically generated from [this template](../../src/main/resources/docgen/templates/backgroundlinking20.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead.

From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end:

@@ -29,7 +29,7 @@ target/appassembler/bin/IndexCollection \
The directory `/path/to/core18/` should be the root directory of the [TREC Washington Post Corpus *v3*](https://trec.nist.gov/data/wapost/), i.e., `ls /path/to/core18/`
should bring up a single JSON file.

For additional details, see explanation of [common indexing options](common-indexing-options.md).
For additional details, see explanation of [common indexing options](../../docs/common-indexing-options.md).

## Retrieval

@@ -4,8 +4,8 @@ This page documents BM25 regression experiments for [BEIR (v1.0.0) — ArguA
These experiments index the corpus in a "flat" manner, by concatenating the "title" and "text" into the "contents" field.
All the documents and queries are pre-tokenized with `bert-base-uncased` tokenizer.

The exact configurations for these regressions are stored in [this YAML file](../src/main/resources/regression/beir-v1.0.0-arguana-flat-wp.yaml).
Note that this page is automatically generated from [this template](../src/main/resources/docgen/templates/beir-v1.0.0-arguana-flat-wp.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead.
The exact configurations for these regressions are stored in [this YAML file](../../src/main/resources/regression/beir-v1.0.0-arguana-flat-wp.yaml).
Note that this page is automatically generated from [this template](../../src/main/resources/docgen/templates/beir-v1.0.0-arguana-flat-wp.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead.

From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end:

@@ -27,7 +27,7 @@ target/appassembler/bin/IndexCollection \
>& logs/log.beir-v1.0.0-arguana-flat-wp &
```

For additional details, see explanation of [common indexing options](common-indexing-options.md).
For additional details, see explanation of [common indexing options](../../docs/common-indexing-options.md).
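
The link changes throughout this commit follow from each page moving one directory deeper. The sketch below illustrates that rewrite under the assumption that pages moved from `docs/` to `docs/regressions/`. The function name `rebase_link` and the example file names are hypothetical, and nothing here is taken from Anserini's actual documentation generator.

```python
import posixpath


def rebase_link(link: str, old_doc: str, new_doc: str) -> str:
    """Rewrite a relative link so it still resolves after its page moves.

    All paths are repository-relative and use forward slashes.
    """
    # Resolve the link against the directory of the page's old location.
    target = posixpath.normpath(posixpath.join(posixpath.dirname(old_doc), link))
    # Express the same target relative to the directory of the new location.
    return posixpath.relpath(target, start=posixpath.dirname(new_doc))


if __name__ == "__main__":
    # Hypothetical old and new locations of one regression page.
    old_doc = "docs/regressions-example.md"
    new_doc = "docs/regressions/regressions-example.md"

    print(rebase_link("../src/main/resources/regression/example.yaml", old_doc, new_doc))
    print(rebase_link("common-indexing-options.md", old_doc, new_doc))
```

For the sibling page, this computes `../common-indexing-options.md`, whereas the commit writes `../../docs/common-indexing-options.md`; both forms resolve to the same file from `docs/regressions/`.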

## Retrieval

@@ -3,8 +3,8 @@
This page documents BM25 regression experiments for [BEIR (v1.0.0) — ArguAna](http://beir.ai/).
These experiments index the corpus in a "flat" manner, by concatenating the "title" and "text" into the "contents" field.
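
As a rough illustration of the "flat" preparation described above, the sketch below concatenates "title" and "text" into a single "contents" field. It assumes input records keyed by `_id`, `title`, and `text`, and output records keyed by `id` and `contents`; the function name `to_flat` and the single-space separator are assumptions for illustration, not the pipeline's confirmed behavior.

```python
import json


def to_flat(record: dict) -> dict:
    """Build a flat document by joining "title" and "text" into "contents"."""
    # Skip empty fields so a missing title does not leave a leading space.
    parts = [record.get("title", ""), record.get("text", "")]
    contents = " ".join(part for part in parts if part)
    return {"id": record["_id"], "contents": contents}


if __name__ == "__main__":
    # Hypothetical record, not taken from the actual corpus.
    sample = {"_id": "doc0", "title": "A title", "text": "Some body text."}
    print(json.dumps(to_flat(sample)))
```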

The exact configurations for these regressions are stored in [this YAML file](../src/main/resources/regression/beir-v1.0.0-arguana-flat.yaml).
Note that this page is automatically generated from [this template](../src/main/resources/docgen/templates/beir-v1.0.0-arguana-flat.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead.
The exact configurations for these regressions are stored in [this YAML file](../../src/main/resources/regression/beir-v1.0.0-arguana-flat.yaml).
Note that this page is automatically generated from [this template](../../src/main/resources/docgen/templates/beir-v1.0.0-arguana-flat.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead.

From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end:

@@ -26,7 +26,7 @@ target/appassembler/bin/IndexCollection \
>& logs/log.beir-v1.0.0-arguana-flat &
```

For additional details, see explanation of [common indexing options](common-indexing-options.md).
For additional details, see explanation of [common indexing options](../../docs/common-indexing-options.md).

## Retrieval

@@ -4,8 +4,8 @@ This page documents BM25 regression experiments for [BEIR (v1.0.0) — ArguA
These experiments index the "title" and "text" fields in corpus separately.
At retrieval time, a query is issued across both fields (equally weighted).

The exact configurations for these regressions are stored in [this YAML file](../src/main/resources/regression/beir-v1.0.0-arguana-multifield.yaml).
Note that this page is automatically generated from [this template](../src/main/resources/docgen/templates/beir-v1.0.0-arguana-multifield.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead.
The exact configurations for these regressions are stored in [this YAML file](../../src/main/resources/regression/beir-v1.0.0-arguana-multifield.yaml).
Note that this page is automatically generated from [this template](../../src/main/resources/docgen/templates/beir-v1.0.0-arguana-multifield.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead.

From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end:

@@ -27,7 +27,7 @@ target/appassembler/bin/IndexCollection \
>& logs/log.beir-v1.0.0-arguana-multifield &
```

For additional details, see explanation of [common indexing options](common-indexing-options.md).
For additional details, see explanation of [common indexing options](../../docs/common-indexing-options.md).

## Retrieval
