Configuration:
Processor Speed (Max): 1800 MHz
Memory (RAM): 16 GB
Processor: Intel(R) Core(TM) i5-8250U CPU @ 1.60GHz (GenuineIntel)
Operating System: Microsoft Windows 10 Pro (64-bit)
Number of Cores: 4
Number of Logical Processors: 8
Initially, I tried to work through the Anserini repo on Windows by converting the scripts to PowerShell. This ran into several problems; for instance, installing Maven and trec_eval on Windows resulted in errors. Consequently, I switched to WSL (Ubuntu 22.04.4) on the same machine. One issue I encountered on Ubuntu was that files extracted from compressed archives in the Ubuntu terminal could not be read or written afterwards: access was denied with a permission error. This appeared to be because my Ubuntu account was password-protected and the extracted files required additional permission to access. To avoid these read/write permission issues, I extracted the compressed files manually rather than with a script.
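One possible scripted alternative to manual extraction, which I did not use here, is to extract the archive and then explicitly grant the current user read/write access to the extracted files. The sketch below is self-contained and uses a throwaway directory and a hypothetical sample.tsv file rather than the real collection; whether it resolves the permission error depends on the underlying cause.

```shell
#!/usr/bin/env bash
# Hypothetical demo: all paths live in a temporary directory, not in the real collection.
workdir=$(mktemp -d)
mkdir -p "$workdir/src" "$workdir/out"

# Build a small sample archive to stand in for a downloaded collection.
echo "sample" > "$workdir/src/sample.tsv"
tar -czf "$workdir/archive.tar.gz" -C "$workdir/src" sample.tsv

# Extract it, then give the current user read/write access to everything extracted
# (the capital X adds execute permission only on directories and already-executable files).
tar -xzf "$workdir/archive.tar.gz" -C "$workdir/out"
chmod -R u+rwX "$workdir/out"

ls -l "$workdir/out"
```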
In addition, my Windows environment was running in a 32-bit configuration, whereas trec_eval was built for 64-bit systems. Moving to WSL (Ubuntu 22.04.4) resolved this along with the other dependency installation and usage problems.
Specifications of WSL:
Ubuntu 22.04.4
Linux 5.15.146.1
Architecture: microsoft-standard-WSL2 x86_64
OpenJDK 11.0.22
Another major issue that I encountered and resolved involved the retrieval script in experiments-msmarco-passage.md and the pom.xml file. In pom.xml, I had to pin the version of the Maven Assembly Plugin to make it compatible with my configuration.
For the retrieval script, I changed the following:
target/appassembler/bin/SearchCollection
-index indexes/msmarco-passage/lucene-index-msmarco
-topics collections/msmarco-passage/queries.dev.small.tsv
-topicReader TsvInt
-output runs/run.msmarco-passage.dev.small.tsv -format msmarco
-parallelism 4
-bm25 -bm25.k1 0.82 -bm25.b 0.68 -hits 1000
To:
target/appassembler/bin/SearchCollection
-index indexes/msmarco-passage/lucene-index-msmarco
-topics collections/msmarco-passage/queries.dev.small.tsv
-topicReader TsvInt
-output runs/run.msmarco-passage.dev.small.tsv -format msmarco
-parallelism 4
-bm25 "-bm25.k1" 0.82 "-bm25.b" 0.68 -hits 1000
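As a minimal sketch of what the quoting changes: in bash, the quoted and unquoted forms are passed to the program identically, so the quotes presumably only matter in a shell such as PowerShell, where an unquoted token like -bm25.k1 can be split at the dot. That is an assumption on my part, since the description above does not record which shell required the change. The helper show_args is hypothetical and only prints the arguments it receives, one per line.

```shell
#!/usr/bin/env bash
# show_args is a hypothetical helper: it prints each argument it receives on its own line.
show_args() {
  printf '%s\n' "$@"
}

# The BM25 flags as written in the original command.
unquoted=$(show_args -bm25 -bm25.k1 0.82 -bm25.b 0.68 -hits 1000)

# The same flags with the dotted option names quoted.
quoted=$(show_args -bm25 "-bm25.k1" 0.82 "-bm25.b" 0.68 -hits 1000)

# In bash both forms produce the same argument list.
if [ "$unquoted" = "$quoted" ]; then
  echo "identical argument lists"
fi
```

Because bash treats both forms the same way, the quoted version should be safe to use in either shell.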
In summary, I initially attempted the tasks on Windows, primarily using the PowerShell terminal in VS Code. After encountering some errors and making partial progress, I completed the remaining tasks using WSL (Ubuntu). With the changes described above, I was able to reproduce all of the expected results.