Update Reproduction Logs #2432

Merged
merged 1 commit into from
Mar 29, 2024

SyedHuq28
Contributor

Configuration:
Processor Speed (Max): 1800 MHz
Memory (RAM): 16 GB
Processor: Intel(R) Core(TM) i5-8250U CPU @ 1.60GHz (GenuineIntel)
Operating System: Microsoft Windows 10 Pro (64-bit)
Number of Cores: 4
Number of Logical Processors: 8

Initially, I tried to work through the Anserini reproduction on Windows using PowerShell, converting the scripts to PowerShell syntax. However, I ran into several issues on Windows; for instance, installing Maven and trec_eval resulted in errors. Consequently, I switched to WSL (Ubuntu 22.04.4) on the same machine.

One issue I encountered on Ubuntu was that files extracted from a compressed archive in the Ubuntu terminal later denied permission when I tried to read or write them. As far as I could tell, this occurred because my Ubuntu account was password-protected and the extracted files required additional permission to access. To avoid these read/write permission issues, I extracted the compressed files manually instead of using a script.
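
A scripted alternative to manual extraction might look like the sketch below. It assumes the denial came from the mode bits on the extracted files, which is not confirmed above, and `extract_writable` is a hypothetical helper, not part of Anserini. On WSL, files under `/mnt/c` may ignore `chmod` unless the drive is mounted with metadata support, so this is most likely to help inside the Linux filesystem.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of Anserini): extract a .tar.gz archive and
# make sure the current user can read and write everything under the target.
extract_writable() {
  local archive="$1" dest="$2"
  # Create the destination directory if it does not exist yet.
  mkdir -p "$dest" || return 1
  # Unpack the archive into the destination directory.
  tar -xzf "$archive" -C "$dest" || return 1
  # Give the owner read/write on files and traverse permission on directories.
  chmod -R u+rwX "$dest"
}

# Example call; the archive path is a placeholder for illustration only:
# extract_writable path/to/archive.tar.gz collections/msmarco-passage
```

The capital `X` in `u+rwX` adds execute permission only to directories and to files that are already executable, so plain data files do not become executable.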

In addition, my Windows environment was running in a 32-bit configuration, whereas trec_eval is built for 64-bit systems. Moving to WSL (Ubuntu 22.04.4) resolved this along with the other dependency installation and usage problems.

Specifications of WSL:
Distribution: Ubuntu 22.04.4
Kernel: Linux 5.15.146.1-microsoft-standard-WSL2
Architecture: x86_64
Java: OpenJDK 11.0.22
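
For anyone comparing their own setup against these specifications, the following is a minimal sketch of how the same details could be collected from a Linux or WSL terminal. The function name `print_env_specs` and the output labels are illustrative choices, and each probe falls back to a default when the tool or file is missing.

```shell
#!/usr/bin/env bash
# Print distribution, kernel, architecture, and Java version, one per line.
print_env_specs() {
  local distro="unknown" java_version="not installed"
  if [ -r /etc/os-release ]; then
    # PRETTY_NAME holds a human-readable name such as "Ubuntu 22.04.4 LTS".
    distro=$(. /etc/os-release && printf '%s' "$PRETTY_NAME")
  fi
  if command -v java >/dev/null 2>&1; then
    # java prints its version banner on stderr; keep only the first line.
    java_version=$(java -version 2>&1 | head -n 1)
  fi
  echo "Distribution: $distro"
  echo "Kernel: $(uname -s) $(uname -r)"
  echo "Architecture: $(uname -m)"
  echo "Java: $java_version"
}

print_env_specs
```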

Another major issue that I encountered and resolved involved the retrieval script given in experiments-msmarco-passage.md and the pom.xml file. In pom.xml, I had to set an explicit version for the Maven Assembly Plugin to make it compatible with my configuration.

For the retrieval script, I changed the following command:
target/appassembler/bin/SearchCollection \
  -index indexes/msmarco-passage/lucene-index-msmarco \
  -topics collections/msmarco-passage/queries.dev.small.tsv \
  -topicReader TsvInt \
  -output runs/run.msmarco-passage.dev.small.tsv -format msmarco \
  -parallelism 4 \
  -bm25 -bm25.k1 0.82 -bm25.b 0.68 -hits 1000

To:
target/appassembler/bin/SearchCollection \
  -index indexes/msmarco-passage/lucene-index-msmarco \
  -topics collections/msmarco-passage/queries.dev.small.tsv \
  -topicReader TsvInt \
  -output runs/run.msmarco-passage.dev.small.tsv -format msmarco \
  -parallelism 4 \
  -bm25 "-bm25.k1" 0.82 "-bm25.b" 0.68 -hits 1000
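
In a POSIX shell such as the bash used under WSL, the quoted and unquoted forms should be equivalent, because the shell strips the quotes before the program receives its arguments. The quoting presumably matters only in PowerShell, where an unquoted argument that starts with a dash and contains a period can be split before it reaches a native program; that explanation is an assumption and is not stated above. A minimal sketch to check the bash side, with `show_args` as a hypothetical helper that prints one argument per line:

```shell
#!/usr/bin/env bash
# Print each received argument on its own line so argument boundaries are visible.
show_args() {
  printf '%s\n' "$@"
}

# Unquoted form, as in the original command.
unquoted=$(show_args -bm25 -bm25.k1 0.82 -bm25.b 0.68 -hits 1000)
# Quoted form, as in the modified command.
quoted=$(show_args -bm25 "-bm25.k1" 0.82 "-bm25.b" 0.68 -hits 1000)

if [ "$unquoted" = "$quoted" ]; then
  echo "identical argument lists"
else
  echo "argument lists differ"
fi
```

Under bash this prints `identical argument lists`, so the quoted form is safe to keep in documentation that targets both shells.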

In summary, I initially attempted the tasks on Windows, primarily using the PowerShell terminal in VS Code. After encountering errors and making partial progress, I completed the remaining tasks using WSL (Ubuntu). With the changes described above, I was able to reproduce all of the expected results.

codecov bot commented Mar 29, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 66.88%. Comparing base (5fb697a) to head (d4dc380).
Report is 1 commits behind head on master.

Additional details and impacted files
@@            Coverage Diff            @@
##             master    #2432   +/-   ##
=========================================
  Coverage     66.88%   66.88%           
  Complexity     1410     1410           
=========================================
  Files           208      208           
  Lines         12110    12110           
  Branches       1487     1487           
=========================================
  Hits           8100     8100           
  Misses         3494     3494           
  Partials        516      516           


@lintool merged commit 279fc3e into castorini:master on Mar 29, 2024
3 checks passed