add replication note on Cluweb12-B13 #590

Merged 1 commit on Apr 15, 2019

matthew-z
Contributor

Hi,

I tried to replicate the results on ClueWeb12-B13 with the latest version of Anserini.
Most algorithms are OK, but I found that the AX-related algorithms show a slight regression:

For example,

QL+AX

NDCG@20   My Result   Expected
201-250   0.09833     0.1143
251-300   0.08103     0.1001

ERR@20    My Result   Expected
201-250   0.09833     0.0780
251-300   0.09256     0.0896

BM25+AX

NDCG@20   My Result   Expected
201-250   0.11381     0.1287
251-300   0.08010     0.0964

ERR@20    My Result   Expected
201-250   0.09256     0.0780
251-300   0.07969     0.0929

The others (QL, QL+RM3, BM25, BM25+RM3) aligned well with https://github.com/castorini/Anserini/blob/master/docs/experiments-cw12b13.md
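To put a number on the gap, here is a minimal sketch that computes the relative difference between the observed and expected NDCG@20 values listed above. It is not part of Anserini; the names `ndcg20`, `relative_diff`, and `report` are hypothetical and used only for illustration.

```python
# Values copied from the NDCG@20 tables above: (my result, expected).
ndcg20 = {
    ("QL+AX", "201-250"): (0.09833, 0.1143),
    ("QL+AX", "251-300"): (0.08103, 0.1001),
    ("BM25+AX", "201-250"): (0.11381, 0.1287),
    ("BM25+AX", "251-300"): (0.08010, 0.0964),
}


def relative_diff(observed, expected):
    """Signed relative difference of observed with respect to expected."""
    return (observed - expected) / expected


def report(scores):
    """Return one formatted line per (model, topics) pair."""
    lines = []
    for (model, topics), (observed, expected) in scores.items():
        diff = relative_diff(observed, expected)
        lines.append(
            f"{model:8s} {topics}  {observed:.4f} vs {expected:.4f}  ({diff:+.1%})"
        )
    return lines


for line in report(ndcg20):
    print(line)
```

A negative percentage means my run scored below the documented value. The same helper can be pointed at the other metrics by swapping in their numbers.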

@lintool
Member

lintool commented Apr 14, 2019

hi @matthew-z - thanks for this.

I'd like to see if we can understand the AX behavior...

A few questions:

  • For AX, did AP and P30 match?
  • What OS are you running on?
  • What version of the JVM?

@matthew-z
Contributor Author

@lintool Thank you for the prompt reply.

MAP and P30 show the same regression:

QL+AX

MAP       My Result   Expected
201-250   0.0275      0.0359
251-300   0.0137      0.0186

P30       My Result   Expected
201-250   0.1313      0.1513
251-300   0.0907      0.1167

BM25+AX

MAP       My Result   Expected
201-250   0.0338      0.0435
251-300   0.0133      0.0180

P30       My Result   Expected
201-250   0.1440      0.1840
251-300   0.0920      0.1107

My environment:

OS: Ubuntu 16.04.5 LTS (GNU/Linux 4.4.0-83-generic x86_64)
JAVA: openjdk version "1.8.0_191"

@lintool
Member

lintool commented Apr 15, 2019

Hrm. I just reran on my end and everything matched.

The only thing I can think of at the moment is OpenJDK vs. Oracle.

I'm going to approve this PR and circle back to look at this issue...

@lintool lintool merged commit 4a7148b into castorini:master Apr 15, 2019
@matthew-z matthew-z deleted the replicate_clueweb12b13 branch April 15, 2019 05:05
@matthew-z
Contributor Author

Thank you! I will also try the Oracle JVM on my side.

@lintool
Member

lintool commented Apr 15, 2019

Great! In the meantime, I've merged https://github.com/castorini/Anserini/blob/master/docs/experiments-cw12b13.md

Also, I've created #592

@lintool
Member

lintool commented Oct 17, 2019

Whatever the cause, Oracle vs. OpenJDK isn't it. All regressions run fine on damiano:

[jimmylin@damiano ~]$ java --version
openjdk 11.0.4 2019-07-16 LTS
OpenJDK Runtime Environment 18.9 (build 11.0.4+11-LTS)
OpenJDK 64-Bit Server VM 18.9 (build 11.0.4+11-LTS, mixed mode, sharing)
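
For lining up environments across reports like the two in this thread, a small sketch that pulls the major Java version out of `java -version` / `java --version` output. The function name `java_major_version` is hypothetical, and the parsing assumes the first dotted number in the output is the version string. It has only been considered against the two output formats quoted in this thread.

```python
import re


def java_major_version(version_output):
    """Extract the major Java version from `java -version` style output.

    Assumption: the first number in the output starts the version string,
    and releases before Java 9 use the legacy "1.x" scheme (1.8 means Java 8).
    """
    match = re.search(r"(\d+)(?:\.(\d+))?", version_output)
    if match is None:
        raise ValueError("no version number found")
    first = int(match.group(1))
    second = match.group(2)
    if first == 1 and second is not None:
        # Legacy scheme: "1.8.0_191" is Java 8.
        return int(second)
    return first


reported = 'openjdk version "1.8.0_191"'   # from the environment report above
rerun = "openjdk 11.0.4 2019-07-16 LTS"     # from the output above
print(java_major_version(reported), java_major_version(rerun))
```

Both runs are OpenJDK builds, but of different major versions, so the vendor alone does not separate the two environments.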
