Make it possible to use Jelinek-Mercer QL scoring model #465

tteofili · 2018-11-02T10:23:36Z

No description provided.

Peilin-Yang · 2018-11-02T14:16:48Z

src/main/java/io/anserini/search/SearchArgs.java

+  @Option(name = "-qlmj", usage = "use Jelinek-Mercer query likelihood scoring model")
+  public boolean qlmj = false;
+
+  @Option(name = "-lambda", metaVar = "[value]", usage = "Jelinek Mercer smoothing parameter")


Can we rename it to qlmj.lambda?
I know the BM25 and QL parameters are just b and mu, and we should have renamed those too.
But those two have been there since the very beginning of Anserini, and renaming them would break the regression tests.
If you look at other ranking functions such as pl2, you will see that its parameter is named -pl2.c.

Also, since qlmj is a base ranking model, could you please move these two options right under ql?

Thanks

sure, will do.
actually it should be qljm (Query Likelihood Jelinek Mercer) instead of qlmj.

Yes, and technically -ql should be -qld (for Query Likelihood Dirichlet). Such a change would likely break lots of regression testing scripts... :(

But perhaps worth filing an issue?

maybe -qld can be added, which would work exactly like -ql for backward compatibility.

This is a good idea.
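
The naming and alias idea from this thread can be sketched as follows. This is a minimal, stdlib-only illustration, not Anserini's actual `SearchArgs` (which declares its options with `@Option` annotations, as in the diff above). The class `ScoringFlags`, its field names, and the default lambda value are all hypothetical; only the flag names `-ql`, `-qld`, `-qljm`, and `-qljm.lambda` follow the discussion.

```java
/** Hypothetical stand-in for the option handling discussed above (not Anserini code). */
public class ScoringFlags {
  boolean ql = false;       // Dirichlet-smoothed query likelihood
  boolean qljm = false;     // Jelinek-Mercer-smoothed query likelihood
  float qljmLambda = 0.1f;  // illustrative default, not taken from the PR

  static ScoringFlags parse(String... args) {
    ScoringFlags flags = new ScoringFlags();
    for (int i = 0; i < args.length; i++) {
      switch (args[i]) {
        case "-ql":
        case "-qld": // alias: same effect as -ql, so existing scripts keep working
          flags.ql = true;
          break;
        case "-qljm":
          flags.qljm = true;
          break;
        case "-qljm.lambda": // parameter prefixed with the model name, like -pl2.c
          if (i + 1 >= args.length) {
            throw new IllegalArgumentException("-qljm.lambda needs a value");
          }
          flags.qljmLambda = Float.parseFloat(args[++i]);
          break;
        default:
          throw new IllegalArgumentException("unknown option: " + args[i]);
      }
    }
    return flags;
  }

  public static void main(String[] args) {
    // Old and new spellings select the same model.
    System.out.println(parse("-ql").ql == parse("-qld").ql);
    ScoringFlags jm = parse("-qljm", "-qljm.lambda", "0.5");
    System.out.println(jm.qljm + " " + jm.qljmLambda);
  }
}
```

Because `-qld` falls through to the same branch as `-ql`, regression scripts that still pass `-ql` would be unaffected.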

lintool · 2018-11-02T14:47:04Z

Hi @tteofili thanks for your contributions! We'd love to know what you're using Anserini for?

tteofili · 2018-11-02T14:55:31Z

> Hi @tteofili thanks for your contributions! We'd love to know what you're using Anserini for?

research / evaluations around the usage of embeddings in IR.
so the QL JM model is good to have as an additional baseline.
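
For reference, the textbook form of Jelinek-Mercer smoothing interpolates the document language model with the collection model. The notation below is an assumption for illustration (it does not come from the PR): tf(w, d) is the term frequency, |d| the document length, p(w | C) the collection probability, and lambda the smoothing parameter set by the option under review.

```latex
p_\lambda(w \mid d) = (1 - \lambda)\,\frac{\mathrm{tf}(w, d)}{|d|} + \lambda\, p(w \mid C),
\qquad
\log p(q \mid d) = \sum_{w \in q} \log p_\lambda(w \mid d)
```

With lambda close to 0 the score relies mostly on the document's own term counts; with lambda close to 1 it leans on collection statistics.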

lintool · 2018-11-02T15:15:51Z

@tteofili We've long discussed integrating word embeddings into Anserini directly... i.e., using a Lucene index as a simple key-value store for looking up embedding vectors. Is this a feature you'd need? If so, would you be interested in helping us build it out?

tteofili · 2018-11-02T16:10:26Z

@lintool sure, I would be very much interested.
I have a prototype Reranker I'm currently using, that can fetch existing stored embeddings or calculate them on the fly.
The not-so-nice part is having to choose a linear algebra / DL library, unless we work with hundred-dimensional extensions of Lucene PointValues.

lintool · 2018-11-02T19:48:57Z

@tteofili opening this thread for discussion #467

lintool · 2018-11-21T12:19:22Z

@tteofili were you planning on updating the PR per review comments, or should I close for now?

tteofili · 2018-11-21T13:10:25Z

yes, sure, I'll adjust the PR as per above comments.

lintool · 2018-11-28T20:54:15Z

hi @tteofili I'm going to close this PR for now... this conflicts with a recent change made by @Peilin-Yang that allows much more flexible parameter sweeping. If you still want to contribute code, I think it'll be easier to start from scratch off the current master.

tteofili · 2018-11-29T08:13:35Z

@lintool sure, thanks, it makes sense.

tteofili added 3 commits November 2, 2018 11:18

added qlmj (LMJerelinkerMercerSimilarity) scoring model

4a4d3cd

added qlmj (LMJerelinkerMercerSimilarity) scoring model

c9bd276

added qlmj (LMJerelinkerMercerSimilarity) scoring model

b50c870

Peilin-Yang reviewed Nov 2, 2018

lintool mentioned this pull request Nov 2, 2018

Open thread for discussion: integration of word embeddings in Anserini #467

Closed

updated qljm parameters naming, added qld param as equivalent to ql

c64d03b

lintool closed this Nov 28, 2018

tteofili mentioned this pull request Nov 30, 2018

add Jelinek-Mercer (Dirichlet) QL scoring #491

Merged
