Anserini Release Notes (v0.9.1)

Release date: May 6, 2020

  • Integrated metadata from CSV with JSON of full-text articles in CORD-19.
  • Renamed Covid* to Cord19* to more accurately reflect the name of the corpus.
  • Updated support for CORD-19, up through the data drop of 2020/05/01.
  • Added manual blacklist to skip outlier articles in CORD-19.
  • Added query generator (and output queries) from the University of Delaware for TREC-COVID (round 1).
  • Added instructions for generating baseline runs for TREC-COVID (round 1).
  • Added topics for TREC-COVID (round 2).
  • Added collection support for 20Newsgroups.
  • Added support for taking stopwords from an external file.
  • Added ability to compute document frequency for phrases.
  • Added support for MS MARCO documents in Elasticsearch.
  • Improved support for multiple vectors with same id in nearest neighbor search.
  • Fixed bug in Solrini regression for MS MARCO document.
  • Fixed out-of-date documentation for MS MARCO regressions.

Contributors (This Release)

Sorted by number of commits:

All Contributors

Sorted by number of commits, according to GitHub: