Expand, Highlight, Generate: RL-Driven Document Generation for Passage Reranking

This repository contains the official code for the paper "Expand, Highlight, Generate: RL-Driven Document Generation for Passage Reranking," accepted to the main track of EMNLP 2023.

If you use this work, please cite it with the following BibTeX entry:

@inproceedings{askari-etal-2023-expand,
    title = "Expand, Highlight, Generate: {RL}-driven Document Generation for Passage Reranking",
    author = "Askari, Arian  and
      Aliannejadi, Mohammad  and
      Meng, Chuan  and
      Kanoulas, Evangelos  and
      Verberne, Suzan",
    editor = "Bouamor, Houda  and
      Pino, Juan  and
      Bali, Kalika",
    booktitle = "Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing",
    month = dec,
    year = "2023",
    address = "Singapore",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2023.emnlp-main.623",
    pages = "10087--10099",
}

DocGen: (1) Expand, (2) Highlight, (3) Generate

Explore the capabilities of our DocGen pipeline by running EMNLP23-ChatGPT-RetrievalQA-Document-Generator-Demo.ipynb. This notebook provides an example of the DocGen pipeline, showcasing the following steps:

Expanding a query.
Highlighting its tokens.
Generating a synthetic document.

Furthermore, we provide examples of experimenting with different highlighting tokens, such as "<>", "*", and "()".
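As a rough illustration of the highlighting step, the sketch below wraps chosen query terms in one of the marker styles mentioned above and places the result in a generation prompt. It is not the notebook's code: the function names `highlight` and `build_prompt`, the `MARKERS` table, the choice of which terms to highlight, and the prompt wording are all assumptions made for this example.

```python
import re

# Marker styles mentioned in this README, mapped to (left, right) delimiters.
# The mapping itself is an assumption made for this sketch.
MARKERS = {"<>": ("<", ">"), "*": ("*", "*"), "()": ("(", ")")}


def highlight(query, terms, marker="<>"):
    """Wrap each whole-word occurrence of `terms` in the chosen marker."""
    if not terms:
        return query
    left, right = MARKERS[marker]
    # Try longer terms first so overlapping terms match the longest form.
    ordered = sorted(terms, key=len, reverse=True)
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(t) for t in ordered) + r")\b",
        re.IGNORECASE,
    )
    return pattern.sub(lambda m: f"{left}{m.group(0)}{right}", query)


def build_prompt(highlighted_query):
    """Hypothetical prompt template; the notebook's actual wording may differ."""
    return (
        "Write a passage that answers the following query, "
        f"focusing on the highlighted words: {highlighted_query}"
    )


if __name__ == "__main__":
    query = "what causes northern lights"
    for marker in MARKERS:
        print(build_prompt(highlight(query, ["northern", "lights"], marker)))
```

Swapping the `marker` argument makes it easy to compare how each highlighting style changes the prompt that is sent to the generator.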

Generated Data

Check out the generated data, including synthetic expanded queries, highlighted queries, and generated documents, by exploring the generated_data directory.

DocGen-RL

We use RL4LM to train DocGen-RL and will release the cleaned implementation soon.
