Awesome Question Answering
-
MedMCQA: A Large-scale Multi-Subject Multi-Choice Dataset for Medical domain Question Answering https://proceedings.mlr.press/v174/pal22a.html
-
Medical Exam Question Answering with Large-scale Reading Comprehension https://arxiv.org/pdf/1802.10279.pdf
-
PubMedQA: A Dataset for Biomedical Research Question Answering https://www.aclweb.org/anthology/D19-1259.pdf
-
emrQA: A Large Corpus for Question Answering on Electronic Medical Records https://www.aclweb.org/anthology/D18-1258.pdf
-
QASC: A Dataset for Question Answering via Sentence Composition https://arxiv.org/pdf/1910.11473.pdf
-
MMM: Multi-stage Multi-task Learning for Multi-choice Reading Comprehension https://arxiv.org/pdf/1910.00458.pdf
-
SOCIAL IQA: Commonsense Reasoning about Social Interactions https://arxiv.org/pdf/1904.09728.pdf
-
Improving Question Answering with External Knowledge https://arxiv.org/pdf/1902.00993.pdf
-
Question Answering as Global Reasoning over Semantic Abstractions https://arxiv.org/pdf/1906.03672.pdf
-
Automatic Question Answering for Medical MCQs: Can It Go Further than Information Retrieval? https://www.aclweb.org/anthology/R19-1049.pdf
-
Overview of the MEDIQA 2019 Shared Task on Textual Inference, Question Entailment and Question Answering https://www.aclweb.org/anthology/W19-5039.pdf
-
HEAD-QA: A Healthcare Dataset for Complex Reasoning https://www.aclweb.org/anthology/P19-1092.pdf
-
Improving Retrieval-Based Question Answering with Deep Inference Models https://arxiv.org/pdf/1812.02971.pdf
-
Explain Yourself! Leveraging Language Models for Commonsense Reasoning https://arxiv.org/pdf/1906.02361.pdf
-
GenNet: Reading Comprehension with Multiple Choice Questions using Generation and Selection model https://arxiv.org/abs/2003.04360
-
Rethinking the Value of Transformer Components https://www.aclweb.org/anthology/2020.coling-main.529.pdf
-
A Corpus for Evidence Based Medicine Summarisation https://www.aclweb.org/anthology/U10-1012.pdf
-
An overview of the BIOASQ large-scale biomedical semantic indexing and question answering competition https://bmcbioinformatics.biomedcentral.com/articles/10.1186/s12859-015-0564-6
-
Question-Driven Summarization of Answers to Consumer Health Questions https://arxiv.org/pdf/2005.09067.pdf
-
A Question-Entailment Approach to Question Answering https://arxiv.org/pdf/1901.08079.pdf
-
A dataset of clinically generated visual questions and answers about radiology images https://www.nature.com/articles/sdata2018251
-
Overview of the Medical Question Answering Task at TREC 2017 LiveQA https://lhncbc.nlm.nih.gov/system/files/pub9773.pdf
-
A Survey of Datasets for Biomedical Question Answering Systems https://thesai.org/Downloads/Volume8No7/Paper_67-A_Survey_of_Datasets_for_Biomedical_Question.pdf
-
Lessons from Natural Language Inference in the Clinical Domain https://arxiv.org/pdf/1808.06752.pdf
-
Beyond SQuAD: How to Apply a Transformer QA Model to Your Data https://qa.fastforwardlabs.com/domain%20adaptation/transfer%20learning/specialized%20datasets/qa/medical%20qa/2020/07/22/QA-for-Specialized-Data.html
-
Applying deep matching networks to Chinese medical question answering: a study and a dataset https://bmcmedinformdecismak.biomedcentral.com/articles/10.1186/s12911-019-0761-8
-
The AI2 Reasoning Challenge (ARC) dataset http://ai2-website.s3.amazonaws.com/publications/AI2ReasoningChallenge2018.pdf
-
Interpretation of Natural Language Rules in Conversational Machine Reading https://arxiv.org/pdf/1809.01494.pdf
-
MCTest: A Challenge Dataset for the Open-Domain Machine Comprehension of Text https://www.aclweb.org/anthology/D13-1020.pdf
-
The Complexity of Math Problems – Linguistic, or Computational? https://uclnlp.github.io/ai4exams/_papers/I13-1009.pdf
-
University Entrance Examinations as a Benchmark Resource for NLP-based Problem Solving https://uclnlp.github.io/ai4exams/_papers/I13-1192.pdf
-
Overview of Todai Robot Project and Evaluation Framework of its NLP-based Problem Solving https://uclnlp.github.io/ai4exams/_papers/todai_overview.pdf
-
Machine Comprehension with Discourse Relations https://uclnlp.github.io/ai4exams/_papers/P15-1121.pdf
-
Machine Comprehension with Syntax, Frames, and Semantics https://uclnlp.github.io/ai4exams/_papers/P15-2115.pdf
-
Learning Answer-Entailing Structures for Machine Comprehension https://uclnlp.github.io/ai4exams/_papers/P15-1024.pdf
-
A Strong Lexical Matching Method for the Machine Comprehension Test https://uclnlp.github.io/ai4exams/_papers/D15-1197.pdf
-
Teaching Machines to Read and Comprehend https://arxiv.org/pdf/1506.03340.pdf
-
Towards AI-Complete Question Answering: A Set of Prerequisite Toy Tasks https://arxiv.org/pdf/1502.05698.pdf
-
The Goldilocks Principle: Reading Children’s Books with Explicit Memory Representations https://research.fb.com/wp-content/uploads/2016/11/the_goldilocks_principle_reading_children_s_books_with_explicit_memory_representations.pdf
-
Machine Comprehension Based on Learning to Rank https://uclnlp.github.io/ai4exams/_papers/1605.03284v2.pdf
-
Attention-Based Convolutional Neural Network for Machine Comprehension https://uclnlp.github.io/ai4exams/_papers/1602.04341v1.pdf
-
Dynamic Entity Representation with Max-pooling Improves Machine Reading https://uclnlp.github.io/ai4exams/_papers/N16-1099.pdf
-
A Parallel-Hierarchical Model for Machine Comprehension on Sparse Data https://uclnlp.github.io/ai4exams/_papers/1603.08884.pdf
-
A Thorough Examination of the CNN/Daily Mail Reading Comprehension Task https://uclnlp.github.io/ai4exams/_papers/1606.02858v1.pdf
-
SQuAD: 100,000+ Questions for Machine Comprehension of Text https://uclnlp.github.io/ai4exams/_papers/1606.05250v1.pdf
-
CliCR: A Dataset of Clinical Case Reports for Machine Reading Comprehension https://arxiv.org/pdf/1803.09720.pdf
-
CODAH: An Adversarially Authored Question-Answer Dataset for Common Sense https://arxiv.org/pdf/1904.04365.pdf
-
CoQA: A Conversational Question Answering Challenge https://arxiv.org/pdf/1808.07042.pdf
-
HOTPOTQA: A Dataset for Diverse, Explainable Multi-hop Question Answering https://www.aclweb.org/anthology/D18-1259.pdf
-
MS MARCO: A Human Generated MAchine Reading COmprehension Dataset https://arxiv.org/pdf/1611.09268.pdf
-
Looking Beyond the Surface: A Challenge Set for Reading Comprehension over Multiple Sentences https://www.aclweb.org/anthology/N18-1023.pdf
-
Natural Questions: A Benchmark for Question Answering Research https://persagen.com/files/misc/kwiatkowski2019natural.pdf
-
NEWSQA: A Machine Comprehension Dataset https://arxiv.org/pdf/1611.09830.pdf
-
Constructing Datasets for Multi-hop Reading Comprehension Across Documents https://arxiv.org/pdf/1710.06481.pdf
-
QuAC: Question Answering in Context https://arxiv.org/pdf/1808.07036.pdf
-
RACE: Large-scale ReAding Comprehension Dataset From Examinations https://arxiv.org/pdf/1704.04683.pdf
-
LSDSem 2017 Shared Task: The Story Cloze Test http://aclweb.org/anthology/W17-0906.pdf
-
SWAG: A Large-Scale Adversarial Dataset for Grounded Commonsense Inference https://arxiv.org/abs/1808.05326
-
RecipeQA: A Challenge Dataset for Multimodal Comprehension of Cooking Recipes https://www.aclweb.org/anthology/D18-1166.pdf
-
The NarrativeQA Reading Comprehension Challenge https://arxiv.org/pdf/1712.07040.pdf
-
DROP: A Reading Comprehension Benchmark Requiring Discrete Reasoning Over Paragraphs https://arxiv.org/pdf/1903.00161.pdf
-
DuoRC: Towards Complex Language Understanding with Paraphrased Reading Comprehension https://arxiv.org/pdf/1804.07927.pdf
-
COSMOS QA: Machine Reading Comprehension with Contextual Commonsense Reasoning https://arxiv.org/pdf/1909.00277.pdf
-
RECLOR: A Reading Comprehension Dataset Requiring Logical Reasoning https://openreview.net/pdf?id=HJgJtT4tvB
-
DuReader: a Chinese Machine Reading Comprehension Dataset from Real-world Applications https://www.aclweb.org/anthology/W18-2605.pdf
-
QUASAR: Datasets for Question Answering by Search and Reading https://arxiv.org/pdf/1707.03904.pdf
-
SearchQA: A New Q&A Dataset Augmented with Context from a Search Engine https://arxiv.org/pdf/1704.05179.pdf
-
9th Challenge on Question Answering over Linked Data (QALD-9) http://ceur-ws.org/Vol-2241/paper-06.pdf
-
Good for figures and other ideas https://qa.fastforwardlabs.com/domain%20adaptation/transfer%20learning/specialized%20datasets/qa/medical%20qa/2020/07/22/QA-for-Specialized-Data.html
-
BioASQ: A Challenge on Large-Scale Biomedical Semantic Indexing and Question Answering http://bioasq.org/resources/papers/Tsatsaronis_et_al_AAAIIRKD2012_CR.pdf
-
COVID-QA: A Question Answering Dataset for COVID-19 https://www.aclweb.org/anthology/2020.nlpcovid19-acl.18.pdf
-
Semantic Parsing on Freebase from Question-Answer Pairs https://cs.stanford.edu/~pliang/papers/freebase-emnlp2013.pdf
-
Large-scale Simple Question Answering with Memory Networks https://arxiv.org/pdf/1506.02075.pdf
-
Generating Factoid Questions With Recurrent Neural Networks: The 30M Factoid Question-Answer Corpus https://www.aclweb.org/anthology/P16-1056.pdf
-
On Generating Characteristic-rich Question Sets for QA Evaluation https://sites.cs.ucsb.edu/~ysu/papers/emnlp16_graphquestions.pdf
-
TriviaQA: A Large Scale Distantly Supervised Challenge Dataset for Reading Comprehension https://www.aclweb.org/anthology/P17-1147.pdf
-
LC-QuAD: A Corpus for Complex Question Answering over Knowledge Graphs http://jens-lehmann.org/files/2017/iswc_lcquad.pdf
-
Know What You Don’t Know: Unanswerable Questions for SQuAD https://arxiv.org/pdf/1806.03822.pdf
-
The Web as a Knowledge-base for Answering Complex Questions https://arxiv.org/abs/1803.06643
-
FreebaseQA: A New Factoid QA Data Set Matching Trivia-Style Question-Answer Pairs with Freebase https://www.aclweb.org/anthology/N19-1028.pdf
-
ComQA: A Community-sourced Dataset for Complex Factoid Question Answering with Paraphrase Clusters https://arxiv.org/pdf/1809.09528.pdf
-
Natural Questions: A Benchmark for Question Answering Research https://www.aclweb.org/anthology/Q19-1026.pdf
-
Measuring Compositional Generalization: A Comprehensive Method on Realistic Data https://arxiv.org/pdf/1912.09713.pdf
-
Large-scale Semantic Parsing via Schema Matching and Lexicon Extension https://www.aclweb.org/anthology/P13-1042.pdf
-
WIKIQA: A Challenge Dataset for Open-Domain Question Answering https://pdfs.semanticscholar.org/8685/671ad8b1b1b5fe2b108c7002662b582ba277.pdf
-
Good Question! Statistical Ranking for Question Generation https://www.aclweb.org/anthology/N10-1086.pdf
-
Generating Natural Language Question-Answer Pairs from a Knowledge Graph Using a RNN Based Question Generation Model https://www.aclweb.org/anthology/E17-1036.pdf
-
Learning to Ask: Neural Question Generation for Reading Comprehension https://arxiv.org/pdf/1705.00106.pdf
-
Neural Question Generation from Text: A Preliminary Study https://arxiv.org/pdf/1704.01792.pdf
-
Machine Comprehension by Text-to-Text Neural Question Generation https://arxiv.org/pdf/1705.02012.pdf
-
Paragraph-level Neural Question Generation with Maxout Pointer and Gated Self-attention Networks https://www.aclweb.org/anthology/D18-1424.pdf
-
Capturing Greater Context for Question Generation https://arxiv.org/pdf/1910.10274.pdf
-
Asking Questions the Human Way: Scalable Question-Answer Generation from Text Corpus https://arxiv.org/pdf/2002.00748.pdf
-
Constructing A Multi-hop QA Dataset for Comprehensive Evaluation of Reasoning Steps https://arxiv.org/pdf/2011.01060.pdf
-
WIKIREADING: A Novel Large-scale Language Understanding Task over Wikipedia https://www.aclweb.org/anthology/P16-1145.pdf
-
MKQA: A Linguistically Diverse Benchmark for Multilingual Open Domain Question Answering https://arxiv.org/pdf/2007.15207.pdf
-
KQA Pro: A Large-Scale Dataset with Interpretable Programs and Accurate SPARQLs for Complex Question Answering over Knowledge Base https://arxiv.org/pdf/2007.03875.pdf
-
AMBIGQA: Answering Ambiguous Open-domain Questions https://arxiv.org/pdf/2004.10645.pdf
-
TYDI QA: A Benchmark for Information-Seeking Question Answering in Typologically Diverse Languages https://arxiv.org/pdf/2003.05002.pdf
-
MLQA: Evaluating Cross-lingual Extractive Question Answering https://arxiv.org/pdf/1910.07475.pdf
-
BREAK It Down: A Question Understanding Benchmark https://arxiv.org/pdf/2001.11770v1.pdf
-
DeFormer: Decomposing Pre-trained Transformers for Faster Question Answering https://www.aclweb.org/anthology/2020.acl-main.411.pdf
-
RuBQ: A Russian Dataset for Question Answering over Wikidata https://arxiv.org/pdf/2005.10659.pdf
-
Audio Visual Scene-Aware Dialog https://arxiv.org/pdf/1901.09107.pdf
-
XCOPA: A Multilingual Dataset for Causal Commonsense Reasoning https://www.aclweb.org/anthology/2020.emnlp-main.185.pdf
-
DramaQA: Character-Centered Video Story Understanding with Hierarchical QA https://arxiv.org/pdf/2005.03356.pdf
-
MATINF: A Jointly Labeled Large-Scale Dataset for Classification, Question Answering and Summarization https://www.aclweb.org/anthology/2020.acl-main.330.pdf
-
Event-QA: A Dataset for Event-Centric Question Answering over Knowledge Graphs https://arxiv.org/pdf/2004.11861.pdf
-
SelQA: A New Benchmark for Selection-based Question Answering https://arxiv.org/pdf/1606.08513.pdf
-
Search-based Neural Structured Learning for Sequential Question Answering https://people.cs.umass.edu/~miyyer/pubs/2017_acl_dynsp.pdf
-
Compositional Semantic Parsing on Semi-Structured Tables https://arxiv.org/pdf/1508.00305.pdf
-
Complex Sequential Question Answering: Towards Learning to Converse Over Linked Question Answer Pairs with a Knowledge Graph https://aaai.org/ocs/index.php/AAAI/AAAI18/paper/viewFile/17181/15750
-
HybridQA: A Dataset of Multi-Hop Question Answering over Tabular and Textual Data https://www.aclweb.org/anthology/2020.findings-emnlp.91.pdf
-
Variational Reasoning for Question Answering with Knowledge Graph https://arxiv.org/pdf/1709.04071.pdf
-
MathQA: Towards Interpretable Math Word Problem Solving with Operation-Based Formalisms https://arxiv.org/pdf/1905.13319.pdf
-
A Span-Extraction Dataset for Chinese Machine Reading Comprehension https://www.aclweb.org/anthology/D19-1600.pdf
-
DRCD: a Chinese Machine Reading Comprehension Dataset https://arxiv.org/ftp/arxiv/papers/1806/1806.00920.pdf
-
KorQuAD1.0: Korean QA Dataset for Machine Reading Comprehension https://arxiv.org/pdf/1909.07005.pdf
-
SberQuAD – Russian Reading Comprehension Dataset: Description and Analysis https://arxiv.org/pdf/1912.09723.pdf
-
FQuAD: French Question Answering Dataset https://www.aclweb.org/anthology/2020.findings-emnlp.107.pdf
-
On the Cross-lingual Transferability of Monolingual Representations https://arxiv.org/pdf/1910.11856.pdf
-
Neural Arabic Question Answering https://arxiv.org/pdf/1906.05394.pdf
-
LC-QuAD 2.0: A Large Dataset for Complex Question Answering over Wikidata and DBpedia http://jens-lehmann.org/files/2019/iswc_lcquad2.pdf
-
Deep Bidirectional Transformers for Italian Question Answering http://ceur-ws.org/Vol-2481/paper25.pdf
-
ELI5: Long Form Question Answering https://arxiv.org/pdf/1907.09190.pdf
-
QUOREF: A Reading Comprehension Dataset with Questions Requiring Coreferential Reasoning https://www.aclweb.org/anthology/D19-1606.pdf
-
CLEVR: A Diagnostic Dataset for Compositional Language and Elementary Visual Reasoning http://vision.stanford.edu/pdf/johnson2017cvpr.pdf
-
A Corpus for Reasoning About Natural Language Grounded in Photographs https://arxiv.org/pdf/1811.00491.pdf
-
FVQA: Fact-based Visual Question Answering https://arxiv.org/pdf/1606.05433.pdf
-
Applying Deep Learning to Answer Selection: A Study and an Open Task https://arxiv.org/pdf/1508.01585.pdf
-
PIQA: Reasoning about Physical Commonsense in Natural Language https://arxiv.org/pdf/1911.11641.pdf
-
Large-Scale QA-SRL Parsing https://arxiv.org/pdf/1805.05377v1.pdf
-
Zero-Shot Relation Extraction via Reading Comprehension https://arxiv.org/pdf/1706.04115.pdf
-
Learning Cooperative Visual Dialog Agents with Deep Reinforcement Learning https://arxiv.org/pdf/1703.06585.pdf
-
ActivityNet-QA: A Dataset for Understanding Complex Web Videos via Question Answering https://arxiv.org/pdf/1906.02467.pdf
-
Think you have Solved Question Answering? Try ARC, the AI2 Reasoning Challenge https://arxiv.org/pdf/1803.05457.pdf
-
Program Induction by Rationale Generation: Learning to Solve and Explain Algebraic Word Problems https://arxiv.org/pdf/1705.04146.pdf
-
BoolQ: Exploring the Surprising Difficulty of Natural Yes/No Questions https://arxiv.org/pdf/1905.10044.pdf
-
COMMONSENSEQA: A Question Answering Challenge Targeting Commonsense Knowledge https://arxiv.org/pdf/1811.00937.pdf
-
DVQA: Understanding Data Visualizations via Question Answering https://arxiv.org/pdf/1801.08163.pdf
-
What’s in an Explanation? Characterizing Knowledge and Inference Requirements for Elementary Science Exams https://www.aclweb.org/anthology/C16-1278.pdf
-
GQA: A New Dataset for Real-World Visual Reasoning and Compositional Question Answering https://arxiv.org/pdf/1902.09506.pdf
-
Dialogue Learning with Human-in-the-Loop https://arxiv.org/pdf/1611.09823.pdf
-
Can a Suit of Armor Conduct Electricity? A New Dataset for Open Book Question Answering https://www.aclweb.org/anthology/D18-1260.pdf
-
Tracking State Changes in Procedural Text: A Challenge Dataset and Models for Process Paragraph Comprehension https://arxiv.org/pdf/1805.06975.pdf
-
QUAREL: A Dataset and Models for Answering Questions about Qualitative Relationships https://arxiv.org/pdf/1811.08048.pdf
-
QUARTZ: An Open-Domain Dataset of Qualitative Relationship Questions https://www.aclweb.org/anthology/D19-1608.pdf
-
ReCoRD: Bridging the Gap between Human and Machine Commonsense Reading Comprehension https://arxiv.org/pdf/1810.12885.pdf
-
Crowdsourcing Multiple Choice Science Questions https://arxiv.org/pdf/1707.06209.pdf
-
SemEval-2016 Task 3: Community Question Answering https://www.aclweb.org/anthology/S16-1083.pdf
-
Swag: A Large-Scale Adversarial Dataset for Grounded Commonsense Inference https://www.aclweb.org/anthology/D18-1009.pdf
-
Social-IQ: A Question Answering Benchmark for Artificial Social Intelligence https://openaccess.thecvf.com/content_CVPR_2019/papers/Zadeh_Social-IQ_A_Question_Answering_Benchmark_for_Artificial_Social_Intelligence_CVPR_2019_paper.pdf
-
Are You Smarter Than A Sixth Grader? Textbook Question Answering for Multimodal Machine Comprehension http://ai2-website.s3.amazonaws.com/publications/CVPR17_TQA.pdf
-
Towards VQA Models That Can Read https://arxiv.org/pdf/1904.08920.pdf
-
Dialog-based Language Learning https://arxiv.org/pdf/1604.06045.pdf
-
DailyDialog: A Manually Labelled Multi-turn Dialogue Dataset https://www.aclweb.org/anthology/I17-1099.pdf
-
Key-Value Memory Networks for Directly Reading Documents https://arxiv.org/pdf/1606.03126.pdf
-
What is the Jeopardy Model? A Quasi-Synchronous Grammar for QA https://www.aclweb.org/anthology/D07-1003.pdf
-
From Recognition to Cognition: Visual Commonsense Reasoning https://arxiv.org/pdf/1811.10830.pdf
-
VQA: Visual Question Answering https://arxiv.org/pdf/1505.00468.pdf
-
Who did What: A Large-Scale Person-Centered Cloze Dataset https://arxiv.org/pdf/1608.05457.pdf
-
HOVER: A Dataset for Many-Hop Fact Extraction And Claim Verification https://www.aclweb.org/anthology/2020.findings-emnlp.309.pdf
-
IndoNLU: Benchmark and Resources for Evaluating Indonesian Natural Language Understanding https://arxiv.org/pdf/2009.05387.pdf
-
LAReQA: Language-agnostic answer retrieval from a multilingual pool https://arxiv.org/pdf/2004.05484.pdf
-
SUBJQA: A Dataset for Subjectivity and Review Comprehension https://www.aclweb.org/anthology/2020.emnlp-main.442.pdf
-
EXAMS: A Multi-Subject High School Examinations Dataset for Cross-Lingual and Multilingual Question Answering https://www.aclweb.org/anthology/2020.emnlp-main.438.pdf
-
Open-Retrieval Conversational Question Answering https://arxiv.org/pdf/2005.11364.pdf
-
Beyond I.I.D.: Three Levels of Generalization for Question Answering on Knowledge Bases https://arxiv.org/pdf/2011.07743.pdf
-
RussianSuperGLUE: A Russian Language Understanding Evaluation Benchmark https://arxiv.org/pdf/2010.15925.pdf
-
Open Question Answering over Tables and Text https://arxiv.org/pdf/2010.10439.pdf
-
LiveQA: A Question Answering Dataset over Sports Live https://arxiv.org/pdf/2010.00526.pdf
-
Inquisitive Question Generation for High Level Text Comprehension https://arxiv.org/pdf/2010.01657.pdf
-
Investigating Prior Knowledge for Challenging Chinese Machine Reading Comprehension https://arxiv.org/pdf/1904.09679.pdf
-
LogiQA: A Challenge Dataset for Machine Reading Comprehension with Logical Reasoning https://arxiv.org/pdf/2007.08124.pdf
-
QED: A Framework and Dataset for Explanations in Question Answering https://arxiv.org/pdf/2009.06354.pdf
-
ConvAI3: Generating Clarifying Questions for Open-Domain Dialogue Systems (ClariQ) https://arxiv.org/pdf/2009.11352.pdf
-
Visual Genome https://visualgenome.org/static/paper/Visual_Genome.pdf
-
VisualCOMET: Reasoning about the Dynamic Context of a Still Image https://arxiv.org/pdf/2004.10796.pdf
-
ODSQA: Open-Domain Spoken Question Answering Dataset https://arxiv.org/pdf/1808.02280.pdf
-
Enhancing Lexical-Based Approach with External Knowledge for Vietnamese Multiple-Choice Machine Reading Comprehension https://arxiv.org/pdf/2001.05687.pdf
-
FIGUREQA: An Annotated Figure Dataset for Visual Reasoning https://arxiv.org/pdf/1710.07300.pdf
-
TVQA: Localized, Compositional Video Question Answering https://arxiv.org/pdf/1809.01696.pdf
-
MovieQA: Understanding Stories in Movies through Question-Answering https://arxiv.org/pdf/1512.02902.pdf
-
TGIF-QA: Toward Spatio-Temporal Reasoning in Visual Question Answering https://arxiv.org/pdf/1704.04497.pdf
-
SQuINTing at VQA Models: Introspecting VQA Models with Sub-Questions https://www.microsoft.com/en-us/research/uploads/prod/2020/06/SQuINT_CVPR.pdf
-
Open Domain Web Keyphrase Extraction Beyond Language Modeling https://arxiv.org/pdf/1911.02671.pdf
-
DREAM: A Challenge Dataset and Models for Dialogue-Based Reading Comprehension https://arxiv.org/pdf/1902.00164.pdf
-
STARC: Structured Annotations for Reading Comprehension https://arxiv.org/pdf/2004.14797.pdf
-
ANTIQUE: A Non-Factoid Question Answering Benchmark https://arxiv.org/pdf/1905.08957.pdf
-
Quizbowl: The Case for Incremental Question Answering https://arxiv.org/pdf/1904.04792.pdf
-
Evaluating Theory of Mind in Question Answering https://arxiv.org/pdf/1808.09352.pdf
-
SemEval-2018 Task 5: Counting Events and Participants in the Long Tail https://www.aclweb.org/anthology/S18-1009.pdf
-
Code-Mixed Question Answering Challenge: Crowd-sourcing Data and Techniques https://www.aclweb.org/anthology/W18-3204.pdf
-
Modeling Ambiguity, Subjectivity, and Diverging Viewpoints in Opinion Question Answering Systems http://cseweb.ucsd.edu/~jmcauley/pdfs/icdm16c.pdf
-
WikiPassageQA: A Benchmark Collection for Research on Non-factoid Answer Passage Retrieval https://arxiv.org/pdf/1805.03797.pdf
-
Asking Clarifying Questions in Open-Domain Information-Seeking Conversations https://ciir-publications.cs.umass.edu/pub/web/getpdf.php?id=1339
-
POIReviewQA: A Semantically Enriched POI Retrieval and Question Answering Dataset https://arxiv.org/pdf/1810.02802.pdf
-
Farewell Freebase: Migrating the SimpleQuestions Dataset to DBpedia https://www.aclweb.org/anthology/C18-1178.pdf
-
Harvesting Paragraph-Level Question-Answer Pairs from Wikipedia https://arxiv.org/pdf/1805.05942.pdf
-
QALM: a Benchmark for Question Answering over Linked Merchant Websites Data http://ceur-ws.org/Vol-1272/paper_113.pdf
-
Beyond Accuracy: Behavioral Testing of NLP Models with CheckList https://www.aclweb.org/anthology/2020.acl-main.442.pdf
-
Biomedical Question Answering with SDNet https://web.stanford.edu/class/cs224n/reports/custom/15743952.pdf
-
On the Role of Question Summarization and Information Source Restriction in Consumer Health Question Answering https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6568117/
-
An ontology for clinical questions about the contents of patient notes https://sciencedirect.com/science/article/pii/S1532046411001961
-
Verification of the Expected Answer Type for Biomedical Question Answering https://hal.archives-ouvertes.fr/hal-01759306/document
-
Adapting and evaluating a deep learning language model for clinical why-question answering http://shorturl.at/epvJS
-
JEC-QA: A Legal-Domain Question Answering Dataset https://arxiv.org/pdf/1911.12011.pdf
-
Answering Visual What-If Questions: From Actions to Predicted Scene Descriptions https://arxiv.org/pdf/1809.03707.pdf
-
TWEETQA: A Social Media Focused Question Answering Dataset https://www.aclweb.org/anthology/P19-1496.pdf
-
What do Models Learn from Question Answering Datasets? https://www.aclweb.org/anthology/2020.emnlp-main.190.pdf
-
AmazonQA: A Review-Based Question Answering Task https://arxiv.org/pdf/1908.04364.pdf
-
Just Ask: Learning to Answer Questions from Millions of Narrated Videos https://arxiv.org/pdf/2012.00451.pdf
-
ProtoQA: A Question Answering Dataset for Prototypical Common-Sense Reasoning https://www.aclweb.org/anthology/2020.emnlp-main.85.pdf
-
CliniQG4QA: Generating Diverse Questions for Domain Adaptation of Clinical Question Answering https://arxiv.org/pdf/2010.16021.pdf
-
Retrieving and Reading: A Comprehensive Survey on Open-domain Question Answering https://arxiv.org/pdf/2101.00774.pdf
-
How Can We Know When Language Models Know? https://arxiv.org/pdf/2012.00955.pdf
-
Understanding Dataset Design Choices for Multi-hop Reasoning https://arxiv.org/pdf/1904.12106.pdf
-
Cross-Lingual Transfer Learning for Question Answering https://arxiv.org/pdf/1907.06042.pdf
-
LEAF-QA: Locate, Encode & Attend for Figure Question Answering https://arxiv.org/pdf/1907.12861.pdf
-
The TechQA Dataset https://arxiv.org/pdf/1911.02984.pdf
-
MANYMODALQA: Modality Disambiguation and QA over Diverse Inputs https://arxiv.org/pdf/2001.08034.pdf
-
Look at the First Sentence: Position Bias in Question Answering https://arxiv.org/pdf/2004.14602.pdf
-
DoQA - Accessing Domain-Specific FAQs via Conversational QA https://arxiv.org/pdf/2005.01328.pdf
-
MultiReQA: A Cross-Domain Evaluation for Retrieval Question Answering Models https://arxiv.org/pdf/2005.02507.pdf
-
PolicyQA: A Reading Comprehension Dataset for Privacy Policies https://www.aclweb.org/anthology/2020.findings-emnlp.66.pdf
-
GoEmotions: A Dataset of Fine-Grained Emotions https://www.aclweb.org/anthology/2020.acl-main.372.pdf
-
Revisiting the Open-Domain Question Answering Pipeline https://arxiv.org/pdf/2009.00914.pdf
-
When in Doubt, Ask: Generating Answerable and Unanswerable Questions, Unsupervised https://arxiv.org/pdf/2010.01611.pdf
-
A question-answering system for aircraft pilots’ documentation https://arxiv.org/pdf/2011.13284.pdf
-
FORECASTQA: A Question Answering Challenge for Event Forecasting https://arxiv.org/pdf/2005.00792.pdf
-
A Study on Efficiency, Accuracy and Document Structure for Answer Sentence Selection https://arxiv.org/pdf/2003.02349.pdf
-
Ask Me Anything: Dynamic Memory Networks for Natural Language Processing https://arxiv.org/pdf/1506.07285.pdf
-
An Efficient Passage Ranking Technique for a QA System https://pdfs.semanticscholar.org/ec9b/120c4022cebcf8b02e76e56eb00406466ba7.pdf
-
QA-It: Classifying Non-Referential It for Question Answer Pairs https://pdfs.semanticscholar.org/2edb/6575fda3a18db72b34a99702886826cd7093.pdf
-
POIReviewQA: A Semantically Enriched POI Retrieval and Question Answering Dataset https://dl.acm.org/doi/pdf/10.1145/3281354.3281359
-
A Systematic Classification of Knowledge, Reasoning, and Context within the ARC Dataset https://www.aclweb.org/anthology/W18-2607.pdf
-
How You Ask Matters: The Effect of Paraphrastic Questions to BERT Performance on a Clinical SQuAD Dataset https://www.aclweb.org/anthology/2020.clinicalnlp-1.13.pdf
-
Blindfold Baselines for Embodied QA https://arxiv.org/pdf/1811.05013.pdf
-
DeepStory: Video Story QA by Deep Embedded Memory Networks https://arxiv.org/pdf/1707.00836.pdf
-
ScholarlyRead: A New Dataset for Scientific Article Reading Comprehension https://pdfs.semanticscholar.org/ce06/1efbc055c8d1fe491b19161e935254a424c5.pdf
-
ReCO: A Large Scale Chinese Reading Comprehension Dataset on Opinion https://arxiv.org/pdf/2006.12146.pdf
-
Clinical Reading Comprehension: A Thorough Analysis of the emrQA Dataset https://arxiv.org/pdf/2005.00574.pdf
-
XOR QA: Cross-lingual Open-Retrieval Question Answering https://arxiv.org/pdf/2010.11856.pdf
-
FriendsQA: Open-Domain Question Answering on TV Show Transcripts https://pdfs.semanticscholar.org/dbee/2139c68932de618ad1d01844fdb0e90ada5b.pdf
-
Frustratingly Easy Natural Question Answering https://arxiv.org/pdf/1909.05286.pdf
-
ABC-CNN: An Attention Based Convolutional Neural Network for Visual Question Answering https://arxiv.org/pdf/1511.05960.pdf
-
FlowDelta: Modeling Flow Information Gain in Reasoning for Conversational Machine Comprehension https://arxiv.org/pdf/1908.05117.pdf
-
AnswerFact: Fact Checking in Product Question Answering https://pdfs.semanticscholar.org/4c61/df1b4b9a164fec1a34587b4fffae029cd18c.pdf
-
Matching Questions and Answers in Dialogues from Online Forums https://arxiv.org/pdf/2005.09276.pdf
-
Question Answering with Long Multiple-Span Answers https://pdfs.semanticscholar.org/4e76/7b191683dca0ea6d33216508159886144da4.pdf
-
Measuring Emotions in the COVID-19 Real World Worry Dataset https://www.aclweb.org/anthology/2020.nlpcovid19-acl.11.pdf
-
What Are People Asking About COVID-19? A Question Classification Dataset https://www.aclweb.org/anthology/2020.nlpcovid19-acl.8.pdf
-
MultiWOZ 2.2: A Dialogue Dataset with Additional Annotation Corrections and State Tracking Baselines https://www.aclweb.org/anthology/2020.nlp4convai-1.13.pdf
-
Developing a How-to Tip Machine Comprehension Dataset and its Evaluation in Machine Comprehension by BERT https://www.aclweb.org/anthology/2020.fever-1.4.pdf
-
BIOMRC: A Dataset for Biomedical Machine Reading Comprehension https://www.aclweb.org/anthology/2020.bionlp-1.15.pdf
-
CIMA: A Large Open Access Dialogue Dataset for Tutoring https://www.aclweb.org/anthology/2020.bea-1.5.pdf
-
Building a Japanese Typo Dataset from Wikipedia’s Revision History https://www.aclweb.org/anthology/2020.acl-srw.31.pdf
-
SCIREX: A Challenge Dataset for Document-Level Information Extraction https://www.aclweb.org/anthology/2020.acl-main.670.pdf
-
ClarQ: A large-scale and diverse dataset for Clarification Question Generation https://www.aclweb.org/anthology/2020.acl-main.651.pdf
-
SCDE: Sentence Cloze Dataset with High Quality Distractors From Examinations https://www.aclweb.org/anthology/2020.acl-main.502.pdf
-
Storytelling with Dialogue: A Critical Role Dungeons and Dragons Dataset https://www.aclweb.org/anthology/2020.acl-main.459.pdf
-
Fatality Killed the Cat or: BabelPic, a Multimodal Dataset for Non-Concrete Concepts https://www.aclweb.org/anthology/2020.acl-main.425.pdf
-
CH-SIMS: A Chinese Multimodal Sentiment Analysis Dataset with Fine-grained Annotations of Modality https://www.aclweb.org/anthology/2020.acl-main.343.pdf
-
MIND: A Large-scale Dataset for News Recommendation https://www.aclweb.org/anthology/2020.acl-main.331.pdf
-
Will-They-Won’t-They: A Very Large Dataset for Stance Detection on Twitter https://www.aclweb.org/anthology/2020.acl-main.157.pdf
-
MuTual: A Dataset for Multi-Turn Dialogue Reasoning https://www.aclweb.org/anthology/2020.acl-main.130.pdf
-
Transfer Learning in Biomedical Natural Language Processing: An Evaluation of BERT and ELMo on Ten Benchmarking Datasets https://www.aclweb.org/anthology/W19-5006.pdf
-
A distantly supervised dataset for automated data extraction from diagnostic studies https://www.aclweb.org/anthology/W19-5012.pdf
-
A High-Quality Multilingual Dataset for Structured Documentation Translation https://www.aclweb.org/anthology/W19-5212.pdf
-
MELD: A Multimodal Multi-Party Dataset for Emotion Recognition in Conversations https://www.aclweb.org/anthology/P19-1050.pdf
-
DocRED: A Large-Scale Document-Level Relation Extraction Dataset https://www.aclweb.org/anthology/P19-1074.pdf
-
ChID: A Large-scale Chinese IDiom Dataset for Cloze Test https://www.aclweb.org/anthology/P19-1075.pdf
-
Multi-News: a Large-Scale Multi-Document Summarization Dataset and Abstractive Hierarchical Model https://www.aclweb.org/anthology/P19-1102.pdf
-
The PhotoBook Dataset: Building Common Ground through Visually-Grounded Dialogue https://www.aclweb.org/anthology/P19-1184.pdf
-
TALKSUMM: A Dataset and Scalable Annotation Method for Scientific Paper Summarization Based on Conference Talks https://www.aclweb.org/anthology/P19-1204.pdf
-
BIGPATENT: A Large-Scale Dataset for Abstractive and Coherent Summarization https://www.aclweb.org/anthology/P19-1212.pdf
-
XQA: A Cross-lingual Open-domain Question Answering Dataset https://www.aclweb.org/anthology/P19-1227.pdf
-
Dataset Creation for Ranking Constructive News Comments https://www.aclweb.org/anthology/P19-1250.pdf
-
Aiming beyond the Obvious: Identifying Non-Obvious Cases in Semantic Similarity Datasets https://www.aclweb.org/anthology/P19-1268.pdf
-
CONAN - COunter NArratives through Nichesourcing: a Multilingual Dataset of Responses to Fight Online Hate Speech https://www.aclweb.org/anthology/P19-1271.pdf
-
Large Dataset and Language Model Fun-Tuning for Humor Recognition https://www.aclweb.org/anthology/P19-1394.pdf
-
Selection Bias Explorations and Debias Methods for Natural Language Sentence Matching Datasets https://www.aclweb.org/anthology/P19-1435.pdf
-
NNE: A Dataset for Nested Named Entity Recognition in English Newswire https://www.aclweb.org/anthology/P19-1510.pdf
-
Towards Empathetic Open-domain Conversation Models: a New Benchmark and Dataset https://www.aclweb.org/anthology/P19-1534.pdf
-
Active Reading Comprehension: A dataset for learning the Question-Answer Relationship strategy https://www.aclweb.org/anthology/P19-2014.pdf
-
L-HSAB: A Levantine Twitter Dataset for Hate Speech and Abusive Language https://www.aclweb.org/anthology/W19-3512.pdf
-
A Dataset for Noun Compositionality Detection for a Slavic Language https://www.aclweb.org/anthology/W19-3708.pdf
-
A Dataset for Semantic Role Labelling of Hindi-English Code-Mixed Tweets https://www.aclweb.org/anthology/W19-4020.pdf
-
A Repository of Conversational Datasets https://www.aclweb.org/anthology/W19-4101.pdf
-
Conceptual Captions: A Cleaned, Hypernymed, Image Alt-text Dataset For Automatic Image Captioning https://www.aclweb.org/anthology/P18-1238.pdf
-
Learning Translations via Images with a Massively Multilingual Image Dataset https://www.aclweb.org/anthology/P18-1239.pdf
-
SNAG: Spoken Narratives and Gaze Dataset https://www.aclweb.org/anthology/P18-2022.pdf
-
Automatic Article Commenting: the Task and Dataset https://www.aclweb.org/anthology/P18-2025.pdf
-
Predicting accuracy on large datasets from smaller pilot data https://www.aclweb.org/anthology/P18-2072.pdf
-
MeSH-based dataset for measuring the relevance of text retrieval https://www.aclweb.org/anthology/W18-2320.pdf
-
Systematic Error Analysis of the Stanford Question Answering Dataset https://www.aclweb.org/anthology/W18-2602.pdf
-
Japanese Sentence Compression with a Large Training Dataset https://www.aclweb.org/anthology/P17-2044.pdf
-
STAIR Captions: Constructing a Large-Scale Japanese Image Caption Dataset https://www.aclweb.org/anthology/P17-2066.pdf
-
“Liar, Liar Pants on Fire”: A New Benchmark Dataset for Fake News Detection https://www.aclweb.org/anthology/P17-2067.pdf
-
The LAMBADA dataset: Word prediction requiring a broad discourse context https://www.aclweb.org/anthology/P16-1144.pdf
-
IBC-C: A Dataset for Armed Conflict Event Analysis https://www.aclweb.org/anthology/P16-2061.pdf
-
Controlled and Balanced Dataset for Japanese Lexical Simplification https://www.aclweb.org/anthology/P16-3001.pdf
-
A Dataset for Joint Noun–Noun Compound Bracketing and Interpretation https://www.aclweb.org/anthology/P16-3011.pdf
-
RiSAWOZ: A Large-Scale Multi-Domain Wizard-of-Oz Dataset with Rich Semantic Annotations for Task-Oriented Dialogue Modeling https://www.aclweb.org/anthology/2020.emnlp-main.67.pdf
-
IIRC: A Dataset of Incomplete Information Reading Comprehension Questions https://www.aclweb.org/anthology/2020.emnlp-main.86.pdf
-
TORQUE: A Reading Comprehension Dataset of Temporal Ordering Questions https://www.aclweb.org/anthology/2020.emnlp-main.88.pdf
-
ToTTo: A Controlled Table-To-Text Generation Dataset https://www.aclweb.org/anthology/2020.emnlp-main.89.pdf
-
Hashtags, Emotions, and Comments: A Large-Scale Dataset to Understand Fine-Grained Social Emotions to Online Topics https://www.aclweb.org/anthology/2020.emnlp-main.106.pdf
-
MAVEN: A Massive General Domain Event Detection Dataset https://www.aclweb.org/anthology/2020.emnlp-main.129.pdf
-
CMU-MOSEAS: A Multimodal Language Dataset for Spanish, Portuguese, German and French https://www.aclweb.org/anthology/2020.emnlp-main.141.pdf
-
CrowS-Pairs: A Challenge Dataset for Measuring Social Biases in Masked Language Models https://www.aclweb.org/anthology/2020.emnlp-main.154.pdf
-
A Method for Building a Commonsense Inference Dataset based on Basic Events https://www.aclweb.org/anthology/2020.emnlp-main.192.pdf
-
Comparative Evaluation of Label-Agnostic Selection Bias in Multilingual Hate Speech Datasets https://www.aclweb.org/anthology/2020.emnlp-main.199.pdf
-
Not Low-Resource Anymore: Aligner Ensembling, Batch Filtering, and New Datasets for Bengali-English Machine Translation https://www.aclweb.org/anthology/2020.emnlp-main.207.pdf
-
TED-CDB: A Large-Scale Chinese Discourse Relation Dataset on TED Talks https://www.aclweb.org/anthology/2020.emnlp-main.223.pdf
-
A Visually-grounded First-person Dialogue Dataset with Verbal and Non-verbal Responses https://www.aclweb.org/anthology/2020.emnlp-main.267.pdf
-
X-SRL: A Parallel Cross-Lingual Semantic Role Labeling Dataset https://www.aclweb.org/anthology/2020.emnlp-main.321.pdf
-
CLIRMatrix: A massively large collection of bilingual and multilingual datasets for Cross-Lingual Information Retrieval https://www.aclweb.org/anthology/2020.emnlp-main.340.pdf
-
Introducing a New Dataset for Event Detection in Cybersecurity Texts https://www.aclweb.org/anthology/2020.emnlp-main.433.pdf
-
XGLUE: A New Benchmark Dataset for Cross-lingual Pre-training, Understanding and Generation https://www.aclweb.org/anthology/2020.emnlp-main.484.pdf
-
Interpretable Multi-dataset Evaluation for Named Entity Recognition https://www.aclweb.org/anthology/2020.emnlp-main.489.pdf
-
A Dataset for Tracking Entities in Open Domain Procedural Text https://www.aclweb.org/anthology/2020.emnlp-main.520.pdf
-
STORIUM: A Dataset and Evaluation Platform for Machine-in-the-Loop Story Generation https://www.aclweb.org/anthology/2020.emnlp-main.525.pdf
-
MOCHA: A Dataset for Training and Evaluating Generative Reading Comprehension Metrics https://www.aclweb.org/anthology/2020.emnlp-main.528.pdf
-
DuSQL: A Large-Scale and Pragmatic Chinese Text-to-SQL Dataset https://www.aclweb.org/anthology/2020.emnlp-main.562.pdf
-
Textual Data Augmentation for Efficient Active Learning on Tiny Datasets https://www.aclweb.org/anthology/2020.emnlp-main.600.pdf
-
PARADE: A New Dataset for Paraphrase Identification Requiring Computer Science Domain Knowledge https://www.aclweb.org/anthology/2020.emnlp-main.611.pdf
-
Multi-XScience: A Large-scale Dataset for Extreme Multi-document Summarization of Scientific Articles https://www.aclweb.org/anthology/2020.emnlp-main.648.pdf
-
Intrinsic Evaluation of Summarization Datasets https://www.aclweb.org/anthology/2020.emnlp-main.649.pdf
-
doc2dial: A Goal-Oriented Document-Grounded Dialogue Dataset https://www.aclweb.org/anthology/2020.emnlp-main.652.pdf
-
Information Seeking in the Spirit of Learning: A Dataset for Conversational Curiosity https://www.aclweb.org/anthology/2020.emnlp-main.655.pdf
-
CANCEREMO: A Dataset for Fine-Grained Emotion Detection https://www.aclweb.org/anthology/2020.emnlp-main.715.pdf
-
Zero-Shot Stance Detection: A Dataset and Model using Generalized Topic Representations https://www.aclweb.org/anthology/2020.emnlp-main.717.pdf
-
MedDialog: Large-scale Medical Dialogue Datasets https://www.aclweb.org/anthology/2020.emnlp-main.743.pdf
-
Dataset Cartography: Mapping and Diagnosing Datasets with Training Dynamics https://www.aclweb.org/anthology/2020.emnlp-main.746.pdf
-
NeuralQA: A Usable Library for Question Answering (Contextual Query Expansion + BERT) on Large Datasets https://www.aclweb.org/anthology/2020.emnlp-demos.3.pdf
-
In Data We Trust: A Critical Analysis of Hate Speech Detection Datasets https://www.aclweb.org/anthology/2020.alw-1.18.pdf
-
On Cross-Dataset Generalization in Automatic Detection of Online Abuse https://www.aclweb.org/anthology/2020.alw-1.20.pdf
-
MeDAL: Medical Abbreviation Disambiguation Dataset for Natural Language Understanding Pretraining https://www.aclweb.org/anthology/2020.clinicalnlp-1.15.pdf
-
Multilingual Argument Mining: Datasets and Analysis https://www.aclweb.org/anthology/2020.findings-emnlp.29.pdf
-
The RELX Dataset and Matching the Multilingual Blanks for Cross-Lingual Relation Classification https://www.aclweb.org/anthology/2020.findings-emnlp.32.pdf
-
KorNLI and KorSTS: New Benchmark Datasets for Korean Natural Language Understanding https://www.aclweb.org/anthology/2020.findings-emnlp.39.pdf
-
LiMiT: The Literal Motion in Text Dataset https://www.aclweb.org/anthology/2020.findings-emnlp.88.pdf
-
MedICaT: A Dataset of Medical Images, Captions, and Textual References https://www.aclweb.org/anthology/2020.findings-emnlp.191.pdf
-
Reinforcement Learning with Imbalanced Dataset for Data-to-Text Medical Report Generation https://www.aclweb.org/anthology/2020.findings-emnlp.202.pdf
-
Learning to Model and Ignore Dataset Bias with Mixed Capacity Ensembles https://www.aclweb.org/anthology/2020.findings-emnlp.272.pdf
-
CDEvalSumm: An Empirical Study of Cross-Dataset Evaluation for Neural Summarization Systems https://www.aclweb.org/anthology/2020.findings-emnlp.329.pdf
-
Modeling Preconditions in Text with a Crowd-sourced Dataset https://www.aclweb.org/anthology/2020.findings-emnlp.340.pdf
-
WikiLingua: A New Benchmark Dataset for Cross-Lingual Abstractive Summarization https://www.aclweb.org/anthology/2020.findings-emnlp.360.pdf
-
STANDER: An Expert-Annotated Dataset for News Stance Detection and Evidence Retrieval https://www.aclweb.org/anthology/2020.findings-emnlp.365.pdf
-
Poison Attacks against Text Datasets with Conditional Adversarially Regularized Autoencoder https://www.aclweb.org/anthology/2020.findings-emnlp.373.pdf
-
#Turki$hTweets: A Benchmark Dataset for Turkish Text Correction https://www.aclweb.org/anthology/2020.findings-emnlp.374.pdf
-
Contract Discovery: Dataset and a Few-Shot Semantic Retrieval Challenge with Competitive Baselines https://www.aclweb.org/anthology/2020.findings-emnlp.380.pdf
-
Automatic Term Name Generation for Gene Ontology: Task and Dataset https://www.aclweb.org/anthology/2020.findings-emnlp.422.pdf
-
Finding Friends and Flipping Frenemies: Automatic Paraphrase Dataset Augmentation Using Graph Theory https://www.aclweb.org/anthology/2020.findings-emnlp.426.pdf
-
Weibo-COV: A Large-Scale COVID-19 Social Media Dataset from Weibo https://www.aclweb.org/anthology/2020.nlpcovid19-2.34.pdf
-
Covidex: Neural Ranking Models and Keyword Search Infrastructure for the COVID-19 Open Research Dataset https://www.aclweb.org/anthology/2020.sdp-1.5.pdf
-
Do We Need to Create Big Datasets to Learn a Task? https://www.aclweb.org/anthology/2020.sustainlp-1.23.pdf
-
Determining Question-Answer Plausibility in Crowdsourced Datasets Using Multi-Task Learning https://www.aclweb.org/anthology/2020.wnut-1.4.pdf
-
Civil Unrest on Twitter (CUT): A Dataset of Tweets to Support Research on Civil Unrest https://www.aclweb.org/anthology/2020.wnut-1.28.pdf
-
Towards Medical Machine Reading Comprehension with Structural Knowledge and Plain Text https://www.aclweb.org/anthology/2020.emnlp-main.111.pdf
-
Querying Across Genres for Medical Claims in News https://www.aclweb.org/anthology/2020.emnlp-main.139.pdf
-
Biomedical Event Extraction with Hierarchical Knowledge Graphs https://www.aclweb.org/anthology/2020.findings-emnlp.114.pdf
-
Characterizing the Value of Information in Medical Notes https://www.aclweb.org/anthology/2020.findings-emnlp.187.pdf
-
Generating Accurate Electronic Health Assessment from Medical Graph https://www.aclweb.org/anthology/2020.findings-emnlp.336.pdf
-
KnowledgeNet: A Benchmark Dataset for Knowledge Base Population https://www.aclweb.org/anthology/D19-1069.pdf
-
GGPONC: A Corpus of German Medical Text with Rich Metadata Based on Clinical Practice Guidelines https://www.aclweb.org/anthology/2020.louhi-1.5.pdf
-
(Male, Bachelor) and (Female, Ph.D) have different connotations: Parallelly Annotated Stylistic Language Dataset with Multiple Personas https://www.aclweb.org/anthology/D19-1179.pdf
-
CoSQL: A Conversational Text-to-SQL Challenge Towards Cross-Domain Natural Language Interfaces to Databases https://www.aclweb.org/anthology/D19-1204.pdf
-
UR-FUNNY: A Multimodal Language Dataset for Understanding Humor https://www.aclweb.org/anthology/D19-1211.pdf
-
BiPaR: A Bilingual Parallel Dataset for Multilingual and Cross-lingual Reading Comprehension on Novels https://www.aclweb.org/anthology/D19-1249.pdf
-
PAWS-X: A Cross-lingual Adversarial Dataset for Paraphrase Identification https://www.aclweb.org/anthology/D19-1382.pdf
-
Benchmarking Zero-shot Text Classification: Datasets, Evaluation and Entailment Approach https://www.aclweb.org/anthology/D19-1404.pdf
-
MultiFC: A Real-World Multi-Domain Dataset for Evidence-Based Fact Checking of Claims https://www.aclweb.org/anthology/D19-1475.pdf
-
A Benchmark Dataset for Learning to Intervene in Online Hate Speech https://www.aclweb.org/anthology/D19-1482.pdf
-
YouMakeup: A Large-Scale Domain-Specific Multimodal Dataset for Fine-Grained Semantic Comprehension https://www.aclweb.org/anthology/D19-1517.pdf
-
JuICe: A Large Scale Distantly Supervised Dataset for Open Domain Context-based Code Generation https://www.aclweb.org/anthology/D19-1546.pdf
-
A Dataset of General-Purpose Rebuttal https://www.aclweb.org/anthology/D19-1561.pdf
-
Automatic Argument Quality Assessment - New Datasets and Methods https://www.aclweb.org/anthology/D19-1564.pdf
-
Do Nuclear Submarines Have Nuclear Captains? A Challenge Dataset for Commonsense Reasoning over Adjectives and Objects https://www.aclweb.org/anthology/D19-1625.pdf
-
WIQA: A dataset for "What if..." reasoning over procedural text https://www.aclweb.org/anthology/D19-1629.pdf
-
The FLORES Evaluation Datasets for Low-Resource Machine Translation: Nepali–English and Sinhala–English https://www.aclweb.org/anthology/D19-1632.pdf
-
A Challenge Dataset and Effective Models for Aspect-Based Sentiment Analysis https://www.aclweb.org/anthology/D19-1654.pdf
-
DENS: A Dataset for Multi-class Emotion Analysis https://www.aclweb.org/anthology/D19-1656.pdf
-
RUN through the Streets: A New Dataset and Baseline Models for Realistic Urban Navigation https://www.aclweb.org/anthology/D19-1681.pdf
-
Rumor Detection on Social Media: Datasets, Methods and Opportunities https://www.aclweb.org/anthology/D19-5008.pdf
-
WAT2019: English-Hindi Translation on Hindi Visual Genome Dataset https://www.aclweb.org/anthology/D19-5224.pdf
-
SAMSum Corpus: A Human-annotated Dialogue Dataset for Abstractive Summarization https://www.aclweb.org/anthology/D19-5409.pdf
-
ORB: An Open Reading Benchmark for Comprehensive Evaluation of Machine Reading Comprehension https://www.aclweb.org/anthology/D19-5820.pdf
-
A Dataset of Crowdsourced Word Sequences: Collections and Answer Aggregation for Ground Truth Creation https://www.aclweb.org/anthology/D19-5904.pdf
-
Unlearn Dataset Bias in Natural Language Inference by Fitting the Residual https://www.aclweb.org/anthology/D19-6115.pdf
-
Dreaddit: A Reddit Dataset for Stress Analysis in Social Media https://www.aclweb.org/anthology/D19-6213.pdf
-
Incorporating Domain Knowledge into Medical NLI using Knowledge Graphs https://www.aclweb.org/anthology/D19-1631.pdf
-
MedCATTrainer: A Biomedical Free Text Annotation Interface with Active Learning and Research Use Case Specific Customisation https://www.aclweb.org/anthology/D19-3024.pdf
-
Developing a Curated Topic Model for COVID-19 Medical Research Literature https://www.aclweb.org/anthology/2020.nlpcovid19-2.30.pdf
-
Are We Modeling the Task or the Annotator? An Investigation of Annotator Bias in Natural Language Understanding Datasets https://www.aclweb.org/anthology/D19-1107.pdf
-
Towards Extracting Medical Family History from Natural Language Interactions: A New Dataset and Baselines https://www.aclweb.org/anthology/D19-1122.pdf
-
Biomedical Event Extraction as Multi-turn Question Answering https://www.aclweb.org/anthology/2020.louhi-1.10.pdf
-
PreCo: A Large-scale Dataset in Preschool Vocabulary for Coreference Resolution https://www.aclweb.org/anthology/D18-1016.pdf
-
A Dataset for Document Grounded Conversations https://www.aclweb.org/anthology/D18-1076.pdf
-
A Dataset for Telling the Stories of Social Media Videos https://www.aclweb.org/anthology/D18-1117.pdf
-
A dataset and baselines for sequential open-domain question answering https://www.aclweb.org/anthology/D18-1134.pdf
-
CARD-660: Cambridge Rare Word Dataset – a Reliable Benchmark for Infrequent Word Representation Models https://www.aclweb.org/anthology/D18-1169.pdf
-
Large-scale Cloze Test Dataset Created by Teachers https://www.aclweb.org/anthology/D18-1257.pdf
-
Spider: A Large-Scale Human-Labeled Dataset for Complex and Cross-Domain Semantic Parsing and Text-to-SQL Task https://www.aclweb.org/anthology/D18-1425.pdf
-
FewRel: A Large-Scale Supervised Few-Shot Relation Classification Dataset with State-of-the-Art Evaluation https://www.aclweb.org/anthology/D18-1514.pdf
-
MultiWOZ - A Large-Scale Multi-Domain Wizard-of-Oz Dataset for Task-Oriented Dialogue Modelling https://www.aclweb.org/anthology/D18-1547.pdf
-
KT-Speech-Crawler: Automatic Dataset Construction for Speech Recognition from YouTube Videos https://www.aclweb.org/anthology/D18-2016.pdf
-
APLenty: annotation tool for creating high-quality datasets using active and proactive learning https://www.aclweb.org/anthology/D18-2019.pdf
-
Hate Speech Dataset from a White Supremacy Forum https://www.aclweb.org/anthology/W18-5102.pdf
-
Creating a WhatsApp Dataset to Study Pre-teen Cyberbullying https://www.aclweb.org/anthology/W18-5107.pdf
-
Datasets of Slovene and Croatian Moderated News Comments https://www.aclweb.org/anthology/W18-5116.pdf
-
Semantic role labeling tools for biomedical question answering: a study of selected tools on the BioASQ datasets https://www.aclweb.org/anthology/W18-5302.pdf
-
An Adaption of BIOASQ Question Answering dataset for Machine Reading systems by Manual Annotations of Answer Spans https://www.aclweb.org/anthology/W18-5309.pdf
-
UNCC QA: A Biomedical Question Answering System https://www.aclweb.org/anthology/W18-5308.pdf