CoQA: A Conversational Question Answering Challenge

Siva Reddy; Danqi Chen; Christopher D. Manning

Abstract

Humans gather information by engaging in conversations involving a series of interconnected questions and answers. For machines to assist in information gathering, it is therefore essential to enable them to answer conversational questions. We introduce CoQA, a novel dataset for building Conversational Question Answering systems. Our dataset contains 127k questions with answers, obtained from 8k conversations about text passages from seven diverse domains. The questions are conversational, and the answers are free-form text with their corresponding evidence highlighted in the passage. We analyze CoQA in depth and show that conversational questions have challenging phenomena not present in existing reading comprehension datasets, e.g., coreference and pragmatic reasoning. We evaluate strong conversational and reading comprehension models on CoQA. The best system obtains an F1 score of 65.4%, which is 23.4 points behind human performance (88.8%), indicating there is ample room for improvement. We launch CoQA as a challenge to the community at http://stanfordnlp.github.io/coqa/
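The F1 scores quoted in the abstract measure word overlap between a predicted answer and a reference answer. The following is a minimal sketch of such a metric, assuming lowercased whitespace tokenization and a single reference per question; the function names token_f1 and macro_f1 are illustrative, and the official CoQA evaluation script may normalize text and handle multiple human references differently.

```python
from collections import Counter


def token_f1(prediction: str, reference: str) -> float:
    """Word-overlap F1 between one predicted answer and one reference answer."""
    # Assumption: lowercase + whitespace split stands in for answer normalization.
    pred_tokens = prediction.lower().split()
    ref_tokens = reference.lower().split()
    if not pred_tokens or not ref_tokens:
        # Two empty answers count as a match; one empty answer does not.
        return float(pred_tokens == ref_tokens)
    # Multiset intersection counts each shared word at most as often as it
    # appears in both answers.
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)


def macro_f1(predictions, references):
    """Average the per-question F1 over all questions."""
    scores = [token_f1(p, r) for p, r in zip(predictions, references)]
    return sum(scores) / len(scores)
```

Under these assumptions, token_f1("white cat", "a white cat") has precision 1 and recall 2/3, giving an F1 of 0.8. The 23.4-point gap reported in the abstract is the difference between the human F1 (88.8) and the best system's F1 (65.4).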

Code Repositories

stanfordnlp/coqa-baselines (pytorch; Mentioned in GitHub)
iit-nlp-research/chatgpt-crawler (pytorch; Mentioned in GitHub)
leozhoujf/DataSciComp (paddle; Mentioned in GitHub)
mrzjy/sunburst (Mentioned in GitHub)

Benchmarks

Benchmark | Methodology | Metrics
generative-question-answering-on-coqa | PGNet | F1-Score: 45.4
question-answering-on-coqa | DrQA + seq2seq with copy attention (single model) | In-domain: 67.0; Out-of-domain: 60.4; Overall: 65.1
question-answering-on-coqa | Vanilla DrQA (single model) | In-domain: 54.5; Out-of-domain: 47.9; Overall: 52.6
