Question Answering On Story Cloze
Evaluation Metrics
Accuracy
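Accuracy on this benchmark is the percentage of stories for which the model selects the labeled correct ending. A minimal sketch of that calculation follows, assuming predictions and gold labels are supplied as equal-length lists of ending indices. The function name `accuracy_percent` and the sample data are hypothetical and do not come from any of the listed papers.

```python
def accuracy_percent(predictions, gold):
    """Return the percentage of predictions that match the gold labels."""
    if len(predictions) != len(gold):
        raise ValueError("predictions and gold must have the same length")
    if not gold:
        raise ValueError("cannot compute accuracy on an empty set")
    # Count positions where the chosen ending equals the labeled ending.
    correct = sum(p == g for p, g in zip(predictions, gold))
    return 100.0 * correct / len(gold)


# Hypothetical data: each value marks which candidate ending was chosen.
predictions = [1, 2, 2, 1]
gold = [1, 2, 1, 1]
print(accuracy_percent(predictions, gold))  # 75.0
```

The result is scaled to 0-100 so it lines up with the way scores are reported in the table.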
Evaluation Results
Performance of each model on this benchmark
Model Name | Accuracy | Paper Title | Repository |
---|---|---|---|
GPT-3 175B (Few-Shot) | 87.7 | Language Models are Few-Shot Learners | |
Neo-6B (QA) | 76.3 | Ask Me Anything: A simple strategy for prompting language models | |
PaLM 2-M (one-shot) | 86.7 | PaLM 2 Technical Report | |
PaLM 2-S (one-shot) | 85.6 | PaLM 2 Technical Report | |
PaLM 2-L (one-shot) | 87.4 | PaLM 2 Technical Report | |
Neo-6B (QA + WS) | 87.8 | Ask Me Anything: A simple strategy for prompting language models | |
Neo-6B (few-shot) | 51.0 | Ask Me Anything: A simple strategy for prompting language models | |