HyperAI超神经

Question Answering on WebQuestions

Evaluation Metrics

EM (Exact Match)
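The page does not say how EM is computed, and individual papers may normalize answers differently. As a rough illustration only, the sketch below assumes a common SQuAD-style convention: lowercase the text, strip punctuation and English articles, collapse whitespace, and count a prediction as correct if it equals any gold answer. The function names `normalize_answer`, `exact_match`, and `em_score` are hypothetical and are not taken from any of the listed papers.

```python
import re
import string


def normalize_answer(text: str) -> str:
    """Lowercase, drop punctuation and English articles, collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction: str, gold_answers: list[str]) -> bool:
    """Return True if the normalized prediction equals any normalized gold answer."""
    pred = normalize_answer(prediction)
    return any(pred == normalize_answer(gold) for gold in gold_answers)


def em_score(predictions: list[str], references: list[list[str]]) -> float:
    """Percentage of questions whose prediction exactly matches a gold answer."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not predictions:
        return 0.0
    hits = sum(exact_match(p, golds) for p, golds in zip(predictions, references))
    return 100.0 * hits / len(predictions)


if __name__ == "__main__":
    # Toy data, not drawn from WebQuestions.
    preds = ["The Beatles", "Paris, France"]
    golds = [["Beatles"], ["Paris"]]
    print(em_score(preds, golds))
```

Under this assumed normalization, scores are reported on a 0 to 100 scale, which matches the range of the values in the table on this page. Numbers produced with a different normalization are not directly comparable.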

Evaluation Results

Performance of each model on this benchmark

| Model Name | EM | Paper Title | Repository |
|---|---|---|---|
| PaLM-540B (Zero-Shot) | 10.6 | PaLM: Scaling Language Modeling with Pathways | |
| GPT-3-175B (Few-Shot) | 41.5 | Language Models are Few-Shot Learners | |
| PaLM 2-S (one-shot) | 21.8 | PaLM 2 Technical Report | |
| ToT | 26.3 | Chain-of-Action: Faithful and Multimodal Question Answering through Large Language Models | |
| ToT | 26.3 | Tree of Thoughts: Deliberate Problem Solving with Large Language Models | |
| React | 38.3 | ReAct: Synergizing Reasoning and Acting in Language Models | |
| DSP | 59.4 | Chain-of-Action: Faithful and Multimodal Question Answering through Large Language Models | |
| REALM | 40.7 | REALM: Retrieval-Augmented Language Model Pre-Training | |
| CoT | 42.5 | Chain-of-Thought Prompting Elicits Reasoning in Large Language Models | |
| GPT-3-175B (Zero-Shot) | 14.4 | Language Models are Few-Shot Learners | |
| CoT | 42.5 | Chain-of-Action: Faithful and Multimodal Question Answering through Large Language Models | |
| Memory Networks (ensemble) | - | Large-scale Simple Question Answering with Memory Networks | |
| PaLM 2-L (one-shot) | 28.2 | PaLM 2 Technical Report | |
| T5.1.1-XXL+SSM | 42.8 | Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer | |
| PaLM-540B (One-Shot) | 22.6 | PaLM: Scaling Language Modeling with Pathways | |
| GPT-3-175B (One-Shot) | 25.3 | Language Models are Few-Shot Learners | |
| FiE+PAQ | 56.3 | FiE: Building a Global Probability Space by Leveraging Early Fusion in Encoder for Open-Domain Question Answering | - |
| DPR | 42.4 | Dense Passage Retrieval for Open-Domain Question Answering | |
| Self-Ask | 31.1 | Measuring and Narrowing the Compositionality Gap in Language Models | |
| Subgraph embeddings | - | Question Answering with Subgraph Embeddings | |