Question Answering on PIQA

Evaluation Metric

Accuracy
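
PIQA is a two-way multiple-choice task (each question pairs a goal with two candidate solutions), so accuracy is simply the fraction of questions for which the model selects the correct solution. A minimal sketch of the computation (the function name and example values below are illustrative, not part of any benchmark tooling):

```python
def accuracy(predictions, labels):
    """Fraction of examples where the predicted choice matches the gold label."""
    assert len(predictions) == len(labels)
    correct = sum(p == y for p, y in zip(predictions, labels))
    return correct / len(labels)

# Each PIQA example has two candidate solutions, so predictions and labels are 0 or 1.
print(accuracy([0, 1, 1, 0], [0, 1, 0, 0]))  # 0.75
```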

Evaluation Results

Results of each model on this benchmark:

| Model Name | Accuracy | Paper Title |
| --- | --- | --- |
| Open-LLaMA-3B-v2 | 76.2 | Sheared LLaMA: Accelerating Language Model Pre-training via Structured Pruning |
| LLaMA 33B (0-shot) | 82.3 | LLaMA: Open and Efficient Foundation Language Models |
| DeBERTa-Large 304M (classification-based) | 85.9 | Two is Better than Many? Binary Classification as an Effective Approach to Multi-Choice Question Answering |
| OPT 66B (1-shot) | 77.6 | BloombergGPT: A Large Language Model for Finance |
| LLaMA 2 13B (0-shot) | 80.5 | Llama 2: Open Foundation and Fine-Tuned Chat Models |
| LLaMA 2 34B (0-shot) | 81.9 | Llama 2: Open Foundation and Fine-Tuned Chat Models |
| UnifiedQA 3B | 85.3 | UnifiedQA: Crossing Format Boundaries With a Single QA System |
| ExDeBERTa 567M | 85.5 | Task Compass: Scaling Multi-task Pre-training with Task Prefix |
| GPT-2-XL 1.5B | 70.5 | LaMini-LM: A Diverse Herd of Distilled Models from Large-Scale Instructions |
| PaLM 2-M (1-shot) | 83.2 | PaLM 2 Technical Report |
| Sheared-LLaMA-2.7B | 75.8 | Sheared LLaMA: Accelerating Language Model Pre-training via Structured Pruning |
| LLaMA-3 8B + MixLoRA | 87.6 | MixLoRA: Enhancing Large Language Models Fine-Tuning with LoRA-based Mixture of Experts |
| GPT-2-small 124M (fine-tuned) | 69.2 | PIQA: Reasoning about Physical Commonsense in Natural Language |
| LLaMA-2 7B + MixLoRA | 83.2 | MixLoRA: Enhancing Large Language Models Fine-Tuning with LoRA-based Mixture of Experts |
| SparseGPT 175B (50% Sparsity) | 80.63 | SparseGPT: Massive Language Models Can Be Accurately Pruned in One-Shot |
| GPT-3 175B (0-shot) | 81.0 | Language Models are Few-Shot Learners |
| LLaMA3 8B + MoSLoRA | 89.7 | Mixture-of-Subspaces in Low-Rank Adaptation |
| Sheared-LLaMA-1.3B | 73.4 | Sheared LLaMA: Accelerating Language Model Pre-training via Structured Pruning |
| LLaMA 7B (0-shot) | 79.8 | LLaMA: Open and Efficient Foundation Language Models |
| LaMini-F-T5 783M | 70.6 | LaMini-LM: A Diverse Herd of Distilled Models from Large-Scale Instructions |
(Showing 20 of 67 leaderboard entries.)
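
The zero-shot and few-shot entries above are typically produced by log-likelihood scoring: the model scores each of the two candidate solutions as a continuation of the goal, and the higher-scoring (often length-normalized) one is taken as the prediction. A hedged sketch of that protocol using Hugging Face `transformers` and `datasets`; the model checkpoint, the `ybisk/piqa` dataset id, the whitespace prompt format, and the per-token normalization below are assumptions for illustration, as each paper uses its own evaluation harness:

```python
import torch
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumption: "gpt2" is used only to keep the sketch cheap to run; the
# leaderboard models above each ship with their own evaluation setups.
tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2").eval()

@torch.no_grad()
def choice_score(goal: str, sol: str) -> float:
    """Mean per-token log-prob of the solution, conditioned on the goal."""
    prompt_ids = tok(goal, return_tensors="pt").input_ids
    sol_ids = tok(" " + sol, return_tensors="pt").input_ids
    full_ids = torch.cat([prompt_ids, sol_ids], dim=1)
    logits = model(full_ids).logits
    # logprobs[i] is the model's distribution over the token at position i + 1.
    logprobs = torch.log_softmax(logits[0, :-1], dim=-1)
    span = range(prompt_ids.shape[1] - 1, full_ids.shape[1] - 1)
    token_lp = [logprobs[i, full_ids[0, i + 1]].item() for i in span]
    return sum(token_lp) / len(token_lp)  # length normalization

# Assumption: the "ybisk/piqa" Hub dataset with goal/sol1/sol2/label fields.
data = load_dataset("ybisk/piqa", split="validation", trust_remote_code=True)
n, correct = 200, 0  # small subsample for a quick estimate
for ex in data.select(range(n)):
    better = choice_score(ex["goal"], ex["sol1"]) >= choice_score(ex["goal"], ex["sol2"])
    pred = 0 if better else 1
    correct += int(pred == ex["label"])
print(f"zero-shot accuracy on {n} examples: {correct / n:.3f}")
```

Note that harnesses differ in details such as per-token versus per-character normalization and prompt formatting, which can shift the resulting accuracy by a point or more, so numbers in the table are only comparable within a given paper's setup.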