Question Answering on PIQA

Metrics

Accuracy

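As a rough illustration of how the accuracy figures for the 0-shot entries below are typically produced, here is a minimal sketch: score each of PIQA's two candidate solutions with a causal language model and count the example as correct when the higher-scoring solution matches the label. The model name, the dataset field names (goal, sol1, sol2, label), the length normalization, and the loading call are assumptions for illustration; individual papers use their own models and evaluation harnesses (fine-tuned, few-shot, or classification-based setups differ).

```python
# Sketch of 0-shot PIQA accuracy via log-likelihood comparison (assumptions noted above).
import torch
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # placeholder; the leaderboard entries use much larger models
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name).eval()

def sequence_logprob(prompt: str, continuation: str) -> float:
    """Approximate length-normalized log-probability of `continuation` given `prompt`."""
    prompt_ids = tokenizer(prompt, return_tensors="pt").input_ids
    full_ids = tokenizer(prompt + " " + continuation, return_tensors="pt").input_ids
    with torch.no_grad():
        logits = model(full_ids).logits
    # Log-probabilities of each token, aligned to the token being predicted.
    logprobs = torch.log_softmax(logits[:, :-1, :], dim=-1)
    targets = full_ids[:, 1:]
    token_lp = logprobs.gather(-1, targets.unsqueeze(-1)).squeeze(-1)
    cont_len = full_ids.shape[1] - prompt_ids.shape[1]
    return token_lp[0, -cont_len:].sum().item() / cont_len

# Dataset id and loading details may vary with your `datasets` version;
# the validation split is used here because test labels are hidden.
data = load_dataset("piqa", split="validation")
correct = 0
for ex in data:
    scores = [sequence_logprob(ex["goal"], ex[key]) for key in ("sol1", "sol2")]
    correct += int(scores.index(max(scores)) == ex["label"])
print(f"Accuracy: {correct / len(data):.3f}")
```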
Results

Performance of various models on the PIQA benchmark (accuracy, %).

| Model Name | Accuracy (%) | Paper Title | Repository |
|---|---|---|---|
| Open-LLaMA-3B-v2 | 76.2 | Sheared LLaMA: Accelerating Language Model Pre-training via Structured Pruning | - |
| LLaMA 33B (0-shot) | 82.3 | LLaMA: Open and Efficient Foundation Language Models | - |
| DeBERTa-Large 304M (classification-based) | 85.9 | Two is Better than Many? Binary Classification as an Effective Approach to Multi-Choice Question Answering | - |
| OPT 66B (1-shot) | 77.6 | BloombergGPT: A Large Language Model for Finance | - |
| LLaMA 2 13B (0-shot) | 80.5 | Llama 2: Open Foundation and Fine-Tuned Chat Models | - |
| LLaMA 2 34B (0-shot) | 81.9 | Llama 2: Open Foundation and Fine-Tuned Chat Models | - |
| UnifiedQA 3B | 85.3 | UnifiedQA: Crossing Format Boundaries With a Single QA System | - |
| ExDeBERTa 567M | 85.5 | Task Compass: Scaling Multi-task Pre-training with Task Prefix | - |
| GPT-2-XL 1.5B | 70.5 | LaMini-LM: A Diverse Herd of Distilled Models from Large-Scale Instructions | - |
| PaLM 2-M (1-shot) | 83.2 | PaLM 2 Technical Report | - |
| Sheared-LLaMA-2.7B | 75.8 | Sheared LLaMA: Accelerating Language Model Pre-training via Structured Pruning | - |
| LLaMA-3 8B + MixLoRA | 87.6 | MixLoRA: Enhancing Large Language Models Fine-Tuning with LoRA-based Mixture of Experts | - |
| GPT-2-small 124M (fine-tuned) | 69.2 | PIQA: Reasoning about Physical Commonsense in Natural Language | - |
| LLaMA-2 7B + MixLoRA | 83.2 | MixLoRA: Enhancing Large Language Models Fine-Tuning with LoRA-based Mixture of Experts | - |
| SparseGPT 175B (50% Sparsity) | 80.63 | SparseGPT: Massive Language Models Can Be Accurately Pruned in One-Shot | - |
| GPT-3 175B (0-shot) | 81.0 | Language Models are Few-Shot Learners | - |
| LLaMA-3 8B + MoSLoRA | 89.7 | Mixture-of-Subspaces in Low-Rank Adaptation | - |
| Sheared-LLaMA-1.3B | 73.4 | Sheared LLaMA: Accelerating Language Model Pre-training via Structured Pruning | - |
| LLaMA 7B (0-shot) | 79.8 | LLaMA: Open and Efficient Foundation Language Models | - |
| LaMini-F-T5 783M | 70.6 | LaMini-LM: A Diverse Herd of Distilled Models from Large-Scale Instructions | - |