HyperAI超神经

Question Answering on BoolQ

Evaluation Metrics

Accuracy
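
As a minimal sketch of how an accuracy score of this kind is typically computed for yes/no questions: the percentage of predicted answers that match the gold answers. The function name `accuracy` and the sample answers are hypothetical illustrations, not taken from BoolQ or from any of the listed papers, and individual papers may differ in how they map model output to a yes/no prediction.

```python
def accuracy(predictions, labels):
    """Return the percentage of predictions that match the gold labels."""
    if len(predictions) != len(labels):
        raise ValueError("predictions and labels must have the same length")
    if not labels:
        raise ValueError("cannot compute accuracy on an empty set")
    # Count positions where the predicted answer equals the gold answer.
    correct = sum(p == y for p, y in zip(predictions, labels))
    return 100.0 * correct / len(labels)


# Hypothetical yes/no answers (True = "yes"); illustrative only.
gold = [True, False, True, True]
predicted = [True, False, False, True]
score = accuracy(predicted, gold)
print(f"{score:.1f}")  # 3 of 4 answers match
```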

Evaluation Results

Performance of each model on this benchmark

| Model Name | Accuracy | Paper Title | Repository |
| --- | --- | --- | --- |
| Mistral-Nemo 12B (HPT) | 99.87 | Hierarchical Prompting Taxonomy: A Universal Evaluation Framework for Large Language Models Aligned with Human Cognitive Principles | - |
| FLAN 137B (4-shot) | 84.6 | Finetuned Language Models Are Zero-Shot Learners | |
| LLaMA 2 13B (0-shot) | 81.7 | Llama 2: Open Foundation and Fine-Tuned Chat Models | |
| Hybrid H3 125M (0-shot, logit scoring) | 59.6 | Hungry Hungry Hippos: Towards Language Modeling with State Space Models | |
| Neo-6B (QA) | 64.9 | Ask Me Anything: A simple strategy for prompting language models | |
| UL2 20B (0-shot) | 63.1 | UL2: Unifying Language Learning Paradigms | |
| Hybrid H3 2.7B (3-shot, logit scoring) | 60.6 | Hungry Hungry Hippos: Towards Language Modeling with State Space Models | |
| Bloomberg GPT 50B (1-shot) | 74.6 | BloombergGPT: A Large Language Model for Finance | - |
| Gemma-7B | 99.419 | Hierarchical Prompting Taxonomy: A Universal Evaluation Framework for Large Language Models Aligned with Human Cognitive Principles | - |
| GPT-1 117M (fine-tuned) | 72.87 | BoolQ: Exploring the Surprising Difficulty of Natural Yes/No Questions | |
| LLaMA 2 34B (0-shot) | 83.7 | Llama 2: Open Foundation and Fine-Tuned Chat Models | |
| Hybrid H3 1.3B (0-shot, logit scoring) | 61.7 | Hungry Hungry Hippos: Towards Language Modeling with State Space Models | |
| Hyena | 51.8 | Hyena Hierarchy: Towards Larger Convolutional Language Models | |
| LLaMA-2 7B + MixLoRA | 72.7 | MixLoRA: Enhancing Large Language Models Fine-Tuning with LoRA-based Mixture of Experts | |
| Gopher (zero-shot) | 79.3 | Scaling Language Models: Methods, Analysis & Insights from Training Gopher | |
| PaLM 2-S (1-shot) | 88.1 | PaLM 2 Technical Report | |
| Vega v2 6B (fine-tuned) | 90.5 | Toward Efficient Language Model Pretraining and Downstream Adaptation via Self-Evolution: A Case Study on SuperGLUE | - |
| OPT-IML 175B | 71.4 | OPT-IML: Scaling Language Model Instruction Meta Learning through the Lens of Generalization | |
| LLaMA-3 8B + MixLoRA | 75 | MixLoRA: Enhancing Large Language Models Fine-Tuning with LoRA-based Mixture of Experts | |
| OPT-IML 1.3B (0-shot) | 61.5 | OPT-IML: Scaling Language Model Instruction Meta Learning through the Lens of Generalization | |