Natural Language Inference
Natural Language Inference on ANLI Test
Evaluation Metrics
A1: accuracy on the ANLI Round 1 test set
A2: accuracy on the ANLI Round 2 test set
A3: accuracy on the ANLI Round 3 test set
A sketch of how these metrics can be computed follows the list.
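The following is a minimal sketch of how the A1/A2/A3 columns could be reproduced: accuracy on the three ANLI test rounds, assuming the Hugging Face `datasets` copy of ANLI (splits `test_r1`/`test_r2`/`test_r3`, labels 0 = entailment, 1 = neutral, 2 = contradiction) and a user-supplied `predict` function, which is hypothetical and stands in for the model under evaluation.

```python
from datasets import load_dataset


def predict(premise: str, hypothesis: str) -> int:
    """Hypothetical placeholder: return 0 (entailment), 1 (neutral) or 2 (contradiction)."""
    raise NotImplementedError


def anli_round_accuracy(round_id: int) -> float:
    """Accuracy (in %) on one ANLI test round, i.e. the A1, A2 or A3 metric."""
    split = load_dataset("anli", split=f"test_r{round_id}")
    correct = sum(
        predict(ex["premise"], ex["hypothesis"]) == ex["label"]
        for ex in split
    )
    return 100.0 * correct / len(split)


if __name__ == "__main__":
    for r in (1, 2, 3):
        print(f"A{r}: {anli_round_accuracy(r):.1f}")
```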
Evaluation Results
Performance of each model on this benchmark
| Model Name | A1 | A2 | A3 | Paper Title | Repository |
|---|---|---|---|---|---|
| T5-3B (explanation prompting) | 81.8 | 72.5 | 74.8 | Prompting for explanations improves Adversarial NLI. Is this true? {Yes} it is {true} because {it weakens superficial cues} | - |
| PaLM 540B (Self Improvement, Self Consistency) | - | 66.5 | 67.9 | Large Language Models Can Self-Improve | - |
| PaLM 540B (Self Improvement, CoT Prompting) | - | 65.3 | 67.3 | Large Language Models Can Self-Improve | - |
| PaLM 540B (Self Improvement, Standard-Prompting) | - | 64.8 | 66.9 | Large Language Models Can Self-Improve | - |
| PaLM 540B (Self Consistency) | - | 64.5 | 63.4 | Large Language Models Can Self-Improve | - |
| PaLM 2-L (one-shot) | 73.1 | 63.4 | 67.1 | PaLM 2 Technical Report | - |
| T0-11B (explanation prompting) | 75.6 | 60.6 | 59.9 | Prompting for explanations improves Adversarial NLI. Is this true? {Yes} it is {true} because {it weakens superficial cues} | - |
| PaLM 540B (CoT Prompting) | - | 58.9 | 60.6 | Large Language Models Can Self-Improve | - |
| PaLM 540B (Standard-Prompting) | - | 55.8 | 55.8 | Large Language Models Can Self-Improve | - |
| ChatGPT | 62.3 | 52.6 | 54.1 | A Systematic Study and Comprehensive Evaluation of ChatGPT on Benchmark Datasets | - |
| ALUM (RoBERTa-LARGE) | 72.3 | 52.1 | 48.4 | Adversarial Training for Large Neural Language Models | - |
| XLNet (Large) | 70.3 | 50.9 | 49.4 | XLNet: Generalized Autoregressive Pretraining for Language Understanding | - |
| InfoBERT (RoBERTa) | 75 | 50.5 | 47.7 | InfoBERT: Improving Robustness of Language Models from An Information Theoretic Perspective | - |
| RoBERTa (Large) | 72.4 | 49.8 | 44.4 | RoBERTa: A Robustly Optimized BERT Pretraining Approach | - |
| PaLM 2-M (one-shot) | 58.1 | 49.5 | 54.5 | PaLM 2 Technical Report | - |
| PaLM 2-S (one-shot) | 53.1 | 48.8 | 53.2 | PaLM 2 Technical Report | - |
| T0-3B (CoT fine-tuned) | 41.7 | 37.2 | 41.9 | The CoT Collection: Improving Zero-shot and Few-shot Learning of Language Models via Chain-of-Thought Fine-Tuning | - |
| Flipped-3B | 39.99 | 37.05 | 37.73 | Guess the Instruction! Flipped Learning Makes Language Models Stronger Zero-Shot Learners | - |
| KiC-770M | 36.30 | 35.00 | 37.60 | Knowledge-in-Context: Towards Knowledgeable Semi-Parametric Language Models | - |
| RoE-3B | 35.49 | 34.64 | 31.22 | Exploring the Benefits of Training Expert Language Models over Instruction Tuning | - |