Natural Language Inference on RTE

Metrics

Accuracy

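On this leaderboard, accuracy is the fraction of premise-hypothesis pairs for which a model's entailment decision matches the gold label; RTE (Recognizing Textual Entailment) is a binary task with labels entailment and not_entailment. A minimal sketch of the computation, using hypothetical predictions and gold labels:

```python
# Accuracy = correct decisions / total examples.
# The prediction and gold lists below are hypothetical, for illustration only.
def accuracy(predictions, labels):
    assert len(predictions) == len(labels)
    correct = sum(p == y for p, y in zip(predictions, labels))
    return correct / len(labels)

preds = ["entailment", "not_entailment", "entailment", "entailment"]
golds = ["entailment", "not_entailment", "not_entailment", "entailment"]
print(f"accuracy = {accuracy(preds, golds):.1%}")  # accuracy = 75.0%
```
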
Results

Accuracy of a selection of models on the RTE benchmark; the full leaderboard lists 90 entries.

| Model Name | Accuracy (%) | Paper Title |
|---|---|---|
| LTG-BERT-small 24M | 53.7 | Not all layers are equally as important: Every Layer Counts BERT |
| DistilBERT 66M | 62.9 | DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter |
| Flipped-3B | 71.05 | Guess the Instruction! Flipped Learning Makes Language Models Stronger Zero-Shot Learners |
| ERNIE | 68.8 | ERNIE: Enhanced Language Representation with Informative Entities |
| T5-XXL 11B | 92.5 | SMART: Robust and Efficient Fine-Tuning for Pre-trained Natural Language Models through Principled Regularized Optimization |
| ALBERT | 89.2 | ALBERT: A Lite BERT for Self-supervised Learning of Language Representations |
| GPT-NeoX 20B (1-shot) | 53.8 | BloombergGPT: A Large Language Model for Finance |
| PaLM 540B (1-shot) | 78.7 | PaLM: Scaling Language Modeling with Pathways |
| KiC-770M | 74.00 | Knowledge-in-Context: Towards Knowledgeable Semi-Parametric Language Models |
| data2vec | 69.9 | data2vec: A General Framework for Self-supervised Learning in Speech, Vision and Language |
| LaMini-F-T5 783M | 65 | LaMini-LM: A Diverse Herd of Distilled Models from Large-Scale Instructions |
| PaLM 540B (0-shot) | 72.9 | PaLM: Scaling Language Modeling with Pathways |
| Q-BERT (Shen et al., 2020) | 84.7 | Q-BERT: Hessian Based Ultra Low Precision Quantization of BERT |
| PaLM 2-L (1-shot) | 79.3 | PaLM 2 Technical Report |
| RoE-3B | 64.01 | Exploring the Benefits of Training Expert Language Models over Instruction Tuning |
| SMART-BERT | 71.2 | SMART: Robust and Efficient Fine-Tuning for Pre-trained Natural Language Models through Principled Regularized Optimization |
| RealFormer | 73.7 | RealFormer: Transformer Likes Residual Attention |
| PaLM 2-S (1-shot) | 78.7 | PaLM 2 Technical Report |
| XLNet (single model) | 85.9 | XLNet: Generalized Autoregressive Pretraining for Language Understanding |
| BigBird | 75.0 | Big Bird: Transformers for Longer Sequences |
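
For reference, a common way to produce a number comparable to those above is to score a fine-tuned classifier on the GLUE RTE validation split. This is a minimal sketch assuming the Hugging Face `datasets` and `transformers` libraries; the checkpoint name `textattack/roberta-base-RTE` and its label ordering are illustrative assumptions, not the recipe behind any entry in the table.

```python
# Sketch: evaluate a sentence-pair classifier on the GLUE RTE validation set.
# ASSUMPTION: the checkpoint below is illustrative; swap in any RTE-fine-tuned
# model and verify its label order matches GLUE's (0 = entailment,
# 1 = not_entailment).
import torch
from datasets import load_dataset
from transformers import AutoModelForSequenceClassification, AutoTokenizer

rte = load_dataset("glue", "rte", split="validation")  # 277 labeled pairs
name = "textattack/roberta-base-RTE"                   # hypothetical choice
tok = AutoTokenizer.from_pretrained(name)
model = AutoModelForSequenceClassification.from_pretrained(name).eval()

correct = 0
for ex in rte:
    enc = tok(ex["sentence1"], ex["sentence2"],
              truncation=True, return_tensors="pt")
    with torch.no_grad():
        pred = model(**enc).logits.argmax(dim=-1).item()
    correct += int(pred == ex["label"])

print(f"RTE validation accuracy: {correct / len(rte):.1%}")
```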