HyperAI

Natural Language Inference on CommitmentBank

Metrics

Accuracy

Results

Performance results of various models on this benchmark

| Model Name | Accuracy | Paper Title | Repository |
| --- | --- | --- | --- |
| ST-MoE-32B 269B (fine-tuned) | 98 | ST-MoE: Designing Stable and Transferable Sparse Expert Models | - |
| OPT 66B (one-shot) | 44.64 | BloombergGPT: A Large Language Model for Finance | - |
| GPT-NeoX (one-shot) | 48.21 | BloombergGPT: A Large Language Model for Finance | - |
| ST-MoE-L 4.1B (fine-tuned) | 98.2 | ST-MoE: Designing Stable and Transferable Sparse Expert Models | - |
| N-Grammer 343M | 67.9 | N-Grammer: Augmenting Transformers with latent n-grams | - |
| GPT-3 175B (few-shot) | 75.6 | Language Models are Few-Shot Learners | - |
| PaLM 2-S (one-shot) | 82.1 | PaLM 2 Technical Report | - |
| BLOOM 176B (one-shot) | 48.21 | BloombergGPT: A Large Language Model for Finance | - |
| PaLM 2-M (one-shot) | 80.4 | PaLM 2 Technical Report | - |
| GPT-3 175B (few-shot, k=32) | - | Language Models are Few-Shot Learners | - |
| PaLM 2-L (one-shot) | 87.5 | PaLM 2 Technical Report | - |
| T5-XXL 11B (fine-tuned) | 96.8 | Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer | - |
| BloombergGPT (one-shot) | 53.57 | BloombergGPT: A Large Language Model for Finance | - |
| T5-Large 770M (fine-tuned) | 94.4 | Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer | - |
| PaLM 540B (fine-tuned) | 100 | PaLM: Scaling Language Modeling with Pathways | - |
| Turing NLR v5 XXL 5.4B (fine-tuned) | 97.6 | Toward Efficient Language Model Pretraining and Downstream Adaptation via Self-Evolution: A Case Study on SuperGLUE | - |
| Vega v2 6B (KD-based prompt transfer) | 99.2 | Toward Efficient Language Model Pretraining and Downstream Adaptation via Self-Evolution: A Case Study on SuperGLUE | - |
| AlexaTM 20B | 67.9 | AlexaTM 20B: Few-Shot Learning Using a Large-Scale Multilingual Seq2Seq Model | - |
| DeBERTa-1.5B | 97.2 | DeBERTa: Decoding-enhanced BERT with Disentangled Attention | - |
| T5-Base 220M (fine-tuned) | 94 | Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer | - |
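For reference, the accuracy figures above are simply the fraction of correctly classified premise/hypothesis pairs on CommitmentBank (the "cb" task in SuperGLUE). Below is a minimal sketch of such an evaluation, assuming the Hugging Face `datasets` library and a hypothetical `predict_label` function standing in for whichever model is being scored; published leaderboard numbers are typically reported on the held-out test set via the SuperGLUE server, whereas this sketch uses the public validation split for illustration.

```python
# Minimal sketch: computing accuracy on CommitmentBank (SuperGLUE "cb").
# `predict_label` is a hypothetical placeholder for the model under evaluation.
from datasets import load_dataset

def predict_label(premise: str, hypothesis: str) -> int:
    # Placeholder: a real model would map (premise, hypothesis) to
    # 0 = entailment, 1 = contradiction, 2 = neutral.
    return 0

cb = load_dataset("super_glue", "cb", split="validation")

correct = sum(
    predict_label(ex["premise"], ex["hypothesis"]) == ex["label"]
    for ex in cb
)
accuracy = 100.0 * correct / len(cb)
print(f"CommitmentBank accuracy: {accuracy:.2f}%")
```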