HyperAI超神经

Natural Language Inference on RCB

Evaluation Metrics

Accuracy
Average F1
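
As a rough illustration of the two metrics listed above, the sketch below computes accuracy and an averaged F1 score for a three-class inference task. It assumes that "Average F1" means the unweighted (macro) mean of the per-class F1 scores; the page itself does not state this. The label names and the helper functions `accuracy` and `macro_f1` are hypothetical and chosen only for the example.

```python
def accuracy(gold, pred):
    """Fraction of examples whose predicted label equals the gold label."""
    if len(gold) != len(pred) or not gold:
        raise ValueError("gold and pred must be non-empty and of equal length")
    return sum(g == p for g, p in zip(gold, pred)) / len(gold)


def macro_f1(gold, pred, labels):
    """Unweighted mean of per-class F1 (assumed meaning of 'Average F1')."""
    scores = []
    for label in labels:
        tp = sum(g == label and p == label for g, p in zip(gold, pred))
        fp = sum(g != label and p == label for g, p in zip(gold, pred))
        fn = sum(g == label and p != label for g, p in zip(gold, pred))
        denom = 2 * tp + fp + fn
        # A class with no gold and no predicted examples contributes 0.0 here.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical label set and toy predictions, for illustration only.
LABELS = ["entailment", "contradiction", "neutral"]
gold = ["entailment", "contradiction", "neutral", "neutral"]
pred = ["entailment", "neutral", "neutral", "neutral"]

print(f"Accuracy:   {accuracy(gold, pred):.3f}")
print(f"Average F1: {macro_f1(gold, pred, LABELS):.3f}")
```

Because macro averaging weights every class equally, a model that never predicts a minority class can score noticeably lower on Average F1 than on Accuracy.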

Evaluation Results

Performance of each model on this benchmark

| Model | Accuracy | Average F1 | Paper Title | Repository |
|---|---|---|---|---|
| RuGPT3XL few-shot | 0.418 | 0.302 | - | - |
| ruRoberta-large finetune | 0.518 | 0.357 | - | - |
| Golden Transformer | 0.546 | 0.406 | - | - |
| RuBERT plain | 0.463 | 0.367 | - | - |
| ruT5-large-finetune | 0.498 | 0.306 | - | - |
| Human Benchmark | 0.702 | 0.68 | RussianSuperGLUE: A Russian Language Understanding Evaluation Benchmark | |
| ruBert-base finetune | 0.509 | 0.333 | - | - |
| RuGPT3Large | 0.484 | 0.417 | - | - |
| RuGPT3Small | 0.473 | 0.356 | - | - |
| YaLM 1.0B few-shot | 0.447 | 0.408 | - | - |
| SBERT_Large | 0.452 | 0.371 | - | - |
| Multilingual Bert | 0.445 | 0.367 | - | - |
| MT5 Large | 0.454 | 0.366 | mT5: A massively multilingual pre-trained text-to-text transformer | |
| ruBert-large finetune | 0.5 | 0.356 | - | - |
| SBERT_Large_mt_ru_finetuning | 0.486 | 0.351 | - | - |
| ruT5-base-finetune | 0.468 | 0.307 | - | - |
| heuristic majority | 0.438 | 0.4 | Unreasonable Effectiveness of Rule-Based Heuristics in Solving Russian SuperGLUE Tasks | - |
| Random weighted | 0.374 | 0.319 | Unreasonable Effectiveness of Rule-Based Heuristics in Solving Russian SuperGLUE Tasks | - |
| RuGPT3Medium | 0.461 | 0.372 | - | - |
| RuBERT conversational | 0.484 | 0.452 | - | - |