HyperAI超神经

Multi-Task Language Understanding on MGSM

Evaluation Metrics

Average (%)

Evaluation Results

Performance of each model on this benchmark

Comparison Table
Model Name                                        Average (%)
transcending-scaling-laws-with-0-1-extra          49.9
palm-scaling-language-modeling-with-pathways-1    55.0
palm-2-technical-report-1                         87.0
scaling-instruction-finetuned-language-models     60.4
scaling-instruction-finetuned-language-models     72.0
scaling-instruction-finetuned-language-models     35
scaling-instruction-finetuned-language-models     57.0
scaling-instruction-finetuned-language-models     5.7
scaling-instruction-finetuned-language-models     36
scaling-instruction-finetuned-language-models     21.2
scaling-instruction-finetuned-language-models     23.7
palm-2-technical-report-1                         72.2