HyperAI超神经

Math Word Problem Solving on SVAMP (1:N)

Evaluation Metrics

Execution Accuracy

Evaluation Results

Performance of each model on this benchmark

Comparison Table
Model Name                                    Execution Accuracy
athena-mathematical-reasoning-with-thought    67.8
athena-mathematical-reasoning-with-thought    52.5