Question Answering on OpenBookQA
Evaluation Metric
Accuracy
Evaluation Results
Performance of each model on this benchmark
Comparison Table
Model Name | Accuracy |
---|---|
unifiedqa-crossing-format-boundaries-with-a | 87.2 |
lamini-lm-a-diverse-herd-of-distilled-models | 31.2 |
clues-before-answers-generation-enhanced | 89.8 |
bloomberggpt-a-large-language-model-for | 47.2 |
Model 5 | 95.9 |
large-language-models-can-self-improve | 84.4 |
large-language-models-can-self-improve | 86.4 |
grapeqa-graph-augmentation-and-pruning-to | 90 |
mixlora-enhancing-large-language-models-fine | 81.6 |
mixture-of-subspaces-in-low-rank-adaptation | 86.8 |
can-a-suit-of-armor-conduct-electricity-a-new | 56.3 |
Model 12 | 95.2 |
large-language-models-can-self-improve | 94.4 |
language-models-are-few-shot-learners | 65.4 |
grapeqa-graph-augmentation-and-pruning-to | 82 |
lamini-lm-a-diverse-herd-of-distilled-models | 34 |
Model 17 | 87.6 |
grapeqa-graph-augmentation-and-pruning-to | 66.2 |
lamini-lm-a-diverse-herd-of-distilled-models | 36 |
mixlora-enhancing-large-language-models-fine | 84.8 |
bloomberggpt-a-large-language-model-for | 51.6 |
mixlora-enhancing-large-language-models-fine | 83 |
palm-2-technical-report-1 | 58.5 |
Model 24 | 91.3 |
can-a-suit-of-armor-conduct-electricity-a-new | 55.8 |
bloomberggpt-a-large-language-model-for | 44.2 |
large-language-models-can-self-improve | 93 |
bloomberggpt-a-large-language-model-for | 58.0 |
lamini-lm-a-diverse-herd-of-distilled-models | 32.8 |
large-language-models-can-self-improve | 92 |
lamini-lm-a-diverse-herd-of-distilled-models | 32 |
can-a-suit-of-armor-conduct-electricity-a-new | 25 |
fusing-context-into-knowledge-graph-for | 83.2 |
careful-selection-of-knowledge-to-solve-open | 72 |
lamini-lm-a-diverse-herd-of-distilled-models | 39.8 |
qa-gnn-reasoning-with-language-models-and | 77.8 |
qa-gnn-reasoning-with-language-models-and | 82.8 |
fusing-context-into-knowledge-graph-for | 82.4 |
qa-gnn-reasoning-with-language-models-and | 82.8 |
palm-2-technical-report-1 | 56.2 |
palm-2-technical-report-1 | 57.4 |
Model 42 | 94.2 |
large-language-models-can-self-improve | 90 |
can-a-suit-of-armor-conduct-electricity-a-new | 76.9 |
gnn-is-a-counter-revisiting-gnn-for-question | 87.4 |