Question Answering on FinQA
Evaluation Metrics
Execution Accuracy
Program Accuracy
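As a rough illustration of how the two metrics differ, here is a minimal sketch. It assumes FinQA-style programs written as comma-separated steps such as `subtract(10, 4), divide(#0, 4)`, where `#n` refers to the result of step `n`. The helper names (`run_program`, `normalize`, `score`) are hypothetical and this is not the official evaluation script. Two simplifications are assumed: the reference answer is obtained by executing the gold program, and program accuracy is approximated by whitespace-insensitive exact match, which is stricter than an equivalence check because it treats `add(2, 3)` and `add(3, 2)` as different programs.

```python
import re

# Only four arithmetic operations are covered in this sketch (assumption).
OPS = {
    "add": lambda a, b: a + b,
    "subtract": lambda a, b: a - b,
    "multiply": lambda a, b: a * b,
    "divide": lambda a, b: a / b,
}

# One step looks like "op(arg1, arg2)".
STEP = re.compile(r"(\w+)\(([^()]*)\)")


def run_program(program):
    """Execute the steps in order and return the result of the last one."""
    results = []
    for op, raw_args in STEP.findall(program):
        args = []
        for tok in raw_args.split(","):
            tok = tok.strip()
            # "#n" refers to the result of an earlier step.
            args.append(results[int(tok[1:])] if tok.startswith("#") else float(tok))
        results.append(OPS[op](*args))
    return results[-1]


def normalize(program):
    """Remove whitespace so formatting differences do not count as mismatches."""
    return re.sub(r"\s+", "", program)


def score(predicted, gold, tol=1e-4):
    """Return (execution accuracy, program accuracy) as percentages."""
    exe = prog = 0
    for p, g in zip(predicted, gold):
        try:
            exe += abs(run_program(p) - run_program(g)) <= tol
        except (KeyError, ValueError, IndexError, ZeroDivisionError, TypeError):
            pass  # a program that cannot be executed counts as wrong
        prog += normalize(p) == normalize(g)
    n = len(gold)
    return 100 * exe / n, 100 * prog / n
```

Under these assumptions, a prediction can score on execution accuracy without scoring on program accuracy, for example when it reaches the right number through a differently written program.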
Results
Performance of each model on this benchmark
Model | Execution Accuracy (%) | Program Accuracy (%) | Paper Title | Repository |
---|---|---|---|---|
ELASTIC (RoBERTa-large) | 68.96 | 65.21 | ELASTIC: Numerical Reasoning with Adaptive Symbolic Compiler | |
FinQANet (BERT-large) | 57.43 | 55.52 | FinQA: A Dataset of Numerical Reasoning over Financial Data | |
FinQANet (RoBERTa-large) | 65.05 | 63.52 | FinQA: A Dataset of Numerical Reasoning over Financial Data | |
GPT-4 (8k) | 68.79 | - | Are ChatGPT and GPT-4 General-Purpose Solvers for Financial Text Analytics? A Study on Several Typical Tasks | - |
APOLLO | 71.07 | 68.94 | APOLLO: An Optimized Training Approach for Long-form Numerical Reasoning | |
FinQANet (FinBERT) | 53.71 | 51.71 | FinQA: A Dataset of Numerical Reasoning over Financial Data | |