Open-Domain Question Answering on KILT: ELI5
Metrics
F1
KILT-F1
KILT-RL
R-Prec
ROUGE-L
Recall@5
Results
Performance results of various models on this benchmark
Model Name | F1 | KILT-F1 | KILT-RL | R-Prec | ROUGE-L | Recall@5 | Paper Title | Repository |
---|---|---|---|---|---|---|---|---|
T5-base | 16.1 | 0.0 | 0.0 | 0.0 | 19.08 | 0.0 | KILT: a Benchmark for Knowledge Intensive Language Tasks | |
GENRE | 0.0 | 0.0 | 0.0 | 15.83 | 0.0 | 25.49 | - | - |
chriskuei | 0.0 | 0.0 | 0.0 | 17.5 | 0.0 | 25.54 | - | - |
Wikipedia | 15.91 | 2.38 | 2.46 | 14.83 | 16.45 | 27.69 | - | - |
RAG | 14.51 | 1.79 | 1.69 | 11.0 | 14.05 | 22.92 | - | - |
arxiv.org/abs/2103.06332 | 22.88 | 2.34 | 2.36 | 10.67 | 23.19 | 24.56 | Hurdles to Progress in Long-form Question Answering | |
BART | 19.23 | 0.0 | 0.0 | 0.0 | 20.55 | 0.0 | - | - |
Training Set Retrieval (top 1) | 21.62 | 0.0 | 0.0 | 0.0 | 18.66 | 0.0 | - | - |
Sphere | 15.29 | 0.0 | 0.0 | 0.0 | 15.76 | 0.0 | - | - |
somebody | 27.13 | 3.0 | 2.62 | 10.83 | 24.53 | 27.25 | - | - |
TABi | 0.0 | 0.0 | 0.0 | 18.33 | 0.0 | 28.21 | - | - |
multi-task small | 16.4 | 0.0 | 0.0 | 0.0 | 17.67 | 0.0 | - | - |
BART + DPR | 17.88 | 2.01 | 1.9 | 10.67 | 17.41 | 26.92 | - | - |
Input Copying | 14.8 | 0.0 | 0.0 | 0.0 | 16.88 | 0.0 | - | - |
Random Training Set Answer | 17.07 | 0.0 | 0.0 | 0.0 | 15.45 | 0.0 | - | - |
Multi-task DPR | 0.0 | 0.0 | 0.0 | 15.5 | 0.0 | 27.51 | - | - |
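
To compare entries on a single metric, the rows above can be parsed and sorted. The following is a minimal sketch, not part of the benchmark tooling: the helper names `parse_rows` and `rank_by` are hypothetical, only four rows are copied in for brevity, and it assumes that a score of exactly 0.0 means the metric was not reported for that entry, which the table itself does not confirm.

```python
# Four rows copied from the table above (name plus the six metric columns).
ROWS = """\
Wikipedia | 15.91 | 2.38 | 2.46 | 14.83 | 16.45 | 27.69
somebody | 27.13 | 3.0 | 2.62 | 10.83 | 24.53 | 27.25
BART + DPR | 17.88 | 2.01 | 1.9 | 10.67 | 17.41 | 26.92
T5-base | 16.1 | 0.0 | 0.0 | 0.0 | 19.08 | 0.0
"""

# Column order as given in the table header.
METRICS = ["F1", "KILT-F1", "KILT-RL", "R-Prec", "ROUGE-L", "Recall@5"]


def parse_rows(text):
    """Parse pipe-delimited rows into (model, {metric: score}) pairs."""
    parsed = []
    for line in text.strip().splitlines():
        cells = [cell.strip() for cell in line.split("|")]
        name, values = cells[0], cells[1:1 + len(METRICS)]
        scores = {metric: float(value) for metric, value in zip(METRICS, values)}
        parsed.append((name, scores))
    return parsed


def rank_by(rows, metric, skip_zero=True):
    """Sort models by one metric, highest first.

    ASSUMPTION: a score of exactly 0.0 means "not reported". The source
    does not confirm this, so pass skip_zero=False to keep such rows.
    """
    kept = [(name, scores) for name, scores in rows
            if not (skip_zero and scores[metric] == 0.0)]
    return sorted(kept, key=lambda row: row[1][metric], reverse=True)


ranking = rank_by(parse_rows(ROWS), "KILT-RL")
for name, scores in ranking:
    print(f"{name}: {scores['KILT-RL']}")
```

Filtering zeros is a judgment call: it keeps retrieval-only and generation-only entries from appearing at the bottom of a ranking for a metric they may never have been scored on.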