HyperAI超神经

Open-Domain Question Answering on KILT

Evaluation Metrics

EM
F1
KILT-EM
KILT-F1
R-Prec
Recall@5
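
The page lists these metrics without defining them. As a rough illustration, the sketch below shows one common way such scores are computed per example. It assumes SQuAD-style answer normalization for EM and F1, page-level retrieval for R-Prec and Recall@5, and the KILT convention that KILT-EM and KILT-F1 count the answer score only when R-Prec is 1. All function names are hypothetical, and the sketch is simplified to a single gold answer and a single set of gold pages; it is not the benchmark's official scorer.

```python
import re
import string
from collections import Counter


def normalize(text: str) -> str:
    """Lowercase, drop punctuation and English articles, collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction: str, gold: str) -> float:
    """EM: 1.0 if the normalized strings are identical, else 0.0."""
    return float(normalize(prediction) == normalize(gold))


def token_f1(prediction: str, gold: str) -> float:
    """F1: harmonic mean of token-level precision and recall."""
    pred_tokens = normalize(prediction).split()
    gold_tokens = normalize(gold).split()
    overlap = sum((Counter(pred_tokens) & Counter(gold_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(gold_tokens)
    return 2 * precision * recall / (precision + recall)


def r_precision(ranked_pages: list, gold_pages: list) -> float:
    """R-Prec: share of the top-R retrieved pages that are gold, R = number of gold pages."""
    gold = set(gold_pages)
    if not gold:
        return 0.0
    top_r = ranked_pages[: len(gold)]
    return sum(1 for page in top_r if page in gold) / len(gold)


def recall_at_k(ranked_pages: list, gold_pages: list, k: int = 5) -> float:
    """Recall@k: share of gold pages found among the top-k retrieved pages."""
    gold = set(gold_pages)
    if not gold:
        return 0.0
    return len(gold & set(ranked_pages[:k])) / len(gold)


def kilt_score(answer_score: float, r_prec: float) -> float:
    """KILT-EM / KILT-F1 (assumed convention): keep the answer score only if R-Prec is 1."""
    return answer_score if r_prec == 1.0 else 0.0
```

Under these assumptions, a system that answers correctly but retrieves the wrong page scores on EM and F1 yet receives 0 on KILT-EM and KILT-F1.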

Evaluation Results

Performance of each model on this benchmark

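| Model Name | EM | F1 | KILT-EM | KILT-F1 | R-Prec | Recall@5 | Paper Title | Repository |
|---|---|---|---|---|---|---|---|---|
| Multitask DPR + BART | 39.75 | 48.43 | 29.09 | 34.7 | 59.42 | 68.24 | - | - |
| Sphere | 46.05 | 56.57 | 0.0 | 0.0 | 0.0 | 0.0 | - | - |
| KGI_0 | 45.22 | 53.38 | 36.36 | 41.83 | 63.71 | 70.17 | - | - |
| intersect | 53.74 | 62.24 | 38.78 | 44.4 | 63.16 | 68.19 | - | - |
| Multi-task DPR | 0.0 | 0.0 | 0.0 | 0.0 | 59.42 | 68.24 | - | - |
| BART + DPR | 41.27 | 49.54 | 30.06 | 34.72 | 54.29 | 65.52 | - | - |
| BERT + DPR | 38.64 | 47.09 | 31.99 | 37.58 | 60.66 | 46.79 | - | - |
| multi-task small | 0.35 | 3.72 | 0.0 | 0.0 | 0.0 | 0.0 | - | - |
| Re2G | 51.73 | 60.97 | 43.56 | 49.8 | 70.78 | 76.63 | Re2G: Retrieve, Rerank, Generate | |
| BART | 21.75 | 28.69 | 0.0 | 0.0 | 0.0 | 0.0 | - | - |
| Wikipedia | 51.59 | 60.83 | 35.32 | 40.73 | 59.83 | 71.17 | - | - |
| TABi | 0.0 | 0.0 | 0.0 | 0.0 | 62.6 | 64.95 | - | - |
| T5-base | 19.6 | 27.73 | 0.0 | 0.0 | 0.0 | 0.0 | KILT: a Benchmark for Knowledge Intensive Language Tasks | |
| chriskuei | 0.0 | 0.0 | 0.0 | 0.0 | 60.32 | 61.21 | - | - |
| GENRE | 0.0 | 0.0 | 0.0 | 0.0 | 60.25 | 61.36 | - | - |
| RAG | 44.39 | 52.35 | 32.69 | 37.91 | 59.49 | 67.06 | - | - |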