Question Answering on KILT: ELI5
Evaluation Metrics
F1
Rouge-L
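For readers unfamiliar with the two metrics, here is a minimal sketch of how a token-level F1 and a ROUGE-L F-measure can be computed for one prediction against one reference. It assumes plain lowercasing and whitespace tokenization; the official KILT evaluation scripts may normalize text differently and may aggregate over multiple references, so this is an illustration and not a way to reproduce the scores in the table. The function names are hypothetical.

```python
from collections import Counter


def token_f1(prediction: str, reference: str) -> float:
    """Bag-of-tokens F1 (assumed tokenization: lowercase + whitespace split)."""
    pred = prediction.lower().split()
    ref = reference.lower().split()
    # Multiset intersection counts each shared token at most min(count) times.
    overlap = sum((Counter(pred) & Counter(ref)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred)
    recall = overlap / len(ref)
    return 2 * precision * recall / (precision + recall)


def lcs_length(a: list, b: list) -> int:
    """Length of the longest common subsequence, via row-by-row dynamic programming."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l_f1(prediction: str, reference: str) -> float:
    """ROUGE-L as the F-measure of LCS-based precision and recall."""
    pred = prediction.lower().split()
    ref = reference.lower().split()
    lcs = lcs_length(pred, ref)
    if lcs == 0:
        return 0.0
    precision = lcs / len(pred)
    recall = lcs / len(ref)
    return 2 * precision * recall / (precision + recall)
```

Both functions return values in the range 0 to 1. F1 ignores word order, while ROUGE-L rewards tokens that appear in the same order as in the reference.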
Evaluation Results
Performance of each model on this benchmark
Model | F1 | Rouge-L | Paper Title | Repository |
---|---|---|---|---|
BART+DPR | 17.88 | 17.41 | KILT: a Benchmark for Knowledge Intensive Language Tasks | |
T5-base | 16.1 | 19.08 | KILT: a Benchmark for Knowledge Intensive Language Tasks | |
EMAT | 19.03 | 20.91 | An Efficient Memory-Augmented Transformer for Knowledge-Intensive NLP Tasks | |
RBG | 24.53 | 27.13 | Read before Generate! Faithful Long Form Question Answering with Machine Reading | |
KID | - | 26.3 | Knowledge Infused Decoding | |
c-REALM | 23.1 | 23.4 | Hurdles to Progress in Long-form Question Answering | |
RAG | 14.51 | 14.05 | KILT: a Benchmark for Knowledge Intensive Language Tasks | |