HyperAI超神经

Question Answering on HotpotQA

Evaluation Metrics

ANS-EM
ANS-F1
JOINT-EM
JOINT-F1
SUP-EM
SUP-F1
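
The page lists only the metric names. As a rough illustration of what the EM and F1 columns typically measure for answer strings, here is a minimal sketch in the SQuAD style. It rests on assumptions not stated on this page: that ANS refers to the answer span, that EM is exact string match after normalization, and that F1 is token-overlap F1. The function names (`normalize_answer`, `exact_match`, `f1_score`) are illustrative, and the official HotpotQA evaluation script may differ in details such as its handling of yes/no answers.

```python
import re
import string
from collections import Counter


def normalize_answer(text: str) -> str:
    """Lowercase, drop punctuation and English articles, collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction: str, gold: str) -> float:
    """Return 1.0 if the normalized strings are identical, else 0.0."""
    return float(normalize_answer(prediction) == normalize_answer(gold))


def f1_score(prediction: str, gold: str) -> float:
    """Token-level F1 between the normalized prediction and gold answer."""
    pred_tokens = normalize_answer(prediction).split()
    gold_tokens = normalize_answer(gold).split()
    if not pred_tokens or not gold_tokens:
        # If either side is empty, score 1.0 only when both are empty.
        return float(pred_tokens == gold_tokens)
    # Multiset intersection counts each shared token at most as often
    # as it appears on both sides.
    overlap = sum((Counter(pred_tokens) & Counter(gold_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(gold_tokens)
    return 2 * precision * recall / (precision + recall)
```

Under these assumptions, a prediction of "Eiffel Tower in Paris" against the gold answer "Eiffel Tower" scores 0 on EM but gets partial credit on F1, which is consistent with the F1 columns in the results being higher than the corresponding EM columns.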

Evaluation Results

Performance of each model on this benchmark

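| Model Name | ANS-EM | ANS-F1 | JOINT-EM | JOINT-F1 | SUP-EM | SUP-F1 | Paper Title | Repository |
|---|---|---|---|---|---|---|---|---|
| RoBERTa-DenseRetriever-Fast | 0.598 | 0.727 | 0.345 | 0.602 | 0.480 | 0.749 | - | - |
| SAQA | 0.284 | 0.386 | 0.086 | 0.245 | 0.147 | 0.472 | - | - |
| MultiQA | 0.307 | 0.402 | 0.000 | 0.000 | 0.000 | 0.000 | - | - |
| Entity-centric IR | 0.354 | 0.463 | 0.000 | 0.255 | 0.001 | 0.432 | - | - |
| GRN + BERT | 0.299 | 0.391 | 0.083 | 0.258 | 0.132 | 0.497 | - | - |
| HopRetriever + Sp-search | 0.671 | 0.799 | 0.432 | 0.706 | 0.574 | 0.835 | HopRetriever: Retrieve Hops over Wikipedia to Answer Complex Questions | - |
| SAFSr-Bert | 0.394 | 0.514 | 0.133 | 0.370 | 0.242 | 0.585 | - | - |
| DPR-recurrent | 0.598 | 0.727 | 0.345 | 0.602 | 0.480 | 0.749 | - | - |
| HGN Model-reproduce | 0.335 | 0.427 | 0.110 | 0.284 | 0.156 | 0.493 | - | - |
| GoldEn Retriever | 0.379 | 0.486 | 0.180 | 0.391 | 0.307 | 0.642 | Answering Complex Open-domain Questions Through Iterative Query Generation | |
| HopRetriever-V1 | 0.608 | 0.739 | 0.380 | 0.639 | 0.531 | 0.793 | - | - |
| HopAns | 0.617 | 0.746 | 0.368 | 0.629 | 0.500 | 0.772 | - | - |
| GAR | 0.482 | 0.613 | 0.306 | 0.530 | 0.483 | 0.739 | - | - |
| tes | 0.074 | 0.121 | 0.000 | 0.011 | 0.000 | 0.078 | - | - |
| PR-Bert | 0.433 | 0.538 | 0.145 | 0.391 | 0.219 | 0.596 | - | - |
| Chain-of-Skills | 0.674 | 0.801 | 0.457 | 0.717 | 0.613 | 0.853 | Chain-of-Skills: A Configurable Model for Open-domain Question Answering | |
| Beam Retrieval | 0.727 | 0.850 | 0.505 | 0.775 | 0.663 | 0.901 | End-to-End Beam Retrieval for Multi-Hop Question Answering | |
| DR model | 0.588 | 0.717 | 0.293 | 0.568 | 0.416 | 0.725 | - | - |
| AFSgraph | 0.601 | 0.730 | 0.359 | 0.617 | 0.500 | 0.769 | - | - |
| HopRetriever | 0.671 | 0.799 | 0.431 | 0.698 | 0.572 | 0.826 | - | - |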