Question Answering
Question Answering is a core task in natural language processing in which a system automatically answers questions posed in natural language. It spans subtasks such as community question answering and knowledge-base question answering, and is evaluated primarily with EM (Exact Match) and F1 scores. Popular benchmark datasets include SQuAD, HotpotQA, bAbI, TriviaQA, and WikiQA. In recent years, models such as T5 and XLNet have performed strongly on these benchmarks, improving both the accuracy and the practicality of question answering systems.
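The EM and F1 metrics mentioned above are computed per question and averaged over the dataset. The sketch below shows one common way to implement them for extractive QA, using SQuAD-style answer normalization; the function names and normalization details are illustrative assumptions, not any benchmark's official scoring script.

# Minimal sketch of Exact Match (EM) and token-level F1 for extractive QA.
# Assumes SQuAD-style normalization (lowercase, strip punctuation and articles,
# collapse whitespace); names here are illustrative, not an official script.
import re
import string
from collections import Counter


def normalize_answer(text: str) -> str:
    """Lowercase, drop punctuation and articles, and collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in set(string.punctuation))
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction: str, reference: str) -> float:
    """1.0 if the normalized strings are identical, else 0.0."""
    return float(normalize_answer(prediction) == normalize_answer(reference))


def f1_score(prediction: str, reference: str) -> float:
    """Harmonic mean of token precision and recall after normalization."""
    pred_tokens = normalize_answer(prediction).split()
    ref_tokens = normalize_answer(reference).split()
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # The article "the" is stripped by normalization, so EM and F1 are both 1.0 here.
    print(exact_match("the Eiffel Tower", "Eiffel Tower"))  # 1.0
    print(f1_score("the Eiffel Tower", "Eiffel Tower"))     # 1.0

Dataset-level scores are the mean of these per-question values; when a question has multiple gold answers, the maximum score over the references is typically taken.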
Models and methods reported on these benchmarks include:
PAAG
Cardal
KGT5
STM
BioLinkBERT (large)
Gemma-7B
Custom Legal-BERT
Fast Weight Memory
MuCoT
NSE
Gated-Attention Reader
G-DAUG-Combo + RoBERTa-Large
SubGTR
WebQA
TOME-2
FiD
PaLM 540B (finetuned)
GPT-3 175B (few-shot, k=32)
QDGAT (ensemble)
Vector Database (ChromaDB)
BART fine-tuned on FairytaleQA
ELASTIC (RoBERTa-large)
GeoQA2
ChatGPT
Beam Retrieval
BM25+CE
MAFiD
BERT-Japanese
Claude-3.5-Sonnet (ReAct)
TP-Transformer
syntax, frame, coreference, and word embedding features
MedMobile (3.8B)
DRAGON + BioLinkBERT
T5-small+prolog
RGX
PaLM 540B (finetuned)
RoBERTa-large Tagger + LIQUID (Ensemble)
Masque (NarrativeQA + MS MARCO)
Atlas (full, Wiki-dec-2018 index)
DensePhrases
DPR
OpenAI/o3-mini-2025-01-31-high
FLAN 137B (zero-shot)
Fusion Retriever+ETC
GPT-4o-2024-08-06-128k
LLaMA 65B (0-shot)
SelfRAG-7b
BioMedGPT-10B
BioGPT-Large(1.5B)
Attentive LSTM
Longformer Encoder Decoder (base)
FlowQA (single model)
DeBERTa (large)
multimodal+LXMERT+ConstrainedMaxPooling
XLNet-large
HyperQA
LLaMA 65B (zero-shot)
CREMA
LUKE
T5-11B
XLNet (single model)
TP-MANN
Neo-6B (QA + WS)
BLOOMZ
PaLM 2 (few-shot, CoT, SC)
DeBERTaV3large
TagOp
QAap
ECONET
TANDA DeBERTa-V3-Large + ALL
PaLM 2-L (one-shot)
CoA
ByT5
Bing Chat
FiE+PAQ
ChatGPT
BigBird-etc
TANDA-RoBERTa (ASNQ, WikiQA)
TabSQLify (col+row)
sMIM (1024) +