Question Answering on SQuAD1.1 dev

Metrics

Results

Performance results of various models on this benchmark

Model	EM	F1	Paper Title
XLNet+DSC	89.79	95.77	Dice Loss for Data-imbalanced NLP Tasks
T5-11B	90.06	95.64	Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer
XLNet (single model)	89.7	95.1	XLNet: Generalized Autoregressive Pretraining for Language Understanding
LUKE 483M	-	95	LUKE: Deep Contextualized Entity Representations with Entity-aware Self-attention
T5-3B	88.53	94.95	Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer
T5-Large 770M	86.66	93.79	Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer
BERT-LARGE (Ensemble+TriviaQA)	86.2	92.2	BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
T5-Base	85.44	92.08	Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer
BERT-LARGE (Single+TriviaQA)	84.2	91.1	BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
BART Base (with text infilling)	-	90.8	BART: Denoising Sequence-to-Sequence Pre-training for Natural Language Generation, Translation, and Comprehension
BERT large (LAMB optimizer)	-	90.584	Large Batch Optimization for Deep Learning: Training BERT in 76 minutes
BERT-Large-uncased-PruneOFA (90% unstruct sparse)	83.35	90.2	Prune Once for All: Sparse Pre-Trained Language Models
BERT-Large-uncased-PruneOFA (90% unstruct sparse, QAT Int8)	83.22	90.02	Prune Once for All: Sparse Pre-Trained Language Models
BERT-Base-uncased-PruneOFA (85% unstruct sparse)	81.1	88.42	Prune Once for All: Sparse Pre-Trained Language Models
BERT-Base-uncased-PruneOFA (85% unstruct sparse, QAT Int8)	80.84	88.24	Prune Once for All: Sparse Pre-Trained Language Models
TinyBERT-6 67M	79.7	87.5	TinyBERT: Distilling BERT for Natural Language Understanding
BERT-Base-uncased-PruneOFA (90% unstruct sparse)	79.83	87.25	Prune Once for All: Sparse Pre-Trained Language Models
T5-Small	79.1	87.24	Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer
R.M-Reader (single)	78.9	86.3	Reinforced Mnemonic Reader for Machine Reading Comprehension
DensePhrases	78.3	86.3	Learning Dense Representations of Phrases at Scale
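
The two score columns in the table above are, as is standard for SQuAD1.1, exact match (EM) and token-level F1, both reported as percentages. The following is a minimal sketch of how these two metrics are commonly computed for a single question, modeled on the normalization used by the official SQuAD v1.1 evaluation script. The function names here are illustrative assumptions, and individual papers may use slightly different evaluation code.

```python
import re
import string
from collections import Counter


def normalize_answer(text: str) -> str:
    """Lowercase, strip punctuation, drop English articles, collapse whitespace."""
    text = text.lower()
    punctuation = set(string.punctuation)
    text = "".join(ch for ch in text if ch not in punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction: str, reference: str) -> float:
    """1.0 if the normalized strings are identical, else 0.0."""
    return float(normalize_answer(prediction) == normalize_answer(reference))


def f1_score(prediction: str, reference: str) -> float:
    """Harmonic mean of token-level precision and recall after normalization."""
    pred_tokens = normalize_answer(prediction).split()
    ref_tokens = normalize_answer(reference).split()
    # Multiset intersection counts each shared token at most as often
    # as it appears in both answers.
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)


def best_over_references(metric, prediction: str, references: list[str]) -> float:
    """SQuAD questions can have several gold answers; keep the best score."""
    return max(metric(prediction, ref) for ref in references)
```

Under this sketch, a leaderboard-style number would be the mean of `best_over_references` over all dev questions, multiplied by 100. F1 is more forgiving than EM because a prediction that only partially overlaps the gold answer still earns credit, which is consistent with every F1 value in the table exceeding its EM counterpart.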
