RealFormer: Transformer Likes Residual Attention

Ruining He, Anirudh Ravula, Bhargav Kanagal, Joshua Ainslie

Abstract

Transformer is the backbone of modern NLP models. In this paper, we propose RealFormer, a simple and generic technique to create Residual Attention Layer Transformer networks that significantly outperform the canonical Transformer and its variants (BERT, ETC, etc.) on a wide spectrum of tasks including Masked Language Modeling, GLUE, SQuAD, Neural Machine Translation, WikiHop, HotpotQA, Natural Questions, and OpenKP. We also observe empirically that RealFormer stabilizes training and leads to models with sparser attention. Source code and pre-trained checkpoints for RealFormer can be found at https://github.com/google-research/google-research/tree/master/realformer.
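
The abstract does not spell out the mechanism, so here is a minimal sketch of the residual attention idea as the RealFormer paper formulates it: each layer adds the previous layer's pre-softmax attention scores to its own scores before applying softmax. The sketch is single-head, uses only the Python standard library, and feeds the layer input directly as queries, keys, and values instead of using learned projections. The helper names (matmul, softmax_rows, residual_attention) are illustrative assumptions and do not come from the official repository.

```python
import math


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(m):
    """Return the transpose of a matrix given as a list of rows."""
    return [list(col) for col in zip(*m)]


def softmax_rows(m):
    """Apply a numerically stable softmax to every row of a matrix."""
    out = []
    for row in m:
        peak = max(row)
        exps = [math.exp(value - peak) for value in row]
        total = sum(exps)
        out.append([e / total for e in exps])
    return out


def residual_attention(q, k, v, prev_scores=None):
    """Single-head attention with a residual connection on the raw scores.

    Returns (output, scores). The scores are pre-softmax and are meant to be
    passed to the next layer as prev_scores.
    """
    d = len(q[0])
    # Standard scaled dot-product scores: Q K^T / sqrt(d).
    scores = [[s / math.sqrt(d) for s in row] for row in matmul(q, transpose(k))]
    if prev_scores is not None:
        # Residual step: add the previous layer's pre-softmax scores.
        scores = [
            [s + p for s, p in zip(row, prev_row)]
            for row, prev_row in zip(scores, prev_scores)
        ]
    weights = softmax_rows(scores)
    return matmul(weights, v), scores


# Toy stack of two layers (assumption: no learned projections, no
# feed-forward sublayers, just the attention score pathway).
x = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
prev = None
for _ in range(2):
    x, prev = residual_attention(x, x, x, prev)
```

The first layer receives no previous scores and behaves like ordinary attention. In a multi-head model the same addition would be applied per head, which requires consecutive layers to share the number of heads and the sequence length.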

Code Repositories

Repositories mentioned in GitHub (framework as listed on the page):

JunnYu/x-transformers-paddle (jax)
aivolcano/BERT_MRC_CLS (pytorch)
cloneofsimo/RealFormer-pytorch (pytorch)
jaketae/realformer (pytorch)

Benchmarks

| Benchmark | Methodology | Metrics |
| --- | --- | --- |
| linguistic-acceptability-on-cola | RealFormer | Accuracy: 59.83% |
| natural-language-inference-on-multinli | RealFormer | Matched: 86.28; Mismatched: 86.34 |
| natural-language-inference-on-qnli | RealFormer | Accuracy: 91.89% |
| natural-language-inference-on-rte | RealFormer | Accuracy: 73.7% |
| paraphrase-identification-on-quora-question | RealFormer | Accuracy: 91.34; F1: 88.28 |
| semantic-textual-similarity-on-mrpc | RealFormer | Accuracy: 87.01%; F1: 90.91% |
| semantic-textual-similarity-on-sts-benchmark | RealFormer | Pearson Correlation: 0.9011; Spearman Correlation: 0.8988 |
| sentiment-analysis-on-sst-2-binary | RealFormer | Accuracy: 94.04 |
