Poly-encoders: Transformer Architectures and Pre-training Strategies for Fast and Accurate Multi-sentence Scoring
Samuel Humeau; Kurt Shuster; Marie-Anne Lachaux; Jason Weston

Abstract
The use of deep pre-trained bidirectional transformers has led to remarkable progress in a number of applications (Devlin et al., 2018). For tasks that make pairwise comparisons between sequences, matching a given input with a corresponding label, two approaches are common: Cross-encoders performing full self-attention over the pair and Bi-encoders encoding the pair separately. The former often performs better, but is too slow for practical use. In this work, we develop a new transformer architecture, the Poly-encoder, that learns global rather than token level self-attention features. We perform a detailed comparison of all three approaches, including what pre-training and fine-tuning strategies work best. We show our models achieve state-of-the-art results on three existing tasks; that Poly-encoders are faster than Cross-encoders and more accurate than Bi-encoders; and that the best results are obtained by pre-training on large datasets similar to the downstream tasks.
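As described in the paper, the Poly-encoder uses m learned "code" vectors that attend over the context's token embeddings to produce m global features; the candidate embedding then attends over those features, and the final score is a dot product, so candidate representations can be pre-computed and cached. The sketch below is a minimal illustration of that scoring step, not the authors' ParlAI implementation; it assumes pre-computed token embeddings from any encoder (e.g. BERT), and the names `PolyEncoderScorer` and `n_codes` as well as the tensor shapes are illustrative.

```python
# Minimal Poly-encoder scoring sketch (illustrative; not the authors' ParlAI code).
# Assumes token-level context embeddings and per-candidate vectors are precomputed.
import torch
import torch.nn as nn
import torch.nn.functional as F

class PolyEncoderScorer(nn.Module):
    def __init__(self, dim: int, n_codes: int = 64):
        super().__init__()
        # m learned "codes"; each extracts one global feature from the context.
        self.codes = nn.Parameter(torch.randn(n_codes, dim) * 0.02)

    def forward(self, ctxt_tokens: torch.Tensor, cand_emb: torch.Tensor) -> torch.Tensor:
        # ctxt_tokens: (batch, seq_len, dim) token-level context embeddings
        # cand_emb:    (batch, n_cands, dim) one vector per candidate label
        # 1) Each code attends over the context tokens -> m global features.
        attn = F.softmax(self.codes @ ctxt_tokens.transpose(1, 2), dim=-1)  # (batch, m, seq)
        global_feats = attn @ ctxt_tokens                                   # (batch, m, dim)
        # 2) Each candidate attends over the m global features.
        w = F.softmax(cand_emb @ global_feats.transpose(1, 2), dim=-1)      # (batch, n_cands, m)
        ctxt_emb = w @ global_feats                                         # (batch, n_cands, dim)
        # 3) Final score is a dot product, so candidates can be encoded once and cached.
        return (ctxt_emb * cand_emb).sum(dim=-1)                            # (batch, n_cands)

# Example usage with random embeddings in place of real encoder outputs.
scores = PolyEncoderScorer(dim=768, n_codes=64)(
    torch.randn(2, 40, 768), torch.randn(2, 10, 768))
print(scores.shape)  # torch.Size([2, 10])
```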
Benchmarks
| Benchmark | Methodology | Metrics |
|---|---|---|
| conversational-response-selection-on-douban-1 | Poly-encoder | MAP: 0.608, MRR: 0.650, P@1: 0.475, R10@1: 0.299, R10@2: 0.494, R10@5: 0.822 |
| conversational-response-selection-on-dstc7 | Bi-encoder | 1-of-100 Accuracy: 66.3% |
| conversational-response-selection-on-dstc7 | Bi-encoder (v2) | 1-of-100 Accuracy: 70.9% |
| conversational-response-selection-on-rrs-1 | Poly-encoder | NDCG@3: 0.679, NDCG@5: 0.765 |
| conversational-response-selection-on-ubuntu-1 | Poly-encoder | R10@1: 0.882, R10@2: 0.949, R10@5: 0.990 |