Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks

Nils Reimers; Iryna Gurevych

Abstract

BERT (Devlin et al., 2018) and RoBERTa (Liu et al., 2019) have set a new state-of-the-art performance on sentence-pair regression tasks like semantic textual similarity (STS). However, they require that both sentences be fed into the network, which causes a massive computational overhead: finding the most similar pair in a collection of 10,000 sentences requires about 50 million inference computations (~65 hours) with BERT. The construction of BERT makes it unsuitable for semantic similarity search as well as for unsupervised tasks like clustering. In this publication, we present Sentence-BERT (SBERT), a modification of the pretrained BERT network that uses siamese and triplet network structures to derive semantically meaningful sentence embeddings that can be compared using cosine similarity. This reduces the effort for finding the most similar pair from 65 hours with BERT / RoBERTa to about 5 seconds with SBERT, while maintaining the accuracy of BERT. We evaluate SBERT and SRoBERTa on common STS tasks and transfer learning tasks, where they outperform other state-of-the-art sentence embedding methods.
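
The "about 50 million" figure in the abstract is the number of unordered pairs among 10,000 sentences, n(n-1)/2. The sketch below checks that arithmetic and shows how a most-similar-pair search could look once each sentence has been encoded to a fixed vector a single time. It is a minimal illustration, not the authors' implementation: the function names and the toy vectors are assumptions, and real embeddings would come from a trained sentence encoder.

```python
import math
from itertools import combinations


def pair_count(n):
    """Number of unordered pairs among n items: n * (n - 1) / 2."""
    return n * (n - 1) // 2


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def most_similar_pair(embeddings):
    """Return (i, j, score) for the pair with the highest cosine similarity."""
    best = None
    for i, j in combinations(range(len(embeddings)), 2):
        score = cosine_similarity(embeddings[i], embeddings[j])
        if best is None or score > best[2]:
            best = (i, j, score)
    return best


# Hand-written toy vectors standing in for sentence embeddings (hypothetical values).
toy_embeddings = [
    [1.0, 0.0, 0.0],
    [0.9, 0.1, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
]

print(pair_count(10_000))                     # 49995000, roughly 50 million pairs
print(most_similar_pair(toy_embeddings)[:2])  # (0, 1)
```

The loop still visits every pair, but each visit is a cheap vector operation rather than a full forward pass through the network, which is the source of the speed-up the abstract describes.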

Code Repositories

aneesha/SiameseBERT-Notebook
datcancode/sentence-transformers (pytorch)
BM-K/KoSentenceBERT_ETRI (pytorch)
sjtu-lit/syncse (pytorch)
reoneo97/wutr-buildon-2021 (pytorch)
BM-K/KoSentenceBERT (pytorch)
princeton-nlp/SimCSE (pytorch)
oto-labs/librarian
kihohan/NLP_Reference (pytorch)
zhihaillm/wisdominterrogatory (pytorch)
jcyk/mse-amr (pytorch)
yjiangcm/dcpcse (pytorch)
croitorualin/reverse-stable-diffusion (pytorch)
rmslick/SummarySearch (pytorch)
eric11eca/NeuralLog
gmcgoldr/theissues (pytorch)
brightjade/CS492E-CiteRec (pytorch)
Siamul/NLP-Project (pytorch)
lambert-x/prolab (pytorch)
idiap/analogy_learning (pytorch)
dmmiller612/bert-extractive-summarizer (pytorch)
puerrrr/focal-infonce (pytorch)
yjiangcm/promcse (pytorch)
hkust-nlp/syncse (pytorch)
nuochenpku/sscl (pytorch)
BinWang28/SBERT-WK-Sentence-Embedding (pytorch)
Danqi7/584-final (pytorch)
BinWang28/BERT_Sentence_Embedding (pytorch)
law-ai/summarization (pytorch)
UKPLab/sentence-transformers (official, pytorch)
BM-K/KoSentenceBERT_SKT (pytorch)
valdecy/pybibx (tf)
BM-K/KoSentenceBERT_SKTBERT (pytorch)
hhzrd/BEFAQ
yur7nd/ptss (pytorch)
TheNeuromancer/SentEmb (pytorch)
eelenadelolmo/WordVectors (pytorch)
bm-k/kosentencebert-skt (pytorch)
martinomensio/spacy-sentence-bert (pytorch)
InsaneLife/dssm (tf)
xiaoouwang/frenchnlp (pytorch)

Benchmarks

Benchmark | Methodology | Metrics
semantic-textual-similarity-on-sick | SRoBERTa-NLI-large | Spearman Correlation: 0.7429
semantic-textual-similarity-on-sick | SRoBERTa-NLI-base | Spearman Correlation: 0.7446
semantic-textual-similarity-on-sick | SBERT-NLI-base | Spearman Correlation: 0.7291
semantic-textual-similarity-on-sick | SBERT-NLI-large | Spearman Correlation: 0.7375
semantic-textual-similarity-on-sick | SentenceBERT | Spearman Correlation: 0.7462
semantic-textual-similarity-on-sts-benchmark | SRoBERTa-NLI-STSb-large | Spearman Correlation: 0.8615
semantic-textual-similarity-on-sts-benchmark | SBERT-NLI-base | Spearman Correlation: 0.7703
semantic-textual-similarity-on-sts-benchmark | SRoBERTa-NLI-base | Spearman Correlation: 0.7777
semantic-textual-similarity-on-sts-benchmark | SBERT-NLI-large | Spearman Correlation: 0.79
semantic-textual-similarity-on-sts-benchmark | SBERT-STSb-base | Spearman Correlation: 0.8479
semantic-textual-similarity-on-sts-benchmark | SBERT-STSb-large | Spearman Correlation: 0.8445
semantic-textual-similarity-on-sts12 | SRoBERTa-NLI-large | Spearman Correlation: 0.7453
semantic-textual-similarity-on-sts13 | SBERT-NLI-large | Spearman Correlation: 0.7846
semantic-textual-similarity-on-sts14 | SBERT-NLI-large | Spearman Correlation: 0.749
semantic-textual-similarity-on-sts15 | SRoBERTa-NLI-large | Spearman Correlation: 0.8185
semantic-textual-similarity-on-sts16 | SRoBERTa-NLI-large | Spearman Correlation: 0.7682
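
Every score above is a Spearman rank correlation, which measures how well the ordering of predicted similarity scores agrees with the ordering of the gold labels. The following is a minimal sketch of that metric in plain Python, assuming it is computed as the Pearson correlation of average ranks. The helper names and the four example scores are hypothetical and are not taken from any of the benchmarks.

```python
import math


def average_ranks(values):
    """Assign 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the window over a run of equal values.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        mean_rank = (pos + end) / 2 + 1
        for k in range(pos, end + 1):
            ranks[order[k]] = mean_rank
        pos = end + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def spearman(x, y):
    """Spearman correlation: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical model scores (cosine similarities) and gold similarity labels.
predicted = [0.91, 0.15, 0.62, 0.40]
gold = [4.8, 0.5, 3.9, 2.0]

# Both lists order the four pairs identically, so the correlation is 1
# (up to floating-point rounding).
print(spearman(predicted, gold))
```

Because only the ranks matter, the metric is unaffected by the scale of the predicted scores: a cosine similarity in [-1, 1] can be compared directly against gold labels on a different scale.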
