BioELECTRA: Pretrained Biomedical text Encoder using Discriminators

Malaikannan Sankarasubbu, Bhuvana Kundumani, Kamal raj Kanakarajan

Abstract

Recent advancements in pretraining strategies in NLP have shown significant improvements in model performance on various text mining tasks. We apply the ‘replaced token detection’ pretraining technique proposed by ELECTRA and pretrain a biomedical language model from scratch using biomedical text and vocabulary. We introduce BioELECTRA, a biomedical domain-specific language encoder model that adapts ELECTRA for the biomedical domain. We evaluate our model on the BLURB and BLUE biomedical NLP benchmarks. BioELECTRA outperforms previous models and achieves state of the art (SOTA) on all 13 datasets in the BLURB benchmark and on all 4 clinical datasets from the BLUE benchmark, across 7 different NLP tasks. BioELECTRA, pretrained on PubMed and PMC full-text articles, performs very well on clinical datasets as well. BioELECTRA achieves a new SOTA of 86.34% (1.39% accuracy improvement) on MedNLI and 64% (2.98% accuracy improvement) on the PubMedQA dataset.
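
In the ‘replaced token detection’ objective, a small generator corrupts some input tokens and a discriminator is trained to classify every token as original or replaced. The sketch below shows how such a discriminator could be queried with the HuggingFace transformers library; the checkpoint name is an assumption based on the authors' public release, so substitute your own BioELECTRA discriminator checkpoint if it differs.

```python
# A minimal sketch of ELECTRA-style replaced token detection at inference
# time. The checkpoint name is an assumption (the authors' public release
# on the HuggingFace Hub); swap in your own BioELECTRA discriminator if needed.
import torch
from transformers import AutoTokenizer, ElectraForPreTraining

model_name = "kamalkraj/bioelectra-base-discriminator-pubmed"  # assumed checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_name)
discriminator = ElectraForPreTraining.from_pretrained(model_name)

# A biomedical sentence with one deliberately corrupted token
# ("protein" replaced by "banana").
sentence = "The banana binds to the receptor on the cell surface."
inputs = tokenizer(sentence, return_tensors="pt")

with torch.no_grad():
    logits = discriminator(**inputs).logits  # one logit per token

# Positive logits mean the discriminator predicts the token was replaced.
tokens = tokenizer.convert_ids_to_tokens(inputs["input_ids"].squeeze().tolist())
for token, replaced in zip(tokens, (logits.squeeze() > 0).tolist()):
    print(f"{token:>12}  {'REPLACED' if replaced else 'original'}")
```

During pretraining, this binary classification loss is computed over every input position rather than only the masked subset used by masked language modeling, which is what makes the objective more sample-efficient than BERT-style pretraining.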

Benchmarks

Benchmark                                    Methodology         Metrics
medical-named-entity-recognition-on-share   BioELECTRA          F1: 0.8371
natural-language-inference-on-mednli        BioELECTRA-Base     Accuracy: 86.34; Params (M): 110
question-answering-on-pubmedqa              BioELECTRA uncased  Accuracy: 64.2
