Embedding Strategies for Specialized Domains: Application to Clinical Entity Recognition

Pierre Zweigenbaum, Olivier Ferret, Hicham El Boukkouri, Thomas Lavergne

Abstract

Using pre-trained word embeddings in conjunction with Deep Learning models has become the "de facto" approach in Natural Language Processing (NLP). While this usually yields satisfactory results, off-the-shelf word embeddings tend to perform poorly on texts from specialized domains such as clinical reports. Moreover, training specialized word representations from scratch is often either impossible or ineffective due to the lack of large enough in-domain data. In this work, we focus on the clinical domain for which we study embedding strategies that rely on general-domain resources only. We show that by combining off-the-shelf contextual embeddings (ELMo) with static word2vec embeddings trained on a small in-domain corpus built from the task data, we manage to reach and sometimes outperform representations learned from a large corpus in the medical domain.
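The abstract does not say how the two representations are combined. One common option, assumed here purely for illustration, is to concatenate each token's contextual vector with its static vector before passing the result to a tagger. The sketch below uses toy lists in place of real ELMo and word2vec outputs; the function name `combine_embeddings` and the vector sizes are hypothetical.

```python
def combine_embeddings(contextual, static):
    """Concatenate per-token contextual and static vectors.

    Assumption: concatenation is the combination method; the abstract
    only says the two embedding types are "combined".
    """
    if len(contextual) != len(static):
        raise ValueError("both inputs must cover the same number of tokens")
    # One combined vector per token: contextual features first, then static.
    return [list(c) + list(s) for c, s in zip(contextual, static)]


# Toy stand-ins for a 3-token sentence (real vectors would be far larger).
elmo_like = [[0.1, 0.2, 0.3, 0.4], [0.5, 0.6, 0.7, 0.8], [0.9, 1.0, 1.1, 1.2]]
w2v_like = [[1.0, -1.0], [2.0, -2.0], [3.0, -3.0]]

combined = combine_embeddings(elmo_like, w2v_like)
print(len(combined), len(combined[0]))
```

Concatenation keeps both signals intact and lets the downstream model decide how much weight each one deserves, at the cost of a larger input dimension.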

Benchmarks

Benchmark: clinical-concept-extraction-on-2010-i2b2va
Methodology: ELMo (finetuned on i2b2) + word2vec (i2b2)
Metrics: Exact Span F1: 86.23
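For readers unfamiliar with the metric, the following is a minimal sketch of how an exact span F1 can be computed. It assumes that a predicted entity counts as correct only when its start, end, and type all match a gold entity; the benchmark's official scorer may differ in detail. The function name `exact_span_f1` and the example spans are illustrative, not taken from the paper.

```python
def exact_span_f1(gold, predicted):
    """F1 over entity spans, counting only exact (start, end, type) matches."""
    gold, predicted = set(gold), set(predicted)
    true_positives = len(gold & predicted)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical annotations: (start token, end token, entity type).
gold = {(0, 2, "problem"), (5, 6, "treatment"), (9, 10, "test")}
pred = {(0, 2, "problem"), (5, 7, "treatment"), (9, 10, "test")}

score = exact_span_f1(gold, pred)
print(round(score, 4))
```

In this toy case the second prediction overshoots the gold boundary by one token, so it earns no credit under exact matching even though it overlaps the correct span.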
