HyperAIHyperAI

Command Palette

Search for a command to run...

3 months ago

Hierarchical Meta-Embeddings for Code-Switching Named Entity Recognition

Genta Indra Winata Zhaojiang Lin Jamin Shin Zihan Liu Pascale Fung

Hierarchical Meta-Embeddings for Code-Switching Named Entity Recognition

Abstract

In countries that speak multiple main languages, mixing up different languages within a conversation is commonly called code-switching. Previous works addressing this challenge mainly focused on word-level aspects such as word embeddings. However, in many cases, languages share common subwords, especially for closely related languages, but also for languages that are seemingly irrelevant. Therefore, we propose Hierarchical Meta-Embeddings (HME) that learn to combine multiple monolingual word-level and subword-level embeddings to create language-agnostic lexical representations. On the task of Named Entity Recognition for English-Spanish code-switching data, our model achieves the state-of-the-art performance in the multilingual settings. We also show that, in cross-lingual settings, our model not only leverages closely related languages, but also learns from languages with different roots. Finally, we show that combining different subunits are crucial for capturing code-switching entities.

Code Repositories

gentaiscool/meta-emb
Official
pytorch
Mentioned in GitHub

Benchmarks

BenchmarkMethodologyMetrics
named-entity-recognition-on-code-switchingHME (word + BPE + char)
F1: 69.17

Build AI with AI

From idea to launch — accelerate your AI development with free AI co-coding, out-of-the-box environment and best price of GPUs.

AI Co-coding
Ready-to-use GPUs
Best Pricing
Get Started

Hyper Newsletters

Subscribe to our latest updates
We will deliver the latest updates of the week to your inbox at nine o'clock every Monday morning
Powered by MailChimp