Incorporating Word Attention into Character-Based Word Segmentation

Shohei Higashiyama, Masao Utiyama, Yoshiaki Oida, Yohei Sakamoto, Masao Ideuchi, Eiichiro Sumita, Isaac Okada

Abstract

Neural network models have been actively applied to word segmentation, especially for Chinese, because they minimize the effort required for feature engineering. Typical segmentation models are categorized as either character-based, which permit exact inference, or word-based, which utilize word-level information. We propose a character-based model that utilizes word information to leverage the advantages of both types of models. Our model uses an attention mechanism to learn the importance of multiple candidate words for each character, and uses this learned importance when making segmentation decisions. The experimental results show that our model achieves better performance than the state-of-the-art models on both Japanese and Chinese benchmark datasets.
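
The idea of weighting several candidate words per character can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the scoring function, so plain dot-product attention is assumed here, and the names `candidate_words`, `attend`, and `max_len` are hypothetical.

```python
import math


def candidate_words(sentence, vocab, max_len=4):
    """For each character index, list the vocabulary words that span it.

    Assumption: candidates are all substrings of up to max_len characters
    that appear in a given word vocabulary.
    """
    cands = [[] for _ in sentence]
    n = len(sentence)
    for start in range(n):
        for end in range(start + 1, min(n, start + max_len) + 1):
            word = sentence[start:end]
            if word in vocab:
                for i in range(start, end):
                    cands[i].append(word)
    return cands


def softmax(scores):
    """Numerically stable softmax over a non-empty list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(char_vec, word_vecs):
    """Dot-product attention (assumed) of one character over its candidates.

    Returns the attention weights and the weighted sum of the word vectors.
    With no candidates, returns no weights and a zero vector.
    """
    dim = len(char_vec)
    if not word_vecs:
        return [], [0.0] * dim
    scores = [sum(c * w for c, w in zip(char_vec, vec)) for vec in word_vecs]
    weights = softmax(scores)
    summary = [
        sum(wt * vec[d] for wt, vec in zip(weights, word_vecs))
        for d in range(dim)
    ]
    return weights, summary


if __name__ == "__main__":
    toy_vocab = {"ab", "abc", "cd", "d"}
    print(candidate_words("abcd", toy_vocab))
    weights, summary = attend([1.0, 0.0], [[1.0, 0.0], [0.0, 1.0]])
    print(weights, summary)
```

In a full model the weighted summary would typically be combined with the character representation before a tag is predicted for that character; that step is omitted here because the abstract does not describe it.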

Benchmarks

Benchmark | Methodology | Metrics
japanese-word-segmentation-on-bccwj | Word Attention | F1-score (Word): 0.9893
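
For readers unfamiliar with the metric, the following sketch shows how a word-level F1 score is commonly computed for segmentation, by comparing word spans between a gold and a predicted segmentation. This page does not state how the reported score was computed, so the definition below is an assumption, and `word_spans` and `word_f1` are hypothetical helper names.

```python
def word_spans(words):
    """Convert a list of words into a set of (start, end) character spans."""
    spans = set()
    start = 0
    for w in words:
        spans.add((start, start + len(w)))
        start += len(w)
    return spans


def word_f1(gold_words, pred_words):
    """Word-level F1: a predicted word counts as correct only if both of
    its boundaries match a gold word exactly."""
    gold = word_spans(gold_words)
    pred = word_spans(pred_words)
    correct = len(gold & pred)
    if correct == 0:
        return 0.0
    precision = correct / len(pred)
    recall = correct / len(gold)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Only the first word "ab" matches in both boundaries.
    print(word_f1(["ab", "c", "de"], ["ab", "cd", "e"]))
```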
