5 months ago

Trankit: A Light-Weight Transformer-based Toolkit for Multilingual Natural Language Processing

Minh Van Nguyen; Viet Dac Lai; Amir Pouran Ben Veyseh; Thien Huu Nguyen

Abstract

We introduce Trankit, a light-weight Transformer-based Toolkit for multilingual Natural Language Processing (NLP). It provides a trainable pipeline for fundamental NLP tasks in over 100 languages, and 90 pretrained pipelines for 56 languages. Built on a state-of-the-art pretrained language model, Trankit significantly outperforms prior multilingual NLP pipelines on sentence segmentation, part-of-speech tagging, morphological feature tagging, and dependency parsing, while maintaining competitive performance on tokenization, multi-word token expansion, and lemmatization across 90 Universal Dependencies treebanks. Despite its use of a large pretrained transformer, our toolkit remains efficient in memory usage and speed. This is achieved by our novel plug-and-play mechanism with Adapters, in which a single multilingual pretrained transformer is shared across the pipelines for different languages. Our toolkit, along with pretrained models and code, is publicly available at: https://github.com/nlp-uoregon/trankit. A demo website for our toolkit is also available at: http://nlp.uoregon.edu/trankit. Finally, we provide a demo video for Trankit at: https://youtu.be/q0KGP3zGjGc.
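The plug-and-play idea in the abstract (one shared multilingual encoder, with small language-specific Adapters swapped in and out) can be illustrated with a toy sketch. This is not Trankit's implementation or API: the class names `SharedEncoder`, `Adapter`, and `MultilingualPipeline`, the methods `add_language` and `activate`, the layer sizes, and the random weights are all assumptions made purely for illustration, and the "encoder" is a single dense layer standing in for a real transformer.

```python
import math
import random


def matvec(matrix, vector):
    """Multiply a row-major matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def random_matrix(rows, cols, rng, scale=0.5):
    """Toy weight initialisation; real weights would be learned."""
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]


class SharedEncoder:
    """Stand-in for the large multilingual transformer: created once, reused by all languages."""

    def __init__(self, hidden_size, rng):
        self.weights = random_matrix(hidden_size, hidden_size, rng)

    def __call__(self, features):
        return [math.tanh(v) for v in matvec(self.weights, features)]


class Adapter:
    """Small bottleneck module: down-project, nonlinearity, up-project, residual add."""

    def __init__(self, hidden_size, bottleneck_size, rng):
        self.down = random_matrix(bottleneck_size, hidden_size, rng)
        self.up = random_matrix(hidden_size, bottleneck_size, rng)

    def __call__(self, hidden):
        bottleneck = [math.tanh(v) for v in matvec(self.down, hidden)]
        delta = matvec(self.up, bottleneck)
        return [h + d for h, d in zip(hidden, delta)]


class MultilingualPipeline:
    """Hypothetical pipeline: one shared encoder plus one adapter per language."""

    def __init__(self, hidden_size=8, bottleneck_size=2, seed=0):
        self._rng = random.Random(seed)
        self.hidden_size = hidden_size
        self.bottleneck_size = bottleneck_size
        self.encoder = SharedEncoder(hidden_size, self._rng)
        self.adapters = {}
        self.active = None

    def add_language(self, language):
        # Only the small adapter is allocated; the encoder is not duplicated.
        self.adapters[language] = Adapter(self.hidden_size, self.bottleneck_size, self._rng)
        if self.active is None:
            self.active = language

    def activate(self, language):
        if language not in self.adapters:
            raise KeyError(f"no adapter loaded for {language!r}")
        self.active = language

    def encode(self, features):
        # Shared representation first, then the active language's adapter.
        return self.adapters[self.active](self.encoder(features))

    def parameter_counts(self):
        """Return (shared encoder weights, weights added per language)."""
        shared = self.hidden_size * self.hidden_size
        per_language = 2 * self.hidden_size * self.bottleneck_size
        return shared, per_language
```

In this sketch, adding a language allocates only the adapter's weights, so the cost per language is governed by the bottleneck size rather than by the size of the shared encoder. That is the general property the abstract credits for the toolkit's memory efficiency; the actual parameter counts in Trankit are not given in this text.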

Code Repositories

nlp-uoregon/trankit (official, PyTorch)

Benchmarks

Benchmark                              Methodology  Metrics
dependency-parsing-on-ud2-5-test       Trankit      Macro-averaged F1: 87.06
dependency-parsing-on-ud2-5-test       Stanza       Macro-averaged F1: 83.06
part-of-speech-tagging-on-ud2-5-test   Trankit      Macro-averaged F1: 95.65
part-of-speech-tagging-on-ud2-5-test   Stanza       Macro-averaged F1: 94.21
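The benchmark figures are reported as macro-averaged F1. As a minimal sketch, assuming that macro-averaging here means an unweighted mean of per-treebank F1 scores, the snippet below contrasts it with a size-weighted mean. The treebank names, scores, and sizes are hypothetical placeholders, not values from the paper.

```python
from statistics import mean


def macro_average(f1_by_treebank):
    """Unweighted mean: every treebank counts equally, regardless of size."""
    if not f1_by_treebank:
        raise ValueError("need at least one treebank score")
    return mean(f1_by_treebank.values())


def size_weighted_average(f1_by_treebank, size_by_treebank):
    """Mean weighted by treebank size, shown only for contrast."""
    total = sum(size_by_treebank[name] for name in f1_by_treebank)
    if total == 0:
        raise ValueError("total treebank size must be positive")
    weighted_sum = sum(f1 * size_by_treebank[name] for name, f1 in f1_by_treebank.items())
    return weighted_sum / total


# Hypothetical per-treebank scores and sizes, for illustration only.
example_f1 = {"treebank_a": 90.0, "treebank_b": 80.0, "treebank_c": 85.0}
example_sizes = {"treebank_a": 1000, "treebank_b": 100, "treebank_c": 100}

macro = macro_average(example_f1)
weighted = size_weighted_average(example_f1, example_sizes)
```

Under a macro average, small treebanks (often lower-resource languages) influence the headline number as much as large ones, which is why it is a common choice for multilingual comparisons.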
