Bilingual Rhetorical Structure Parsing with Large Parallel Annotations
Elena Chistova

Abstract
Discourse parsing is a crucial task in natural language processing that aims to reveal the higher-level relations in a text. Despite growing interest in cross-lingual discourse parsing, challenges persist due to limited parallel data and inconsistencies in how Rhetorical Structure Theory (RST) is applied across languages and corpora. To address this, we introduce a parallel Russian annotation for the large and diverse English GUM RST corpus. Leveraging recent advances, our end-to-end RST parser achieves state-of-the-art results on both English and Russian corpora. It is effective in both monolingual and bilingual settings and transfers successfully even with limited second-language annotation. To the best of our knowledge, this work is the first to evaluate the potential of cross-lingual end-to-end RST parsing on a manually annotated parallel corpus.
Benchmarks
| Benchmark | Methodology | Standard Parseval (Full) | Standard Parseval (Nuclearity) | Standard Parseval (Relation) | Standard Parseval (Span) |
|---|---|---|---|---|---|
| discourse-parsing-on-rst-dt | DMRST | 55.7 ± 0.3 | 68.0 ± 0.6 | 57.3 ± 0.2 | 78.7 ± 0.4 |
| end-to-end-rst-parsing-on-rst-dt-1 | DMRST + ToNy + E-BiLSTM | 53.0 ± 0.7 | 64.5 ± 0.8 | 54.5 ± 0.7 | 74.8 ± 0.5 |
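
The page does not define the four Parseval scores, so the following is only a minimal sketch of how labeled-span F1 scores of this kind are commonly computed, not the evaluation code behind the numbers above. It assumes each internal tree node is given as a span with a nuclearity and a relation label; the names `Node` and `parseval_f1` are hypothetical, and details such as averaging across documents may differ in the actual evaluation.

```python
from typing import Iterable, NamedTuple


class Node(NamedTuple):
    """One internal node of a discourse tree (hypothetical representation)."""
    start: int        # index of the first unit covered by the span
    end: int          # index of the last unit covered by the span (inclusive)
    nuclearity: str   # e.g. "NS", "SN", "NN"
    relation: str     # e.g. "Elaboration"


# Each metric compares a different projection of the node.
PROJECTIONS = {
    "Span": lambda n: (n.start, n.end),
    "Nuclearity": lambda n: (n.start, n.end, n.nuclearity),
    "Relation": lambda n: (n.start, n.end, n.relation),
    "Full": lambda n: tuple(n),
}


def parseval_f1(gold: Iterable[Node], pred: Iterable[Node]) -> dict:
    """Return F1 per metric for one gold tree and one predicted tree."""
    gold, pred = list(gold), list(pred)
    scores = {}
    for name, project in PROJECTIONS.items():
        g = {project(n) for n in gold}
        p = {project(n) for n in pred}
        matched = len(g & p)
        precision = matched / len(p) if p else 0.0
        recall = matched / len(g) if g else 0.0
        total = precision + recall
        scores[name] = 2 * precision * recall / total if total else 0.0
    return scores


# Toy example over four units: the trees share two of three spans.
gold = [
    Node(0, 3, "NS", "Elaboration"),
    Node(0, 1, "NN", "Joint"),
    Node(2, 3, "SN", "Attribution"),
]
pred = [
    Node(0, 3, "NS", "Elaboration"),
    Node(1, 3, "NN", "Joint"),
    Node(2, 3, "SN", "Background"),
]
scores = parseval_f1(gold, pred)
```

In this toy example, Span and Nuclearity both come out at 2/3, while Relation and Full drop to 1/3 because only the root node carries the correct relation label. The same ordering (Span ≥ Nuclearity, Relation ≥ Full) appears in the table, since each stricter metric requires more of the node to match.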