ABCNet v2: Adaptive Bezier-Curve Network for Real-time End-to-end Text Spotting
Yuliang Liu, Chunhua Shen, Lianwen Jin, Tong He, Peng Chen, Chongyu Liu, Hao Chen

Abstract
End-to-end text spotting, which aims to integrate detection and recognition in a unified framework, has attracted increasing attention because it handles these two complementary tasks in one simple pipeline. It remains an open problem, especially for arbitrarily shaped text instances. Previous methods can be roughly categorized into two groups, character-based and segmentation-based, which often require character-level annotations and/or complex post-processing because their output is unstructured. Here, we tackle end-to-end text spotting by presenting the Adaptive Bezier Curve Network v2 (ABCNet v2). Our main contributions are four-fold: 1) For the first time, we adaptively fit arbitrarily shaped text with a parameterized Bezier curve, which, compared with segmentation-based methods, provides not only structured output but also a controllable representation. 2) We design a novel BezierAlign layer for extracting accurate convolutional features of text instances of arbitrary shape, significantly improving recognition precision over previous methods. 3) Unlike previous methods, which often suffer from complex post-processing and sensitive hyper-parameters, ABCNet v2 maintains a simple pipeline whose only post-processing step is non-maximum suppression (NMS). 4) Because text recognition performance depends closely on feature alignment, ABCNet v2 further adopts a simple yet effective coordinate convolution to encode the position of the convolutional filters, which leads to a considerable improvement with negligible computational overhead. Comprehensive experiments on various bilingual (English and Chinese) benchmark datasets demonstrate that ABCNet v2 achieves state-of-the-art performance while maintaining very high efficiency.
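To make the Bezier-curve representation concrete, the following is a minimal sketch, not the authors' implementation. It assumes each long side of a text instance is described by a Bezier curve given as a list of control points (four points, i.e. a cubic curve, in the example), and it evaluates the curve with the standard Bernstein-polynomial form. The helper `sampling_grid` is a hypothetical name introduced here for illustration: it interpolates linearly between a top and a bottom curve to produce the kind of curved sampling grid that an alignment layer such as BezierAlign could read features from. The feature lookup itself (for example, bilinear interpolation on a feature map) is omitted.

```python
from math import comb


def bezier_point(control_points, t):
    """Evaluate a Bezier curve at parameter t in [0, 1].

    Uses the Bernstein basis: B_{i,n}(t) = C(n, i) * t^i * (1 - t)^(n - i).
    control_points is a list of (x, y) tuples; four points give a cubic curve.
    """
    n = len(control_points) - 1
    x = y = 0.0
    for i, (px, py) in enumerate(control_points):
        weight = comb(n, i) * (t ** i) * ((1 - t) ** (n - i))
        x += weight * px
        y += weight * py
    return (x, y)


def sampling_grid(top, bottom, width, height):
    """Hypothetical helper: build a height x width grid of (x, y) points.

    Each column takes one point on the top curve and one on the bottom
    curve at the same parameter u, and each row interpolates linearly
    between them with weight v.
    """
    grid = []
    for r in range(height):
        v = r / (height - 1) if height > 1 else 0.0
        row = []
        for c in range(width):
            u = c / (width - 1) if width > 1 else 0.0
            tx, ty = bezier_point(top, u)
            bx, by = bezier_point(bottom, u)
            row.append((tx * (1 - v) + bx * v, ty * (1 - v) + by * v))
        grid.append(row)
    return grid


if __name__ == "__main__":
    # Illustrative control points for a gently arched text line.
    top = [(0, 0), (1, 2), (3, 2), (4, 0)]
    bottom = [(0, -1), (1, 1), (3, 1), (4, -1)]
    for row in sampling_grid(top, bottom, width=5, height=3):
        print([(round(x, 2), round(y, 2)) for x, y in row])
```

Because the output is a handful of control points rather than a pixel mask, the grid resolution (`width`, `height`) can be chosen independently of the detected shape, which is one way to read the abstract's claim of a structured and controllable representation.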
Benchmarks
| Benchmark | Methodology | Metrics |
|---|---|---|
| text-spotting-on-icdar-2015 | ABCNet v2 | F-measure (%) - Generic Lexicon: 73.0; F-measure (%) - Strong Lexicon: 82.7; F-measure (%) - Weak Lexicon: 78.5 |
| text-spotting-on-inverse-text | ABCNet v2 | F-measure (%) - Full Lexicon: 47.4; F-measure (%) - No Lexicon: 34.5 |
| text-spotting-on-scut-ctw1500 | ABCNet v2 | F-measure (%) - Full Lexicon: 77.2; F-measure (%) - No Lexicon: 57.5 |
| text-spotting-on-total-text | ABCNet v2 | F-measure (%) - Full Lexicon: 78.1; F-measure (%) - No Lexicon: 70.4 |