Xiang Zhang, Yongwen Su, Subarna Tripathi, Zhuowen Tu

Abstract
In this paper, we present TExt Spotting TRansformers (TESTR), a generic end-to-end text spotting framework that uses Transformers for text detection and recognition in the wild. TESTR builds upon a single encoder and dual decoders for joint text-box control point regression and character recognition. Unlike most existing methods, ours is free from Region-of-Interest operations and heuristics-driven post-processing procedures; TESTR is particularly effective on curved text-boxes, where special care is needed to adapt traditional bounding-box representations. We show that our canonical representation of control points is suitable for text instances in both Bezier curve and polygon annotations. In addition, we design a bounding-box guided polygon detection (box-to-polygon) process. Experiments on curved and arbitrarily shaped datasets demonstrate state-of-the-art performance of the proposed TESTR algorithm.
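The abstract mentions that control points can describe a text instance in either Bezier curve or polygon form. As a rough illustration of how the two annotation styles relate, the sketch below samples a polygon from two cubic Bezier curves, one for the top edge and one for the bottom edge of a text instance. This is not taken from the TESTR code: the function names, the choice of four control points per curve, the number of sampled points, and the point ordering (top edge left to right, bottom edge right to left) are all assumptions made for illustration.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def cubic_bezier(ctrl: List[Point], t: float) -> Point:
    """Evaluate a cubic Bezier curve (4 control points) at t in [0, 1]."""
    (x0, y0), (x1, y1), (x2, y2), (x3, y3) = ctrl
    s = 1.0 - t
    # Bernstein basis polynomials of degree 3.
    b0, b1, b2, b3 = s ** 3, 3 * s * s * t, 3 * s * t * t, t ** 3
    x = b0 * x0 + b1 * x1 + b2 * x2 + b3 * x3
    y = b0 * y0 + b1 * y1 + b2 * y2 + b3 * y3
    return (x, y)


def bezier_pair_to_polygon(
    top: List[Point], bottom: List[Point], n_per_side: int = 8
) -> List[Point]:
    """Sample a closed polygon outline from two cubic Bezier curves.

    Assumption (hypothetical convention): `top` runs left to right and
    `bottom` runs right to left, so concatenating the samples walks
    once around the text instance.
    """
    if n_per_side < 2:
        raise ValueError("n_per_side must be at least 2")
    ts = [i / (n_per_side - 1) for i in range(n_per_side)]
    upper = [cubic_bezier(top, t) for t in ts]
    lower = [cubic_bezier(bottom, t) for t in ts]
    return upper + lower


# Hypothetical straight text line: both curves degenerate to segments.
top_ctrl = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0)]
bottom_ctrl = [(3.0, 1.0), (2.0, 1.0), (1.0, 1.0), (0.0, 1.0)]
polygon = bezier_pair_to_polygon(top_ctrl, bottom_ctrl, n_per_side=4)
```

With evenly spaced collinear control points, each curve reduces to a straight segment, so the sampled polygon is simply the rectangle outline. Curved text would use control points that bend the two edges.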
Benchmarks
| Benchmark | Methodology | Metrics |
|---|---|---|
| text-spotting-on-icdar-2015 | TESTR | F-measure (%): Generic Lexicon 73.6; Strong Lexicon 85.2; Weak Lexicon 79.4 |
| text-spotting-on-inverse-text | TESTR | F-measure (%): Full Lexicon 41.6; No Lexicon 34.2 |
| text-spotting-on-scut-ctw1500 | TESTR | F-measure (%): Full Lexicon 81.5; No Lexicon 56.0 |
| text-spotting-on-total-text | TESTR | F-measure (%): Full Lexicon 83.9; No Lexicon 73.3 |
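
All metrics in the table are F-measures. For readers unfamiliar with the metric, the sketch below computes the standard F-measure as the harmonic mean of precision and recall. The function name and the input values are hypothetical and are not results from the paper; the benchmark-specific rules for deciding when a spotted word counts as correct are not shown.

```python
def f_measure(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (both in [0, 1])."""
    if precision + recall == 0:
        # No correct predictions at all: define the score as zero.
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical inputs for illustration only.
balanced = f_measure(0.5, 0.5)
empty = f_measure(0.0, 0.0)
```

Because the harmonic mean is dominated by the smaller of its two inputs, a method cannot reach a high F-measure by trading recall for precision or the other way around.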