Scene Text Recognition on WOST
Metrics
1:1 Accuracy
Results
Performance results of various models on this benchmark
| Model | 1:1 Accuracy | Paper Title | Repository |
|---|---|---|---|
| CLIP4STR-H (DFN-5B) | 90.9 | CLIP4STR: A Simple Baseline for Scene Text Recognition with Pre-trained Vision-Language Model | |
| CLIP4STR-L (DataComp-1B) | 90.6 | CLIP4STR: A Simple Baseline for Scene Text Recognition with Pre-trained Vision-Language Model | |
| CLIP4STR-L | 88.8 | CLIP4STR: A Simple Baseline for Scene Text Recognition with Pre-trained Vision-Language Model | |
| CLIP4STR-B | 87.0 | CLIP4STR: A Simple Baseline for Scene Text Recognition with Pre-trained Vision-Language Model | |
| CCD-ViT-Base | 86.0 | Self-supervised Character-to-Character Distillation for Text Recognition | |