8 months ago

Abstract

This paper takes an important step in bridging the performance gap between DETR and R-CNN for graphical object detection. Existing graphical object detection approaches have benefited from recent enhancements in CNN-based object detection methods, achieving remarkable progress. Recently, Transformer-based detectors have considerably boosted generic object detection performance, eliminating the need for hand-crafted features or post-processing steps such as Non-Maximum Suppression (NMS) by using object queries. However, the effectiveness of such enhanced transformer-based detection algorithms has yet to be verified for the problem of graphical object detection. Inspired by the latest advancements in DETR, we employ the existing detection transformer with a few modifications for graphical object detection. We modify object queries in different ways: using points, using anchor boxes, and adding positive and negative noise to the anchors to boost performance. These modifications allow for better handling of objects with varying sizes and aspect ratios, more robustness to small variations in object positions and sizes, and improved image discrimination between objects and non-objects. We evaluate our approach on four graphical datasets: PubTables, TableBank, NTable and PubLaynet. Upon integrating query modifications in DETR, we outperform prior works and achieve new state-of-the-art results with mAP of 96.9%, 95.7% and 99.3% on TableBank, PubLaynet and PubTables, respectively. The results from extensive ablations show that transformer-based methods are more effective for document analysis, analogous to other applications. We hope this study draws more attention to research on using detection transformers in document image analysis.
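The abstract does not spell out how positive and negative noise is added to the anchors, so the following is only a minimal sketch of one plausible scheme, loosely in the style of the contrastive denoising queries used by some DETR variants. The function names (`jitter_box`, `make_noised_queries`), the `(cx, cy, w, h)` box layout, and the thresholds `pos_scale` and `neg_scale` are all assumptions made for illustration, not the authors' implementation.

```python
import random


def jitter_box(box, lo, hi, rng):
    """Perturb a (cx, cy, w, h) box by a relative offset with magnitude in [lo, hi].

    Hypothetical helper: centre offsets are relative to half the box extent,
    size offsets are relative to the full extent.
    """
    cx, cy, w, h = box

    def offset(extent):
        magnitude = rng.uniform(lo, hi) * extent
        sign = rng.choice((-1.0, 1.0))
        return sign * magnitude

    new_cx = cx + offset(w / 2)
    new_cy = cy + offset(h / 2)
    # Clamp sizes so a large negative offset never yields a degenerate box.
    new_w = max(w + offset(w), 1e-4)
    new_h = max(h + offset(h), 1e-4)
    return (new_cx, new_cy, new_w, new_h)


def make_noised_queries(gt_box, pos_scale=0.4, neg_scale=1.0, seed=0):
    """Return one lightly noised (positive) and one heavily noised (negative) anchor.

    Assumption: positives use noise below pos_scale, negatives use noise
    between pos_scale and neg_scale.
    """
    rng = random.Random(seed)
    positive = jitter_box(gt_box, 0.0, pos_scale, rng)
    negative = jitter_box(gt_box, pos_scale, neg_scale, rng)
    return positive, negative


if __name__ == "__main__":
    table_box = (0.5, 0.5, 0.2, 0.4)  # normalised centre, width, height
    pos, neg = make_noised_queries(table_box, seed=1)
    print("positive anchor:", pos)
    print("negative anchor:", neg)
```

Under this assumed scheme, a training loop would ask the decoder to recover the ground-truth box from the positive anchor and to label the negative anchor as "no object", which is one way to encourage the object versus non-object discrimination the abstract mentions.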

Tahira Shehzadi Khurram Azeem Hashmi Didier Stricker Marcus Liwicki Muhammad Zeshan Afzal

Bridging the Performance Gap between DETR and R-CNN for Graphical Object Detection in Document Images
