Described Object Detection: Liberating Object Detection with Flexible Expressions
Xie Chi ; Zhang Zhao ; Wu Yixuan ; Zhu Feng ; Zhao Rui ; Liang Shuang

Abstract
Detecting objects based on language information is a popular task that includes Open-Vocabulary Object Detection (OVD) and Referring Expression Comprehension (REC). In this paper, we advance them to a more practical setting called Described Object Detection (DOD) by expanding category names to flexible language expressions for OVD and overcoming the limitation of REC only grounding the pre-existing object. We establish the research foundation for DOD by constructing a Description Detection Dataset ($D^3$). This dataset features flexible language expressions, whether short category names or long descriptions, and annotates all described objects on all images without omission. By evaluating previous SOTA methods on $D^3$, we find some troublemakers that fail current REC, OVD, and bi-functional methods. REC methods struggle with confidence scores, rejecting negative instances, and multi-target scenarios, while OVD methods face constraints with long and complex descriptions. Recent bi-functional methods also do not work well on DOD due to their separated training procedures and inference strategies for REC and OVD tasks. Building upon the aforementioned findings, we propose a baseline that largely improves REC methods by reconstructing the training data and introducing a binary classification sub-task, outperforming existing methods. Data and code are available at https://github.com/shikras/d-cube and related works are tracked in https://github.com/Charles-Xie/awesome-described-object-detection.
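The difference between the REC and DOD output contracts described in the abstract can be illustrated with a minimal sketch. This is not the authors' baseline or the d-cube API: the `Candidate` type, the `rec_style` and `dod_style` functions, and the `0.5` threshold are all hypothetical names and values chosen for illustration. It only shows that a REC-style method always returns exactly one box, while a DOD-style method may return zero, one, or many boxes for a description.

```python
from dataclasses import dataclass
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2), hypothetical format


@dataclass
class Candidate:
    box: Box
    score: float  # hypothetical confidence that the box matches the description


def rec_style(candidates: List[Candidate]) -> List[Box]:
    """REC-style output: always the single top-scoring box, even if nothing matches."""
    if not candidates:
        return []
    return [max(candidates, key=lambda c: c.score).box]


def dod_style(candidates: List[Candidate], threshold: float = 0.5) -> List[Box]:
    """DOD-style output: every box above a (hypothetical) threshold, possibly none."""
    return [c.box for c in candidates if c.score >= threshold]


# An image where the description matches nothing (all scores are low).
negatives = [Candidate((0, 0, 10, 10), 0.12), Candidate((5, 5, 20, 20), 0.08)]
# An image where the description matches two objects.
multi = [
    Candidate((0, 0, 10, 10), 0.91),
    Candidate((30, 30, 50, 50), 0.84),
    Candidate((5, 5, 20, 20), 0.10),
]

print(rec_style(negatives))  # one box, although no object is described
print(dod_style(negatives))  # no boxes
print(rec_style(multi))      # one box, missing the second target
print(dod_style(multi))      # both described objects
```

Under these assumptions, the sketch mirrors two of the REC failure modes the abstract names: rejecting negative instances and handling multi-target scenarios both depend on having usable confidence scores.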
Benchmarks
| Benchmark | Methodology | Metrics |
|---|---|---|
| described-object-detection-on-description | OFA-DOD-base | Intra-scenario ABS mAP: 15.4; Intra-scenario FULL mAP: 21.6; Intra-scenario PRES mAP: 23.7 |