CounTR: Transformer-based Generalised Visual Counting

Liu Chang ; Zhong Yujie ; Zisserman Andrew ; Xie Weidi

Abstract

In this paper, we consider the problem of generalised visual object counting, with the goal of developing a computational model for counting the number of objects from arbitrary semantic categories, using an arbitrary number of "exemplars", i.e. zero-shot or few-shot counting. To this end, we make the following four contributions: (1) We introduce a novel transformer-based architecture for generalised visual object counting, termed Counting Transformer (CounTR), which explicitly captures the similarity among image patches, or between image patches and the given "exemplars", using the attention mechanism; (2) We adopt a two-stage training regime that first pre-trains the model with self-supervised learning and then applies supervised fine-tuning; (3) We propose a simple, scalable pipeline for synthesizing training images that contain a large number of instances, or instances from different semantic categories, explicitly forcing the model to make use of the given "exemplars"; (4) We conduct thorough ablation studies on a large-scale counting benchmark, e.g. FSC-147, and demonstrate state-of-the-art performance in both zero-shot and few-shot settings.
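As a rough illustration of contribution (1), the sketch below shows how scaled dot-product attention can score the similarity between image-patch features and exemplar features. It is a minimal, generic example and not the CounTR implementation: the function names `softmax` and `cross_attention`, the toy 2-dimensional features, and the omission of learned projections are all assumptions made for illustration.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(patches, exemplars):
    """Hypothetical helper: each patch feature (query) attends to the
    exemplar features (keys and values) via scaled dot-product attention.

    Returns the attended features and the attention weights.
    A real model would first apply learned query/key/value projections,
    which are left out here to keep the sketch short.
    """
    d = len(patches[0])
    scale = 1.0 / math.sqrt(d)
    attended, weights = [], []
    for q in patches:
        # Similarity of this patch to every exemplar.
        scores = [scale * sum(qi * ki for qi, ki in zip(q, k)) for k in exemplars]
        w = softmax(scores)
        weights.append(w)
        # Weighted sum of exemplar features.
        attended.append(
            [sum(wj * k[i] for wj, k in zip(w, exemplars)) for i in range(d)]
        )
    return attended, weights


# Toy features (assumed values): three patches, two exemplars.
patches = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
exemplars = [[4.0, 0.0], [0.0, 4.0]]
attended, weights = cross_attention(patches, exemplars)
```

In this toy setup, a patch whose feature aligns with one exemplar places most of its attention weight on that exemplar, while a patch equally similar to both splits its weight evenly.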

