Reflective Decoding Network for Image Captioning

Lei Ke; Wenjie Pei; Ruiyu Li; Xiaoyong Shen; Yu-Wing Tai

Abstract

State-of-the-art image captioning methods mostly focus on improving visual features, while less attention has been paid to exploiting the inherent properties of language to boost captioning performance. In this paper, we show that vocabulary coherence between words and the syntactic paradigm of sentences are also important for generating high-quality image captions. Following the conventional encoder-decoder framework, we propose the Reflective Decoding Network (RDN) for image captioning, which enhances both the long-sequence dependency and the position perception of words in a caption decoder. Our model learns to collaboratively attend to both visual and textual features while perceiving each word's relative position in the sentence, so as to maximize the information delivered in the generated caption. We evaluate the effectiveness of our RDN on the COCO image captioning dataset and achieve superior performance over previous methods. Further experiments reveal that our approach is particularly advantageous for hard cases, where the scenes are complex and difficult to describe in captions.
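To make the decoding idea concrete, below is a minimal PyTorch sketch of a single decoder step that attends collaboratively to image region features and to the decoder's own textual history, and additionally predicts a relative-position score for the current word. It is an illustrative simplification derived only from the abstract: the class name, layer sizes, and the exact attention formulations are assumptions, not the authors' released RDN implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ReflectiveStyleDecoderStep(nn.Module):
    """Illustrative decoder step: visual attention over region features,
    "reflective" attention over past decoder hidden states, and a
    relative-position head. Hypothetical simplification, not the RDN release."""

    def __init__(self, vocab_size, emb_dim=512, hid_dim=512, feat_dim=2048):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.lstm = nn.LSTMCell(emb_dim + feat_dim, hid_dim)
        self.v_att = nn.Linear(feat_dim + hid_dim, 1)   # visual attention scores
        self.t_att = nn.Linear(hid_dim * 2, 1)          # textual (reflective) attention scores
        self.pos_head = nn.Linear(hid_dim, 1)           # relative position in [0, 1]
        self.out = nn.Linear(hid_dim * 2, vocab_size)

    def forward(self, word_ids, regions, state, past_h):
        # word_ids: (B,), regions: (B, R, feat_dim), past_h: (B, T, hid_dim) with T >= 1
        h, c = state
        # attend over image regions, conditioned on the previous hidden state
        a_v = self.v_att(torch.cat([regions, h.unsqueeze(1).expand(-1, regions.size(1), -1)], dim=-1))
        v_ctx = (F.softmax(a_v, dim=1) * regions).sum(dim=1)
        # advance the language LSTM with the current word embedding and visual context
        h, c = self.lstm(torch.cat([self.embed(word_ids), v_ctx], dim=-1), (h, c))
        # reflective attention over the decoder's own history of hidden states
        a_t = self.t_att(torch.cat([past_h, h.unsqueeze(1).expand(-1, past_h.size(1), -1)], dim=-1))
        t_ctx = (F.softmax(a_t, dim=1) * past_h).sum(dim=1)
        logits = self.out(torch.cat([h, t_ctx], dim=-1))   # next-word distribution
        rel_pos = torch.sigmoid(self.pos_head(h))          # perceived relative position of the word
        return logits, rel_pos, (h, c)
```

In such a sketch, the new hidden state would be appended to past_h after every step, and the relative-position prediction could be supervised with the word's true index divided by the sentence length, which is one plausible reading of the position-perception objective described in the abstract.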

Benchmarks

image-captioning-on-coco (RDN):
  CIDEr: 125.2
image-captioning-on-coco-captions (RDN):
  BLEU-1: 80.2
  BLEU-4: 37.3
  CIDEr: 125.2
  METEOR: 28.1
  ROUGE-L: 57.4
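For reference, the metrics above (BLEU, METEOR, ROUGE-L, CIDEr) are typically computed with the COCO caption evaluation toolkit. The snippet below is a small usage sketch assuming the pycocoevalcap package; captions are expected to be pre-tokenized, METEOR additionally requires a Java runtime, and the toy inputs here are placeholders rather than real model outputs.

```python
from pycocoevalcap.bleu.bleu import Bleu
from pycocoevalcap.meteor.meteor import Meteor
from pycocoevalcap.rouge.rouge import Rouge
from pycocoevalcap.cider.cider import Cider

# Both dicts map an image id to a list of tokenized caption strings.
# Real evaluations use the full COCO test split with 5 references per image.
gts = {1: ["a man riding a horse on the beach"]}       # ground-truth references
res = {1: ["a person rides a horse near the ocean"]}   # generated captions

for name, scorer in [("BLEU", Bleu(4)), ("METEOR", Meteor()),
                     ("ROUGE-L", Rouge()), ("CIDEr", Cider())]:
    score, _ = scorer.compute_score(gts, res)
    print(name, score)  # Bleu(4) returns a list of BLEU-1..BLEU-4 scores
```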
