3 months ago

Interactive Key-Value Memory-augmented Attention for Image Paragraph Captioning

Jinwen Tian, Min Yang, Xiang Ao, Chengming Li, Yu Li, Chunpu Xu

Abstract

Image paragraph captioning (IPC) aims to generate a fine-grained paragraph that describes the visual content of an image. Deep neural networks have made significant progress on this task, and the attention mechanism plays an essential role in them. However, conventional attention mechanisms tend to ignore past alignment information, which often results in repetitive or incomplete captions. In this paper, we propose an Interactive key-value Memory-augmented Attention model for image Paragraph captioning (IMAP). The model keeps track of the attention history (coverage information for salient objects) along with the update chain of the decoder state, and thereby avoids generating repetitive or incomplete image descriptions. In addition, we employ an adaptive attention mechanism to realize adaptive alignment from image regions to caption words: an image region can be mapped to an arbitrary number of caption words, and a caption word can attend to an arbitrary number of image regions. Extensive experiments on a benchmark dataset (i.e., Stanford) demonstrate the effectiveness of our IMAP model.
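To illustrate why tracking attention history can reduce repetition, here is a minimal sketch of a generic coverage-penalized attention step. It is not the IMAP key-value memory described in the abstract, whose details are not given on this page. The function names, the `penalty` parameter, and the toy keys and query are all assumptions made for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def coverage_attention_step(query, keys, coverage, penalty=1.0):
    """One decoding step of dot-product attention with a coverage penalty.

    query:    list of d floats (hypothetical decoder state)
    keys:     list of n region vectors, each with d floats
    coverage: list of n floats, the sum of attention weights so far
    penalty:  assumed scalar; larger values discourage revisiting regions
    """
    scores = [
        sum(q * k for q, k in zip(query, key)) - penalty * c
        for key, c in zip(keys, coverage)
    ]
    weights = softmax(scores)
    # Accumulate this step's weights into the attention history.
    new_coverage = [c + w for c, w in zip(coverage, weights)]
    return weights, new_coverage


def run_demo(steps=3, penalty=5.0):
    """Run a few steps on toy data and record the most-attended region."""
    keys = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
    query = [1.0, 0.9, 0.8]  # the query stays fixed to isolate the penalty
    coverage = [0.0, 0.0, 0.0]
    attended = []
    for _ in range(steps):
        weights, coverage = coverage_attention_step(query, keys, coverage, penalty)
        attended.append(weights.index(max(weights)))
    return attended, coverage


if __name__ == "__main__":
    print(run_demo(penalty=5.0))
    print(run_demo(penalty=0.0))
```

With `penalty=0.0` the same region wins at every step, because nothing records that it was already attended. With a positive penalty, the accumulated coverage lowers the score of previously attended regions, so attention can shift to other regions on later steps.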

Benchmarks

Benchmark | Methodology | Metrics
image-paragraph-captioning-on-image-paragraph | IMAP | BLEU-4: 10.29, CIDEr: 24.07, METEOR: 17.36
