
3 months ago

AOG-LSTM: An adaptive attention neural network for visual storytelling

Wei Wu, Rui Xie, Hui Wang, Yong Jiang, Hai-Tao Zheng, Wei Wang, Chia-Hao Chang, Jiacheng Yang, Hanqing Liu

Abstract

Visual storytelling, the task of generating a related story for a given image sequence, has received significant attention. However, using general RNNs (such as LSTM and GRU) as the decoder limits model performance on this task, because they cannot differentiate between different types of information representations. In addition, optimizing the probabilities of subsequent words conditioned on the previous ground-truth sequences can cause error accumulation during inference. Moreover, the existing method of alleviating error accumulation by replacing reference words does not take into account the different effects of each word. To address these problems, we propose a modified neural network named AOG-LSTM and a modified training strategy named ARS. AOG-LSTM can adaptively pay appropriate attention to the different information representations within it when predicting different words. During training, ARS replaces some words in the reference sentences with model predictions, as the existing method does. However, we use a selection network and a selection strategy to choose more appropriate words for replacement, which better improves the model. Experiments on the VIST dataset demonstrate that our model outperforms several strong baselines on the most commonly used metrics.
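
The reference-word replacement idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the paper's ARS: the abstract does not specify the selection network or the selection strategy, so the sketch below shows only the simpler baseline of swapping reference words for model predictions uniformly at random. The function name `replace_reference_words`, the `replace_prob` parameter, and the example tokens are all hypothetical.

```python
import random


def replace_reference_words(reference, predictions, replace_prob, rng=None):
    """Mix reference tokens with model predictions for decoder input.

    Hypothetical baseline sketch: each position is replaced independently
    with probability `replace_prob`. ARS, as described in the abstract,
    would instead choose positions with a learned selection network.
    """
    if len(reference) != len(predictions):
        raise ValueError("reference and predictions must have equal length")
    if not 0.0 <= replace_prob <= 1.0:
        raise ValueError("replace_prob must be in [0, 1]")
    if rng is None:
        rng = random.Random()

    mixed, replaced = [], []
    for ref_tok, pred_tok in zip(reference, predictions):
        # rng.random() is in [0, 1), so 0.0 never replaces and 1.0 always does.
        use_pred = rng.random() < replace_prob
        mixed.append(pred_tok if use_pred else ref_tok)
        replaced.append(use_pred)
    return mixed, replaced


if __name__ == "__main__":
    ref = ["we", "went", "to", "the", "beach"]   # made-up reference tokens
    pred = ["i", "went", "to", "a", "park"]      # made-up model predictions
    mixed, flags = replace_reference_words(ref, pred, 0.25, random.Random(0))
    print(mixed, flags)
```

Feeding the mixed sequence to the decoder during training exposes the model to its own predictions, which is the general motivation for reducing error accumulation at inference time. Treating every position as equally eligible is precisely the limitation the abstract attributes to the existing method.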

Benchmarks

Benchmark: visual-storytelling-on-vist
Methodology: AOG + ARS
Metrics:
BLEU-1: 69
BLEU-2: 44
BLEU-3: 23.9
BLEU-4: 12.9
CIDEr: 12.0
METEOR: 36.0
ROUGE-L: 30.1
