Stacked Attention Networks for Image Question Answering

Zichao Yang; Xiaodong He; Jianfeng Gao; Li Deng; Alex Smola

Abstract

This paper presents stacked attention networks (SANs) that learn to answer natural language questions from images. SANs use the semantic representation of a question as a query to search for the regions in an image that are related to the answer. We argue that image question answering (QA) often requires multiple steps of reasoning. We therefore develop a multiple-layer SAN that queries an image multiple times to infer the answer progressively. Experiments conducted on four image QA data sets demonstrate that the proposed SANs significantly outperform previous state-of-the-art approaches. Visualization of the attention layers illustrates how the SAN locates, layer by layer, the relevant visual clues that lead to the answer.
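The multi-step querying described in the abstract can be illustrated with a small sketch. This is not the authors' implementation. It assumes additive (tanh) attention scoring, region and question vectors of the same dimension, no bias terms, and randomly initialized weights. All names (`attention_layer`, `stacked_attention`, `random_layer`, `W_i`, `W_q`, `w_p`) are hypothetical and chosen only for illustration. Each layer scores every image region against the current query, forms an attention-weighted sum of the regions, and adds it to the query so the next layer searches with a refined query.

```python
import math
import random


def matvec(W, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_layer(regions, query, W_i, W_q, w_p):
    """One attention hop (assumed additive scoring, biases omitted).

    Returns the attention weights over regions and the refined query.
    """
    q_proj = matvec(W_q, query)
    scores = []
    for v in regions:
        v_proj = matvec(W_i, v)
        hidden = [math.tanh(a + b) for a, b in zip(v_proj, q_proj)]
        scores.append(sum(w * h for w, h in zip(w_p, hidden)))
    weights = softmax(scores)
    # Attention-weighted sum of the region vectors.
    attended = [
        sum(p * v[d] for p, v in zip(weights, regions))
        for d in range(len(query))
    ]
    # Refined query: attended image summary plus the previous query.
    refined = [a + q for a, q in zip(attended, query)]
    return weights, refined


def stacked_attention(regions, question, layers):
    """Query the image once per layer, refining the query each time."""
    query = question
    all_weights = []
    for W_i, W_q, w_p in layers:
        weights, query = attention_layer(regions, query, W_i, W_q, w_p)
        all_weights.append(weights)
    return all_weights, query


def random_layer(dim, hidden, rng):
    """Hypothetical helper: untrained random weights for one layer."""
    W_i = [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(hidden)]
    W_q = [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(hidden)]
    w_p = [rng.uniform(-1, 1) for _ in range(hidden)]
    return W_i, W_q, w_p


rng = random.Random(0)
dim, hidden, num_regions = 4, 3, 5
regions = [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(num_regions)]
question = [rng.uniform(-1, 1) for _ in range(dim)]
layers = [random_layer(dim, hidden, rng) for _ in range(2)]

all_weights, final_query = stacked_attention(regions, question, layers)
for depth, weights in enumerate(all_weights, start=1):
    print(f"layer {depth} attention:", [round(p, 3) for p in weights])
```

With trained weights, the per-layer attention distributions are what a visualization of the attention layers would display. Here the weights are random, so the printed distributions only demonstrate the data flow, and the final query would be passed to an answer classifier that is not shown.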

Code Repositories

Repositories mentioned in GitHub (framework in parentheses where listed):

zcyang/imageqa-san
chirag26495/DAN_VQA (pytorch)
mokhalid-dev/Attention-based-VQA-model (pytorch)
Cold-Winter/vqs (caffe2)
yanxinyan1/yxy (pytorch)
abhi-iyer/visual-question-answering (pytorch)
SatyamGaba/vqa (pytorch)
SatyamGaba/visual_question_answering (pytorch)
snagiri/ECE285_Jarvis_ProjectA (pytorch)
TingAnChien/san-vqa-tensorflow (tf)
rs9000/VisualReasoning_MMnet (pytorch)
jiayi-wei/vqa-tf2 (tf)
abhshkdz/neural-vqa-attention (pytorch)

Benchmarks

Benchmark                                      Methodology   Metrics
visual-question-answering-on-coco-visual-4     SAN           Percentage correct: 58.9
visual-question-answering-on-vqa-v1-test-std   SAN (VGG)     Accuracy: 58.9
