4 months ago

Computer Vision

Image Understanding

Convolutional Neural Network

Method/Architecture


Ruiyu Li, Makarand Tapaswi, Renjie Liao, Jiaya Jia, Raquel Urtasun

Abstract

We address the problem of recognizing situations in images. Given an image, the task is to predict the most salient verb (action), and fill its semantic roles such as who is performing the action, what is the source and target of the action, etc. Different verbs have different roles (e.g. attacking has weapon), and each role can take on many possible values (nouns). We propose a model based on Graph Neural Networks that allows us to efficiently capture joint dependencies between roles using neural networks defined on a graph. Experiments with different graph connectivities show that our approach that propagates information between roles significantly outperforms existing work, as well as multiple baselines. We obtain roughly 3-5% improvement over previous work in predicting the full situation. We also provide a thorough qualitative analysis of our model and influence of different roles in the verbs.
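
The idea of propagating information between roles on a graph can be illustrated with a minimal sketch. This is not the authors' model: the role names other than weapon (`agent`, `victim`), the two-dimensional states, and the mean-then-tanh update are assumptions chosen only to show how each role's state comes to depend on the other roles after a round of message passing. A real model would use learned weights and image features.

```python
import math


def fully_connected(roles):
    """Build an adjacency map in which every role is linked to every other role."""
    return {r: [o for o in roles if o != r] for r in roles}


def propagate(states, edges, steps=1):
    """Run `steps` rounds of message passing over role nodes.

    states: dict mapping role name -> list of floats (hidden state)
    edges:  dict mapping role name -> list of neighbouring role names
    """
    for _ in range(steps):
        new_states = {}
        for role, h in states.items():
            neighbours = edges.get(role, [])
            if neighbours:
                # Message: element-wise mean of the neighbours' current states.
                msg = [
                    sum(states[n][i] for n in neighbours) / len(neighbours)
                    for i in range(len(h))
                ]
            else:
                # An isolated node receives no information from other roles.
                msg = [0.0] * len(h)
            # Simplified update (an assumption, no learned parameters):
            # squash the sum of the node's own state and the incoming message.
            new_states[role] = [math.tanh(a + b) for a, b in zip(h, msg)]
        states = new_states
    return states


# Hypothetical roles for the verb "attacking" with toy initial states.
roles = ["agent", "weapon", "victim"]
init = {"agent": [1.0, 0.0], "weapon": [0.0, 1.0], "victim": [0.0, 0.0]}

connected = propagate(init, fully_connected(roles), steps=1)
isolated = propagate(init, {r: [] for r in roles}, steps=1)
```

With the fully connected graph, the `victim` state becomes non-zero after one step because it receives messages from `agent` and `weapon`; with no edges it stays at zero. Swapping the adjacency map is how different graph connectivities could be compared in a sketch like this.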

