Image Chat: Engaging Grounded Conversations

Kurt Shuster; Samuel Humeau; Antoine Bordes; Jason Weston

Abstract

To achieve the long-term goal of machines being able to engage humans in conversation, our models should captivate the interest of their speaking partners. Communication grounded in images, whereby a dialogue is conducted based on a given photo, is a setup naturally appealing to humans (Hu et al., 2014). In this work we study large-scale architectures and datasets for this goal. We test a set of neural architectures using state-of-the-art image and text representations, considering various ways to fuse the components. To test such models, we collect a dataset of grounded human-human conversations, where speakers are asked to play roles given a provided emotional mood or style, as the use of such traits is also a key factor in engagingness (Guo et al., 2019). Our dataset, Image-Chat, consists of 202k dialogues over 202k images using 215 possible style traits. Automatic metrics and human evaluations of engagingness show the efficacy of our approach; in particular, we obtain state-of-the-art performance on the existing IGC task, and our best performing model is almost on par with humans on the Image-Chat test set (preferred 47.7% of the time).

Code Repositories

facebookresearch/ParlAI (PyTorch; mentioned in GitHub)
Alenush/sirius-spring2021-image2chat (PyTorch; mentioned in GitHub)

Benchmarks

Benchmark: text-retrieval-on-image-chat
Methodology: TransResNet
Metrics: R@1: 50.3; R@5: 75.4; Sum(R@1,5): 125.7
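
The R@1 and R@5 figures above are recall-at-k retrieval scores: the percentage of test examples for which the correct response is ranked within the top k candidates. The following is a minimal sketch of how such scores could be computed, not the authors' evaluation code. The function name `recall_at_k` and the ranks in `example_ranks` are hypothetical and chosen only for illustration; the sketch assumes 1-based ranks, where rank 1 means the correct response was scored highest.

```python
from typing import Sequence


def recall_at_k(ranks: Sequence[int], k: int) -> float:
    """Return the percentage of queries whose correct candidate is ranked in the top k.

    `ranks` holds the 1-based rank of the correct candidate for each query.
    """
    if not ranks:
        raise ValueError("ranks must be non-empty")
    hits = sum(1 for r in ranks if r <= k)
    return 100.0 * hits / len(ranks)


# Hypothetical ranks of the correct response for eight queries (illustrative only).
example_ranks = [1, 3, 1, 7, 2, 1, 12, 5]

r_at_1 = recall_at_k(example_ranks, 1)
r_at_5 = recall_at_k(example_ranks, 5)

print(f"R@1: {r_at_1:.1f}")
print(f"R@5: {r_at_5:.1f}")
print(f"Sum(R@1,5): {r_at_1 + r_at_5:.1f}")
```

The summed metric is simply the two recall values added together, which matches the reported figures: 50.3 + 75.4 = 125.7.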
