Abstract
A central goal of machine learning is the development of systems that can solve many problems in as many data domains as possible. Current architectures, however, cannot be applied beyond a small set of stereotyped settings, as they bake in domain and task assumptions or scale poorly to large inputs or outputs. In this work, we propose Perceiver IO, a general-purpose architecture that handles data from arbitrary settings while scaling linearly with the size of inputs and outputs. Our model augments the Perceiver with a flexible querying mechanism that enables outputs of various sizes and semantics, doing away with the need for task-specific architecture engineering. The same architecture achieves strong results on tasks spanning natural language and visual understanding, multi-task and multi-modal reasoning, and StarCraft II. As highlights, Perceiver IO outperforms a Transformer-based BERT baseline on the GLUE language benchmark despite removing input tokenization, and achieves state-of-the-art performance on Sintel optical flow estimation with no explicit mechanisms for multiscale correspondence.
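The abstract's core pattern can be illustrated in code. The sketch below is a minimal illustration of the Perceiver IO idea as described above, not the authors' implementation: inputs are cross-attended into a small fixed-size latent array, the latents are processed with self-attention, and task-specific output queries are cross-attended against the latents to decode outputs of arbitrary size. All module sizes and names here are hypothetical choices for illustration.

```python
import torch
import torch.nn as nn

class PerceiverIOSketch(nn.Module):
    """Minimal sketch: encode (cross-attn) -> process (self-attn) -> decode (cross-attn)."""

    def __init__(self, dim=64, num_latents=32, depth=2, heads=4):
        super().__init__()
        # Learned latent array; its size is fixed regardless of input length,
        # which is what gives linear scaling in input/output size.
        self.latents = nn.Parameter(torch.randn(num_latents, dim))
        self.encode = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.process = nn.ModuleList(
            nn.MultiheadAttention(dim, heads, batch_first=True) for _ in range(depth)
        )
        self.decode = nn.MultiheadAttention(dim, heads, batch_first=True)

    def forward(self, inputs, output_queries):
        # inputs: (B, M, dim); output_queries: (B, O, dim).
        # O determines the output size and semantics, per the querying mechanism.
        b = inputs.size(0)
        z = self.latents.unsqueeze(0).expand(b, -1, -1)
        z, _ = self.encode(z, inputs, inputs)        # cross-attend inputs into latents
        for attn in self.process:
            z = z + attn(z, z, z)[0]                 # latent self-attention (residual)
        out, _ = self.decode(output_queries, z, z)   # decode with output queries
        return out

model = PerceiverIOSketch()
x = torch.randn(2, 100, 64)   # 100 input elements of width 64
q = torch.randn(2, 7, 64)     # request 7 output elements
y = model(x, q)
print(tuple(y.shape))  # (2, 7, 64): output size is set by the query, not the input
```

Because attention over the (possibly huge) inputs and outputs happens only in the two cross-attention steps, compute grows linearly with input and output size; the quadratic self-attention cost is paid only over the small latent array.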
Benchmarks
| Benchmark | Model | Metric |
|---|---|---|
| Optical Flow Estimation on KITTI 2015 | Perceiver IO | Average End-Point Error: 4.98 |
| Optical Flow Estimation on Sintel (clean) | Perceiver IO | Average End-Point Error: 1.81 |
| Optical Flow Estimation on Sintel (final) | Perceiver IO | Average End-Point Error: 2.42 |