IA-RED$^2$: Interpretability-Aware Redundancy Reduction for Vision Transformers

Bowen Pan, Rameswar Panda, Yifan Jiang, Zhangyang Wang, Rogerio Feris, Aude Oliva

Abstract

The self-attention-based transformer has recently become the leading backbone in computer vision. Despite the impressive success of transformers across a variety of vision tasks, they still suffer from heavy computation and intensive memory costs. To address this limitation, this paper presents an Interpretability-Aware REDundancy REDuction framework (IA-RED$^2$). We start by observing a large amount of redundant computation, mainly spent on uncorrelated input patches, and then introduce an interpretable module to dynamically and gracefully drop these redundant patches. This framework is then extended to a hierarchical structure, where uncorrelated tokens are gradually removed at different stages, resulting in a considerable reduction of computational cost. We include extensive experiments on both image and video tasks, where our method delivers up to a 1.4x speed-up for state-of-the-art models such as DeiT and TimeSformer while sacrificing less than 0.7% accuracy. More importantly, unlike other acceleration approaches, our method is inherently interpretable, with substantial visual evidence, making the vision transformer both lighter and closer to a human-understandable architecture. We demonstrate, with both qualitative and quantitative results, that the interpretability that naturally emerges in our framework can outperform the raw attention learned by the original vision transformer as well as attention maps produced by off-the-shelf interpretation methods. Project Page: http://people.csail.mit.edu/bpan/ia-red/.
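The mechanism described in the abstract, a lightweight interpretable module that scores each patch token and drops the uninformative ones before later transformer blocks, can be sketched roughly as follows. This is a minimal illustration rather than the authors' implementation: the PatchScorer module, the drop_redundant_tokens helper, and the fixed keep ratios are assumptions made for clarity, and the actual framework learns an input-dependent dropping policy.

```python
import torch
import torch.nn as nn


class PatchScorer(nn.Module):
    """Hypothetical lightweight policy head that scores how informative each
    patch token is (a stand-in for the paper's interpretable module)."""

    def __init__(self, dim: int):
        super().__init__()
        self.score = nn.Sequential(nn.LayerNorm(dim), nn.Linear(dim, 1))

    def forward(self, tokens: torch.Tensor) -> torch.Tensor:
        # tokens: (batch, num_patches, dim) -> informativeness scores in [0, 1]
        return torch.sigmoid(self.score(tokens)).squeeze(-1)


def drop_redundant_tokens(tokens: torch.Tensor, scores: torch.Tensor,
                          keep_ratio: float) -> torch.Tensor:
    """Keep only the highest-scoring fraction of patch tokens per image.
    The real framework learns which tokens to drop; this hard top-k
    selection is only an inference-time illustration."""
    batch, num_patches, dim = tokens.shape
    num_keep = max(1, int(num_patches * keep_ratio))
    keep_idx = scores.topk(num_keep, dim=1).indices          # (batch, num_keep)
    keep_idx = keep_idx.unsqueeze(-1).expand(-1, -1, dim)    # align with feature dim
    return tokens.gather(1, keep_idx)


if __name__ == "__main__":
    # Hierarchical usage sketch: prune progressively between groups of blocks.
    tokens = torch.randn(2, 196, 384)         # e.g. DeiT-S patch tokens
    scorer = PatchScorer(384)
    for keep_ratio in (0.7, 0.7):             # assumed per-stage schedule
        tokens = drop_redundant_tokens(tokens, scorer(tokens), keep_ratio)
        # ... the transformer blocks of the next stage would run here ...
    print(tokens.shape)                        # fewer tokens => fewer FLOPs later
```

Because every later block only sees the surviving tokens, both the quadratic attention cost and the linear MLP cost shrink, and the per-patch scores double as a visual explanation of which regions the model considered relevant.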

Benchmarks

Benchmark                                    Methodology    GFLOPs    Top 1 Accuracy
efficient-vits-on-imagenet-1k-with-deit-s    IA-RED$^2$     3.2       79.1
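As a rough sanity check on how token dropping translates into the GFLOPs figure above, the back-of-the-envelope below counts per-block multiply-accumulates for a DeiT-S-like model (12 blocks, embedding dimension 384, 197 tokens, MLP ratio 4) with and without hierarchical pruning. The 0.7 keep ratio applied after every four blocks is an assumption for illustration only, not the paper's learned, input-dependent policy.

```python
def block_macs(n_tokens: int, dim: int = 384, mlp_ratio: int = 4) -> float:
    """Approximate multiply-accumulates of one transformer block:
    QKV + output projections, the two attention matrix products, and the MLP."""
    proj = 4 * n_tokens * dim * dim               # QKV (3x) + output projection (1x)
    attn = 2 * n_tokens * n_tokens * dim          # Q @ K^T and attn @ V
    mlp = 2 * n_tokens * dim * (mlp_ratio * dim)  # two MLP layers
    return proj + attn + mlp


full = 12 * block_macs(197)                       # keep all tokens in all 12 blocks
# assumed hierarchical schedule: drop ~30% of tokens after blocks 4 and 8
pruned = 4 * block_macs(197) + 4 * block_macs(138) + 4 * block_macs(97)

print(f"full  : {full / 1e9:.1f} GMACs")          # ~4.5
print(f"pruned: {pruned / 1e9:.1f} GMACs")        # ~3.3, near the 3.2 GFLOPs reported above
print(f"speed-up ~{full / pruned:.2f}x")          # ~1.4x
```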
