
Deep Fried Convnets

Zichao Yang; Marcin Moczulski; Misha Denil; Nando de Freitas; Alex Smola; Le Song; Ziyu Wang


Abstract

The fully connected layers of a deep convolutional neural network typically contain over 90% of the network parameters and consume the majority of the memory required to store the network. Reducing the number of parameters while preserving essentially the same predictive performance is critically important for operating deep neural networks in memory-constrained environments such as GPUs or embedded devices. In this paper we show how kernel methods, in particular a single Fastfood layer, can be used to replace all fully connected layers in a deep convolutional neural network. This novel Fastfood layer is also end-to-end trainable in conjunction with convolutional layers, allowing us to combine them into a new architecture, named deep fried convolutional networks, which substantially reduces the memory footprint of convolutional networks trained on MNIST and ImageNet with no drop in predictive performance.
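
The key idea is that the dense weight matrix of a fully connected layer is replaced by a Fastfood transform, a structured product of diagonal, permutation and Walsh-Hadamard matrices of the form V = S H G Π H B, which stores only O(d) parameters and applies in O(d log d) time instead of the O(d²) cost of a dense matrix. The sketch below is a minimal PyTorch illustration of an adaptive ("deep fried") Fastfood block under that reading; the names fwht and FastfoodLayer are ours for illustration, not taken from the paper or the linked repository, and details such as initialisation and the handling of the kernel bandwidth are simplified.

```python
import torch
import torch.nn as nn


def fwht(x):
    """Fast Walsh-Hadamard transform over the last dimension (its length
    must be a power of two); unnormalised, O(d log d) per vector."""
    shape = x.shape
    d = shape[-1]
    x = x.reshape(-1, d)
    h = 1
    while h < d:
        # Butterfly stage: pair elements h apart within blocks of size 2h.
        x = x.view(-1, d // (2 * h), 2, h)
        a, b = x[:, :, 0, :], x[:, :, 1, :]
        x = torch.stack((a + b, a - b), dim=2).reshape(-1, d)
        h *= 2
    return x.reshape(shape)


class FastfoodLayer(nn.Module):
    """Sketch of an adaptive Fastfood block computing S H G Π H B x.
    B, G, S are learned diagonal matrices stored as vectors, H is the
    Walsh-Hadamard transform and Π a fixed random permutation."""

    def __init__(self, d):
        super().__init__()
        assert d > 0 and d & (d - 1) == 0, "d must be a power of two"
        self.d = d
        # Learned diagonals: B initialised to random signs, G to Gaussian noise.
        self.B = nn.Parameter(torch.randint(0, 2, (d,)).float() * 2 - 1)
        self.G = nn.Parameter(torch.randn(d))
        self.S = nn.Parameter(torch.ones(d))
        self.register_buffer("perm", torch.randperm(d))

    def forward(self, x):
        # x: (batch, d). The 1/sqrt(d) factor keeps the output scale stable;
        # any kernel bandwidth is absorbed into the learned S.
        x = fwht(x * self.B)      # H B x
        x = x[:, self.perm]       # Π H B x
        x = fwht(x * self.G)      # H G Π H B x
        return self.S * x / (self.d ** 0.5)


# Usage: replace a 1024 x 1024 dense layer with ~3k parameters instead of ~1M.
layer = FastfoodLayer(1024)
y = layer(torch.randn(32, 1024))   # y: (32, 1024)
```

In the adaptive variant the diagonal matrices S, G and B are trained jointly with the convolutional layers, while the permutation Π stays fixed; several such blocks can be stacked or their outputs concatenated to reach the desired output dimension.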

Code Repositories

v0lta/wavelet-network-compression (PyTorch, mentioned on GitHub)

Benchmarks

Benchmark: image-classification-on-mnist
Methodology: Deep Fried Convnets
Metric: Percentage error 0.7
