Temporal FiLM: Capturing Long-Range Sequence Dependencies with Feature-Wise Modulations
Sawyer Birnbaum; Volodymyr Kuleshov; Zayd Enam; Pang Wei Koh; Stefano Ermon

Abstract
Learning representations that accurately capture long-range dependencies in sequential inputs -- including text, audio, and genomic data -- is a key problem in deep learning. Feed-forward convolutional models capture only feature interactions within finite receptive fields, while recurrent architectures can be slow and difficult to train due to vanishing gradients. Here, we propose Temporal Feature-Wise Linear Modulation (TFiLM) -- a novel architectural component inspired by adaptive batch normalization and its extensions -- that uses a recurrent neural network to alter the activations of a convolutional model. This approach expands the receptive field of convolutional sequence models with minimal computational overhead. Empirically, we find that TFiLM significantly improves the learning speed and accuracy of feed-forward neural networks on a range of generative and discriminative learning tasks, including text classification and audio super-resolution.
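The mechanism the abstract describes -- a recurrent network that produces feature-wise modulations for blocks of a convolutional model's activations -- can be sketched roughly as follows. This is a minimal NumPy illustration, not the paper's implementation: it assumes a simple tanh RNN step in place of the LSTM used in the paper, uses max-pooling over blocks, and applies only a multiplicative scale (the helper names `tfilm_block` and `rnn_step` are hypothetical).

```python
import numpy as np

def tfilm_block(x, block_size, rnn_step, h0):
    """Apply temporal feature-wise modulation to conv activations.

    x:          (T, C) activations of a convolutional layer
    block_size: number of time steps pooled into one block summary
    rnn_step:   function (input_vec, hidden) -> new hidden of size C
    h0:         (C,) initial hidden state
    """
    T, C = x.shape
    n_blocks = T // block_size
    # Split the sequence into blocks and pool each block to a summary.
    blocks = x[:n_blocks * block_size].reshape(n_blocks, block_size, C)
    pooled = blocks.max(axis=1)                      # (n_blocks, C)
    # Run the RNN over the block summaries; its hidden state at each
    # block serves as a per-feature scaling for that block.
    h, scales = h0, []
    for b in range(n_blocks):
        h = rnn_step(pooled[b], h)
        scales.append(h)
    scales = np.stack(scales)                        # (n_blocks, C)
    # Feature-wise modulation: broadcast each block's scale over time.
    out = blocks * scales[:, None, :]
    return out.reshape(n_blocks * block_size, C)
```

Because each block's modulation depends (through the RNN) on all earlier blocks, the effective receptive field grows with sequence length even though the convolutional layer itself sees only a fixed window.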
Benchmarks
| Benchmark | Methodology | Metrics |
|---|---|---|
| audio-super-resolution-on-piano-1 | U-Net + TFiLM | Log-Spectral Distance: 2 |
| audio-super-resolution-on-voice-bank-corpus-1 | U-Net + TFiLM | Log-Spectral Distance: 2.5 |
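The benchmark metric above, log-spectral distance (LSD, lower is better), is commonly computed as the frame-wise root-mean-square difference between log-power spectra of the reference and reconstructed waveforms. A rough NumPy sketch of that common form (exact frame sizes, windows, and log base vary across papers; the function name and defaults here are assumptions):

```python
import numpy as np

def log_spectral_distance(ref, est, frame=2048, hop=512, eps=1e-8):
    """Mean log-spectral distance between two equal-length waveforms."""
    def log_power_spec(x):
        n = (len(x) - frame) // hop + 1
        frames = np.stack([x[i * hop:i * hop + frame] for i in range(n)])
        spec = np.abs(np.fft.rfft(frames * np.hanning(frame), axis=1)) ** 2
        return np.log10(spec + eps)
    X, Y = log_power_spec(ref), log_power_spec(est)
    # RMS over frequency bins, then mean over frames.
    return np.mean(np.sqrt(np.mean((X - Y) ** 2, axis=1)))
```

Identical signals give an LSD of zero; larger values indicate greater spectral distortion in the super-resolved audio.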