TUNet: A Block-online Bandwidth Extension Model based on Transformers and Self-supervised Pretraining
Nguyen Viet-Anh; Nguyen Anh H. T.; Khong Andy W. H.

Abstract
We introduce a block-online variant of the temporal feature-wise linear modulation (TFiLM) model to achieve bandwidth extension. The proposed architecture simplifies the UNet backbone of the TFiLM to reduce inference time and employs an efficient transformer at the bottleneck to alleviate performance degradation. We also utilize self-supervised pretraining and data augmentation to enhance the quality of bandwidth-extended signals and reduce the sensitivity with respect to downsampling methods. Experimental results on the VCTK dataset show that the proposed method outperforms several recent baselines in both intrusive and non-intrusive metrics. Pretraining and filter augmentation also help stabilize and enhance the overall performance.
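The core mechanism referenced in the abstract, temporal feature-wise linear modulation (TFiLM), splits a feature sequence into temporal blocks, pools each block into a summary vector, and derives a per-block scale and shift that modulate every frame in that block. The sketch below illustrates that idea in plain NumPy; the modulation weights (`w_gamma`, `w_beta`) are random stand-ins for the learned network in the actual model (the paper's implementation, shapes, and pooling choices may differ).

```python
import numpy as np

def tfilm_block(x, block_size, rng):
    """Minimal TFiLM-style modulation sketch (not the paper's implementation).

    x: feature map of shape (time, channels). The sequence is split into
    blocks of `block_size` frames; each block is max-pooled over time, and
    a stand-in linear map produces per-block scale/shift (gamma, beta)
    that modulate every frame in that block.
    """
    t, c = x.shape
    n_blocks = t // block_size
    blocks = x[: n_blocks * block_size].reshape(n_blocks, block_size, c)

    pooled = blocks.max(axis=1)              # (n_blocks, c) summary per block
    # Random stand-in for the learned modulation network:
    w_gamma = rng.standard_normal((c, c)) * 0.1
    w_beta = rng.standard_normal((c, c)) * 0.1
    gamma = pooled @ w_gamma                 # per-block scale
    beta = pooled @ w_beta                   # per-block shift

    # Broadcast each block's (gamma, beta) over all frames in that block.
    out = blocks * gamma[:, None, :] + beta[:, None, :]
    return out.reshape(n_blocks * block_size, c)

rng = np.random.default_rng(0)
x = rng.standard_normal((32, 8))             # 32 frames, 8 channels
y = tfilm_block(x, block_size=8, rng=rng)
print(y.shape)                               # (32, 8)
```

In a block-online setting, modulation for each block depends only on frames already seen, which is what makes the variant suitable for streaming inference.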
Benchmarks
| Benchmark | Methodology | Metrics |
|---|---|---|
| audio-super-resolution-on-vctk-multi-speaker-1 | TUNet + MSM pre-training | Log-Spectral Distance: 1.28 |
| audio-super-resolution-on-vctk-multi-speaker-1 | TUNet | Log-Spectral Distance: 1.36 |