NeuTTS-Air: A Lightweight and Efficient Voice Cloning Model

1. Tutorial Introduction

NeuTTS-Air is an end-to-end text-to-speech (TTS) model released by Neuphonic in October 2025. Built on a 0.5B-parameter Qwen LLM backbone and the NeuCodec audio codec, it demonstrates few-shot learning capabilities for on-device deployment and instant voice cloning. System evaluation shows that NeuTTS-Air reaches state-of-the-art (SOTA) results among open-source models, especially in ultra-realistic synthesis and real-time inference benchmarks. It also generalizes to new scenarios such as embedded agents and style transfer, can clone a voice from as little as 3 seconds of reference audio, and generates natural conversational speech. Post-training work adds GGML/ONNX support and a watermarking mechanism. The model leads the open-source field in on-device TTS and power-optimization evaluations, and in some scenarios is comparable to closed-source models.

This tutorial runs on a single RTX 5090 GPU. The model supports English only.

2. Project Examples

3. Operation Steps

1. After starting the container, click the API address to open the web interface.

2. Once the page has loaded, you can start using the model.

If "Bad Gateway" is displayed, the code is still executing in the background and the interface is not ready yet. Wait about 2-3 minutes, then refresh the page.

In the Safari browser, the generated audio may not play directly in the page. If that happens, download the file and play it locally.
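
The "Bad Gateway" wait described above can also be automated instead of refreshing by hand. The following is a minimal sketch, not part of the tutorial itself: it polls the container's API address until the response is no longer HTTP 502. The function names `http_status` and `wait_until_ready`, the 15-second polling interval, and the 300-second limit are all assumptions chosen for illustration (the limit simply leaves some margin over the 2-3 minutes mentioned above).

```python
import time
import urllib.error
import urllib.request


def http_status(url, timeout=10.0):
    """Return the HTTP status code of a GET request to `url`."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        # urllib raises on 4xx/5xx responses; the status code is on the exception.
        return err.code


def wait_until_ready(url, fetch=http_status, interval=15.0, max_wait=300.0,
                     sleep=time.sleep):
    """Poll `url` until it stops answering 502 (Bad Gateway).

    Returns True as soon as any other status is seen, or False if the
    service is still unavailable after roughly `max_wait` seconds.
    `interval` and `max_wait` are illustrative defaults, not tutorial values.
    """
    waited = 0.0
    while True:
        try:
            status = fetch(url)
        except OSError:
            # Connection refused, timeout, or DNS failure: treat as "not up yet".
            status = None
        if status is not None and status != 502:
            return True
        if waited >= max_wait:
            return False
        sleep(interval)
        waited += interval


# Example call (the address is a placeholder; use the API address shown
# for your own container):
# wait_until_ready("https://your-container-api-address.example")
```

The `fetch` and `sleep` parameters are injectable so the polling logic can be exercised without a network connection or real delays.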

How to use
