LuxTTS

This is the model for LuxTTS, a lightweight ZipVoice-based text-to-speech model designed for high-quality voice cloning and realistic generation at speeds exceeding 150x realtime.

Main features

Voice cloning: State-of-the-art voice cloning, on par with models 10x larger.
Clarity: Clear 48 kHz speech generation, unlike most TTS models, which are limited to 24 kHz.
Speed: Reaches 150x realtime on a single GPU and runs faster than realtime on CPUs as well.
Efficiency: Fits within 1 GB of VRAM, so it can run on virtually any local GPU.
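
To put the speed and sample-rate figures above in concrete terms, here is a small arithmetic sketch. The function names are hypothetical and not part of LuxTTS. It only converts a realtime multiple into wall-clock synthesis time and a sample rate into the highest representable audio frequency (the Nyquist limit).

```python
def synthesis_seconds(audio_seconds: float, realtime_multiple: float) -> float:
    """Wall-clock time needed to generate `audio_seconds` of speech
    at a given multiple of realtime (hypothetical helper)."""
    if realtime_multiple <= 0:
        raise ValueError("realtime_multiple must be positive")
    return audio_seconds / realtime_multiple


def nyquist_hz(sample_rate_hz: int) -> float:
    """Highest frequency a signal sampled at `sample_rate_hz` can represent."""
    return sample_rate_hz / 2


# One minute of speech at the quoted 150x realtime figure.
print(f"{synthesis_seconds(60.0, 150.0):.2f} s of compute for 60 s of audio")

# Bandwidth available at 48 kHz versus 24 kHz output.
print(f"48 kHz output covers up to {nyquist_hz(48_000):.0f} Hz")
print(f"24 kHz output covers up to {nyquist_hz(24_000):.0f} Hz")
```

At the quoted figure, a minute of audio takes well under a second of compute, and 48 kHz output doubles the representable frequency range compared with 24 kHz.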

Details

Based on ZipVoice, distilled to 4 steps.
Uses a 48 kHz vocoder instead of a 24 kHz vocoder.
Implements a higher-quality sampling technique than standard Euler.
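
The model card does not say which sampler replaces Euler. As an illustration only, the sketch below compares a plain Euler integrator with Heun's method (a second-order alternative chosen here as an assumption, not confirmed as LuxTTS's technique) on a toy ODE with a known solution, using the same 4-step budget. The velocity function stands in for a model call and is hypothetical.

```python
import math
from typing import Callable

# A velocity field v(x, t); in a real sampler this would be a model call.
Velocity = Callable[[float, float], float]


def euler_sample(v: Velocity, x0: float, steps: int) -> float:
    """Integrate dx/dt = v(x, t) from t=0 to t=1 with first-order Euler steps."""
    h = 1.0 / steps
    x, t = x0, 0.0
    for _ in range(steps):
        x += h * v(x, t)
        t += h
    return x


def heun_sample(v: Velocity, x0: float, steps: int) -> float:
    """Same integration with Heun's method: average the slope at the start
    of the step and at the Euler-predicted end (two evaluations per step)."""
    h = 1.0 / steps
    x, t = x0, 0.0
    for _ in range(steps):
        k1 = v(x, t)
        k2 = v(x + h * k1, t + h)
        x += 0.5 * h * (k1 + k2)
        t += h
    return x


def toy_velocity(x: float, t: float) -> float:
    """Toy field with exact solution x(1) = x0 * exp(-1)."""
    return -x


exact = math.exp(-1.0)
euler_err = abs(euler_sample(toy_velocity, 1.0, 4) - exact)
heun_err = abs(heun_sample(toy_velocity, 1.0, 4) - exact)
print(f"Euler error: {euler_err:.4f}, Heun error: {heun_err:.4f}")
```

On this toy problem the second-order method lands much closer to the exact answer in 4 steps, at the cost of two velocity evaluations per step. This trade-off is why higher-order samplers are attractive when the step count is small.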

Usage

Please check out the repo for usage: https://github.com/ysharma3501/LuxTTS.git

License

The model and code are released under the Apache-2.0 license.

If you find the model/code helpful, stars or likes would be appreciated. Thank you.
