CMsmartvoice/Unet-TTS
One-shot TTS with Improved Unseen Speaker and Style Transfer
This tool helps content creators and developers generate natural-sounding speech in a specific voice and style, including voices the model did not see during training. You provide a short audio sample of the target voice and the text you want spoken, and it outputs synthesized speech that mimics the speaker's characteristics and emotional tone. It suits anyone who needs personalized audio content, such as voiceovers or virtual assistants, with minimal effort.
No commits in the last 6 months.
Use this if you need to quickly clone a voice and speaking style from a very short audio clip to synthesize new, arbitrary text.
Not ideal if you require extremely high-fidelity, indistinguishable voice cloning for sensitive applications where even minor artificiality is unacceptable.
Stars: 37
Forks: 7
Language: —
License: —
Category:
Last pushed: Mar 02, 2022
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/voice-ai/CMsmartvoice/Unet-TTS"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
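For scripted use, the same request can be made from Python. The block below is a minimal sketch using only the standard library: it rebuilds the URL from the curl command above and issues an unauthenticated GET. The helper names `build_quality_url` and `fetch_quality` are hypothetical, and the assumption that the response body is JSON is not confirmed by this page. How a free key is passed is also not stated here, so the sketch covers only the keyless tier.

```python
import json
import urllib.parse
import urllib.request

# Base path taken from the curl command shown on this page.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def build_quality_url(category: str, repo: str) -> str:
    """Build the endpoint URL for a repo given as 'owner/name'.

    Hypothetical helper: the path layout (category/owner/name) is
    inferred from the single curl example, not from API documentation.
    """
    owner, name = repo.split("/", 1)
    # Percent-encode each path segment so unusual characters stay valid.
    parts = [urllib.parse.quote(p, safe="") for p in (category, owner, name)]
    path = "/".join(parts)
    return f"{BASE_URL}/{path}"


def fetch_quality(category: str, repo: str, timeout: float = 10.0):
    """GET the endpoint without a key and parse the body.

    Assumption: the server answers with a JSON document.
    """
    url = build_quality_url(category, repo)
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Calling `fetch_quality("voice-ai", "CMsmartvoice/Unet-TTS")` would send the same request as the curl command. Keeping URL construction separate from the network call lets the URL be checked without spending any of the 100 keyless requests per day.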
Higher-rated alternatives
index-tts/index-tts
An Industrial-Level Controllable and Efficient Zero-Shot Text-To-Speech System
stepfun-ai/Step-Audio-EditX
A powerful 3B-parameter, LLM-based Reinforcement Learning audio edit model excels at editing...
lucasnewman/f5-tts-mlx
Implementation of F5-TTS in MLX
unilight/seq2seq-vc
A sequence-to-sequence voice conversion toolkit.
FireRedTeam/FireRedTTS
An Open-Sourced LLM-empowered Foundation TTS System