hcy71o/SC-CNN
SC-CNN: Effective Speaker Conditioning Method for Zero-Shot Multi-Speaker Text-to-Speech Systems
SC-CNN generates natural-sounding speech from text in the voice of a speaker the model has never encountered before (zero-shot). Given a short audio sample of the target speaker, it synthesizes your written text in that voice. This suits content creators, audiobook producers, or anyone who needs custom voice narration without hiring a voice actor.
No commits in the last 6 months.
Use this if you need to generate spoken audio in a variety of voices from text, especially if you need to mimic a specific voice from a small audio sample.
Not ideal if you primarily need to transcribe audio to text or analyze existing speech for insights, as its focus is on speech generation.
Stars: 39
Forks: 7
Language: Python
License: MIT
Category:
Last pushed: Nov 01, 2023
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/voice-ai/hcy71o/SC-CNN"
Open to everyone: 100 requests/day with no key needed. Get a free key for 1,000 requests/day.
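The same request can be made from Python using only the standard library. The sketch below is a minimal illustration, not an official client: the helper names `build_quality_url` and `fetch_quality` are hypothetical, the reading of the path as category/owner/repo is inferred from the single curl example above, and the JSON response format is an assumption, since the response schema is not shown here.

```python
import json
import urllib.parse
import urllib.request

# Base path copied from the curl command above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def build_quality_url(category: str, owner: str, repo: str) -> str:
    """Build the endpoint URL for one repository.

    Assumption: the path layout is <category>/<owner>/<repo>, inferred
    from the one curl example; it is not documented here.
    """
    parts = [urllib.parse.quote(p, safe="") for p in (category, owner, repo)]
    return BASE_URL + "/" + "/".join(parts)


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0):
    """Send a GET request and decode the body.

    Assumption: the API returns a JSON document. No API key is sent,
    which matches the keyless tier (100 requests/day).
    """
    url = build_quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


if __name__ == "__main__":
    # Only prints the URL; call fetch_quality(...) to hit the network.
    print(build_quality_url("voice-ai", "hcy71o", "SC-CNN"))
```

Calling `fetch_quality("voice-ai", "hcy71o", "SC-CNN")` would issue the same request as the curl command. Because the keyless tier is capped at 100 requests/day, it is worth caching the result rather than re-fetching on every run.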
Higher-rated alternatives
index-tts/index-tts
An Industrial-Level Controllable and Efficient Zero-Shot Text-To-Speech System
stepfun-ai/Step-Audio-EditX
A powerful 3B-parameter, LLM-based Reinforcement Learning audio edit model excels at editing...
lucasnewman/f5-tts-mlx
Implementation of F5-TTS in MLX
unilight/seq2seq-vc
A sequence-to-sequence voice conversion toolkit.
FireRedTeam/FireRedTTS
An Open-Sourced LLM-empowered Foundation TTS System