WangHelin1997/SSR-Speech
SSR-Speech: Towards Stable, Safe and Robust Zero-shot Speech Editing and Synthesis
This project helps content creators, voice artists, and marketing professionals quickly edit existing speech or generate new speech from text. Given an audio clip and a target transcript, it outputs a new audio clip in which the speech is either edited or entirely synthesized. It is designed for anyone who needs to modify spoken content or create natural-sounding voiceovers without professional studio equipment.
147 stars. No commits in the last 6 months.
Use this if you need to seamlessly change words in an existing audio recording or create new spoken content from a script, ensuring the output sounds natural and stable, even for challenging edits.
Not ideal if you need fine-grained control over individual phonemes or highly nuanced emotional expression, as its focus is on zero-shot generation and editing rather than detailed linguistic manipulation.
Stars: 147
Forks: 17
Language: Python
License: MIT
Category:
Last pushed: Jan 01, 2025
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/voice-ai/WangHelin1997/SSR-Speech"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
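The same request can be made from Python. The sketch below is a minimal example using only the standard library. It assumes the endpoint returns JSON, which the page does not state, and the helper names `quality_url` and `fetch_quality` are hypothetical. It does not cover API keys, because the page does not say how a key is passed.

```python
import json
import urllib.parse
import urllib.request

# Base path taken from the curl example above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(category, owner, repo):
    """Build the endpoint URL for one repository (hypothetical helper)."""
    # Percent-encode each path segment so unusual characters stay valid.
    parts = [urllib.parse.quote(part, safe="") for part in (category, owner, repo)]
    return BASE_URL + "/" + "/".join(parts)


def fetch_quality(category, owner, repo, timeout=10.0):
    """Fetch the record for one repository (hypothetical helper).

    Assumption: the response body is JSON. The page does not document
    the response format, so adjust the parsing if it differs.
    """
    url = quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `fetch_quality("voice-ai", "WangHelin1997", "SSR-Speech")` requests the same URL as the curl command. Each call counts against the daily request limit, so callers may want to cache the result.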
Higher-rated alternatives
index-tts/index-tts
An Industrial-Level Controllable and Efficient Zero-Shot Text-To-Speech System
stepfun-ai/Step-Audio-EditX
A powerful 3B-parameter, LLM-based Reinforcement Learning audio edit model excels at editing...
lucasnewman/f5-tts-mlx
Implementation of F5-TTS in MLX
unilight/seq2seq-vc
A sequence-to-sequence voice conversion toolkit.
FireRedTeam/FireRedTTS
An Open-Sourced LLM-empowered Foundation TTS System