Saganaki22/ComfyUI-Step_Audio_EditX_TTS
ComfyUI nodes for Step Audio EditX - State-of-the-art zero-shot voice cloning and audio editing with emotion, style, speed control, and more.
This tool helps creative professionals and content creators generate natural-sounding speech in any voice from just a short audio sample. You provide a text script and a brief voice recording, and it produces new audio spoken in that cloned voice. It also allows you to modify existing audio to change emotion, style, speed, or add effects, making it ideal for podcasters, animators, game developers, or marketers.
Use this if you need to create consistent voiceovers for long-form content, generate character voices for media, or modify audio recordings to express different emotions or styles.
Not ideal if you need to apply style or emotion edits to audio segments longer than 30 seconds, since these must be split manually first.
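The manual split mentioned above can be sketched in a few lines. This is a minimal illustration and not part of the node pack: the helper names `segment_bounds` and `split_samples` are hypothetical, the audio is assumed to be a mono sequence of samples already loaded in memory, and the cuts are hard cuts at fixed offsets.

```python
from typing import List, Sequence, Tuple

MAX_SEGMENT_SECONDS = 30.0  # the 30-second limit stated in the description


def segment_bounds(num_samples: int, sample_rate: int,
                   max_seconds: float = MAX_SEGMENT_SECONDS) -> List[Tuple[int, int]]:
    """Return (start, end) sample indices for consecutive chunks of at most max_seconds."""
    if sample_rate <= 0:
        raise ValueError("sample_rate must be positive")
    chunk = int(max_seconds * sample_rate)  # samples per chunk
    if chunk <= 0:
        raise ValueError("max_seconds is too small for this sample rate")
    # Step through the signal one chunk at a time; the last chunk may be shorter.
    return [(start, min(start + chunk, num_samples))
            for start in range(0, num_samples, chunk)]


def split_samples(samples: Sequence[float], sample_rate: int,
                  max_seconds: float = MAX_SEGMENT_SECONDS) -> List[Sequence[float]]:
    """Slice a mono sample sequence into chunks no longer than max_seconds."""
    return [samples[a:b]
            for a, b in segment_bounds(len(samples), sample_rate, max_seconds)]
```

For example, a 70-second clip at 16 kHz yields two 30-second chunks and one 10-second chunk. Fixed-offset cuts can land mid-word, so in practice it may be preferable to move each boundary to the nearest pause before editing the pieces.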
Stars: 57
Forks: 8
Language: Python
License: Apache-2.0
Category:
Last pushed: Dec 04, 2025
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/voice-ai/Saganaki22/ComfyUI-Step_Audio_EditX_TTS"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
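The same request can be made from Python using only the standard library. This is a sketch under stated assumptions: the path layout (`category/owner/repo`) is inferred from the single URL shown above, the response is assumed to be JSON, and the function names `build_quality_url` and `fetch_quality` are hypothetical. The response fields are not documented here, so the sketch returns the decoded payload without interpreting it.

```python
import json
import urllib.request
from urllib.parse import quote

# Base path taken from the curl example; the rest of the layout is an assumption.
API_BASE = "https://pt-edge.onrender.com/api/v1/quality"


def build_quality_url(category: str, owner: str, repo: str) -> str:
    """Build the endpoint URL, percent-encoding each path segment."""
    parts = [quote(p, safe="") for p in (category, owner, repo)]
    return f"{API_BASE}/{'/'.join(parts)}"


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0):
    """Send a GET request and decode the body as JSON (assumed format)."""
    url = build_quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


# Usage (performs a network call, so it is left commented out):
# data = fetch_quality("voice-ai", "Saganaki22", "ComfyUI-Step_Audio_EditX_TTS")
# print(data)
```

With the unauthenticated limit of 100 requests/day, it is worth caching the result locally when polling several repositories.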
Higher-rated alternatives
Blaizzy/mlx-audio
A text-to-speech (TTS), speech-to-text (STT) and speech-to-speech (STS) library built on Apple's...
lenML/Speech-AI-Forge
🍦 Speech-AI-Forge is a project developed around TTS generation model, implementing an API Server...
fishaudio/fish-speech
SOTA Open Source TTS
sidharthrajaram/StyleTTS2
🐍 🤖 Pip installable package for StyleTTS 2 human-level text-to-speech and voice cloning
mlalma/kokoro-ios
Kokoro TTS for iOS and macOSX