TTS-Audio-Suite and ComfyUI-VoxCPM

These tools are complementary: TTS-Audio-Suite provides multiple text-to-speech engines and voice conversion options, while ComfyUI-VoxCPM specializes in zero-shot voice cloning. Used together in a single ComfyUI workflow, they let users combine multi-engine TTS synthesis with zero-shot voice cloning.

TTS-Audio-Suite (Established)

Maintenance 25/25 | Adoption 10/25 | Maturity 15/25 | Community 18/25

Stars: 774 | Forks: 71 | Downloads: n/a | Commits (30d): 79 | Language: Python | License: n/a

ComfyUI-VoxCPM (Emerging)

Maintenance 6/25 | Adoption 10/25 | Maturity 15/25 | Community 16/25

Stars: 390 | Forks: 42 | Downloads: n/a | Commits (30d): 0 | Language: Python | License: Apache-2.0

No Package No Dependents

About TTS-Audio-Suite

diodiogod/TTS-Audio-Suite

A ComfyUI custom node integration for multi-engine, multi-language Text-to-Speech and Voice Conversion. Supports RVC, Echo-TTS, Qwen3-TTS, Cozy Voice 3, Step Audio EditX, IndexTTS-2, Chatterbox (classic and multilingual 23-lang), F5-TTS, Higgs Audio 2, and VibeVoice, with unlimited text length, SRT timing, Character support, and many audio tools.

This suite helps video producers, content creators, and educators quickly turn written scripts into natural-sounding speech across many languages and voices. You input your text, choose from various AI voices, and the system generates audio, complete with precise timing for subtitles. It's designed for anyone needing professional-grade voiceovers or narrated content without hiring voice actors.

video-production content-creation localization e-learning audio-narration

About ComfyUI-VoxCPM

wildminder/ComfyUI-VoxCPM

ComfyUI node for highly expressive speech and realistic zero-shot voice cloning

This tool helps content creators, podcasters, or marketing professionals generate highly realistic speech from text. You provide text and, optionally, a short audio sample of a voice, and it outputs an audio file with the text spoken in that voice, complete with natural expression and tone. It's designed for anyone needing expressive, true-to-life voiceovers or cloned voices for various media.

voice-cloning audio-content-creation speech-synthesis media-production narration

Related comparisons

TTS-Audio-Suite and VibeVoice-ComfyUI

TTS-Audio-Suite and ComfyUI-VibeVoice

TTS-Audio-Suite and ComfyUI-EdgeTTS

TTS-Audio-Suite and ComfyUI-XTTS

TTS-Audio-Suite and ComfyUI-Maya1_TTS

TTS-Audio-Suite and ComfyUI-SparkTTS

Scores updated daily from GitHub, PyPI, and npm data.