Chatterbox-TTS-Server and vox-box
About Chatterbox-TTS-Server
devnen/Chatterbox-TTS-Server
Self-host the powerful Chatterbox TTS model. This server offers a user-friendly Web UI, flexible API endpoints (including an OpenAI-compatible one), predefined voices, voice cloning, and audiobook-scale processing of long texts. It runs on CPU, with GPU acceleration on NVIDIA (CUDA) and AMD (ROCm).
This tool converts written text into high-quality spoken audio in a range of voices and languages. You provide text and, optionally, a voice sample to clone, and it outputs realistic speech, even for long documents such as audiobooks. It's designed for content creators, marketers, educators, and anyone who needs to generate expressive voiceovers or other audio content.
About vox-box
gpustack/vox-box
A text-to-speech and speech-to-text server compatible with the OpenAI API, supporting Whisper, FunASR, Bark, and CosyVoice backends.
This tool allows developers to quickly set up a server for converting spoken audio into written text or turning written text into natural-sounding speech. You input audio files or written text, and it outputs the corresponding text transcriptions or audio narration. It's designed for developers building applications that need robust speech recognition or text-to-speech capabilities, such as voice assistants or content creation tools.
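Both servers are described as OpenAI API compatible, so in principle the same client code could talk to either one. The sketch below assumes that each server exposes the OpenAI-style text-to-speech route `/v1/audio/speech` and accepts the `model`, `input`, `voice`, and `response_format` JSON fields; neither project's documentation was checked here, so treat the path, the field set, and every value passed in (base URL, port, model name, voice name) as placeholders to verify against the server you actually run. The helper names `build_speech_request` and `synthesize` are hypothetical and belong to neither project.

```python
import json
import urllib.request


def build_speech_request(base_url, text, voice, model):
    """Build a POST request for an OpenAI-style /v1/audio/speech endpoint.

    Assumption: the server follows the OpenAI speech API shape. The route and
    field names come from that API, not from either project's documentation.
    """
    payload = {
        "model": model,            # placeholder; use a model name your server reports
        "input": text,             # the text to be spoken
        "voice": voice,            # placeholder; use a voice your server provides
        "response_format": "wav",  # assumed to be supported by the server
    }
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/v1/audio/speech",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def synthesize(base_url, text, voice, model, out_path):
    """Send the request and write the returned audio bytes to out_path."""
    request = build_speech_request(base_url, text, voice, model)
    with urllib.request.urlopen(request) as response:
        audio = response.read()
    with open(out_path, "wb") as f:
        f.write(audio)
    return len(audio)
```

With a server running locally, a call such as `synthesize("http://localhost:8000", "Hello there.", "default", "tts-1", "hello.wav")` would write the response to `hello.wav`; the URL, voice, and model in that call are illustrative values only. Keeping the request construction separate from the network call makes it possible to inspect the payload without a running server.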