gpustack/vox-box
A text-to-speech and speech-to-text server compatible with the OpenAI API, supporting Whisper, FunASR, Bark, and CosyVoice backends.
This tool allows developers to quickly set up a server for converting spoken audio into written text or turning written text into natural-sounding speech. You input audio files or written text, and it outputs the corresponding text transcriptions or audio narration. It's designed for developers building applications that need robust speech recognition or text-to-speech capabilities, such as voice assistants or content creation tools.
Use this if you are a developer integrating text-to-speech or speech-to-text functionality into an application and need a local server solution.
Not ideal if you are an end-user looking for a ready-to-use application with a graphical interface for transcribing audio or generating speech.
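Because the server is described as compatible with the OpenAI API, a client can in principle talk to it with an ordinary OpenAI-style HTTP request. The following is a minimal sketch, not the project's documented usage: the host and port (`localhost:8080`), the `/v1/audio/speech` route (taken from the OpenAI API convention), and the model and voice names are all assumptions to be replaced with whatever your deployment actually uses.

```python
import json
import urllib.request

# Assumption: the server runs locally on port 8080. Adjust to your deployment.
BASE_URL = "http://localhost:8080"


def build_speech_request(text, model, voice, base_url=BASE_URL):
    """Build (but do not send) an OpenAI-style text-to-speech request.

    The route and the JSON fields follow the OpenAI audio API convention;
    whether a given backend honours every field is an assumption.
    """
    payload = {"model": model, "input": text, "voice": voice}
    return urllib.request.Request(
        url=f"{base_url}/v1/audio/speech",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def synthesize(text, model, voice, out_path, base_url=BASE_URL):
    """Send the request and write the returned audio bytes to out_path."""
    req = build_speech_request(text, model, voice, base_url)
    with urllib.request.urlopen(req) as resp, open(out_path, "wb") as f:
        f.write(resp.read())
```

With a server running, a call such as `synthesize("Hello", "my-tts-model", "my-voice", "hello.mp3")` would save the generated audio; both names in that call are hypothetical placeholders.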
Stars: 200
Forks: 32
Language: Python
License: Apache-2.0
Category:
Last pushed: Dec 23, 2025
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/voice-ai/gpustack/vox-box"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
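The same request can be issued from Python. This is a rough sketch under stated assumptions: reading the three path segments of the URL above as category, owner, and repository is a guess based on that one example, and the response is assumed to be JSON, which the listing does not confirm. The function names are illustrative.

```python
import json
import urllib.parse
import urllib.request

API_ROOT = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(category, owner, repo):
    """Join the three path segments onto the API root, percent-encoding each.

    Assumption: the path layout is <category>/<owner>/<repo>, inferred
    from the single example URL in the listing.
    """
    parts = [urllib.parse.quote(p, safe="") for p in (category, owner, repo)]
    return f"{API_ROOT}/" + "/".join(parts)


def fetch_quality(category, owner, repo, timeout=10):
    """Fetch the record and decode it, assuming the body is JSON."""
    url = quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Calling `fetch_quality("voice-ai", "gpustack", "vox-box")` counts against the daily request limit, so cache the result if you poll regularly.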
Related tools
devnen/Chatterbox-TTS-Server
Self-host the powerful Chatterbox TTS model. This server offers a user-friendly Web UI, flexible...
jamiepine/voicebox
The open-source voice synthesis studio
daswer123/xtts-api-server
A simple FastAPI Server to run XTTSv2
Aivis-Project/AivisSpeech-Engine
AivisSpeech Engine: AI Voice Imitation System - Text to Speech Engine
jianchang512/ChatTTS-ui
A simple local web interface that uses ChatTTS to synthesize text into speech and also supports exposing an API to external callers.