abus-aikorea/voice-pro
Gradio WebUI for creators and developers, featuring key TTS (Edge-TTS, kokoro) and zero-shot Voice Cloning (E2 & F5-TTS, CosyVoice), with Whisper audio processing, YouTube download, Demucs vocal isolation, and multilingual translation.
Voice-Pro helps content creators, podcasters, and multilingual professionals process audio and video. It takes spoken audio from uploaded files or YouTube videos, transcribes it to text, translates the text into over 100 languages, and converts text back into speech using a range of voices, including cloned ones. It suits anyone who needs to create multilingual content or process audio efficiently.
6,366 stars.
Use this if you need to transcribe, translate, or generate speech from text for videos, podcasts, or other audio content, especially across multiple languages.
Not ideal if you are looking for advanced music separation or AI singing voice conversion, as these features have been removed.
Stars: 6,366
Forks: 687
Language: Python
License: GPL-3.0
Category:
Last pushed: Dec 05, 2025
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/voice-ai/abus-aikorea/voice-pro"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
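The curl call above can also be made from Python with only the standard library. The following is a minimal sketch, not an official client: the helper names `build_quality_url` and `fetch_quality` are hypothetical, the `category/owner/repo` path layout is inferred from the single URL shown above, and the assumption that the endpoint returns JSON is not confirmed by this page.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen

# Base path taken from the curl example; everything after it is assumed
# to follow a category/owner/repo layout.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def build_quality_url(category, owner, repo):
    """Build the endpoint URL, percent-encoding each path segment."""
    parts = (category, owner, repo)
    return BASE_URL + "/" + "/".join(quote(p, safe="") for p in parts)


def fetch_quality(category, owner, repo, timeout=10.0):
    """Fetch the record for one repository.

    Assumes the response body is UTF-8 encoded JSON; adjust the parsing
    if the service returns something else.
    """
    url = build_quality_url(category, owner, repo)
    with urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `fetch_quality("voice-ai", "abus-aikorea", "voice-pro")` would request the same URL as the curl command. Unauthenticated use is limited to 100 requests/day, so it may be worth caching the result rather than fetching repeatedly.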
Related tools
snakers4/silero-models
Silero Models: pre-trained text-to-speech models made embarrassingly simple
JSchmie/ScrAIbe-WebUI
WebUI for ScrAIbe
isaiahbjork/orpheus-tts-local
Run Orpheus 3B Locally With LM Studio
snakers4/silero-stress
Silero Stress — pre-trained enterprise-grade automated stress and homograph disambiguation for...
MerlinCN/kinoko7danmaku
Uses TTS to read aloud danmaku (bullet comments), gifts, captain subscriptions, and other events in Bilibili live streams