abus-aikorea/voice-pro

Gradio WebUI for creators and developers, featuring key TTS (Edge-TTS, kokoro) and zero-shot Voice Cloning (E2 & F5-TTS, CosyVoice), with Whisper audio processing, YouTube download, Demucs vocal isolation, and multilingual translation.

/ 100

Established

Voice-Pro helps content creators, podcasters, and multilingual professionals process audio and video. It takes spoken audio (from uploaded files or YouTube videos), transcribes it to text, translates the text into over 100 languages, and converts text back into speech using a range of voices, including cloned ones. It suits anyone who needs to create multilingual content or process audio efficiently.

Use this if you need to transcribe, translate, or generate speech from text for videos, podcasts, or other audio content, especially across multiple languages.

Not ideal if you are looking for advanced music separation or AI singing voice conversion, as these features have been removed.

multilingual-content-creation video-localization podcast-production audio-transcription voice-dubbing

No Package

No Dependents

Maintenance 6 / 25

Adoption 10 / 25

Maturity 16 / 25

Community 20 / 25

Stars: 6,366

Forks: 687

Language: Python

License: GPL-3.0

Related tools

snakers4/silero-models

Silero Models: pre-trained text-to-speech models made embarrassingly simple

JSchmie/ScrAIbe-WebUI

WebUI for ScrAIbe

isaiahbjork/orpheus-tts-local

Run Orpheus 3B Locally With LM Studio

snakers4/silero-stress

Silero Stress — pre-trained enterprise-grade automated stress and homograph disambiguation for...

MerlinCN/kinoko7danmaku

Uses TTS to read aloud danmaku (live comments), gifts, Captain subscriptions, and other events from Bilibili live streams
