shijincai/VibeVoice
Archive of the official Microsoft VibeVoice repository (7B & 1.5B). Backup of the deleted source code for the open-source TTS models, including the removed 7B version.
This project creates expressive, long-form conversational audio from text, such as podcasts or multi-speaker dialogues. You provide written text, and it generates natural-sounding speech with up to four distinct speakers and up to 90 minutes of audio. It is suited to content creators, podcasters, or anyone who needs to turn written content into high-quality spoken audio.
No commits in the last 6 months.
Use this if you need to generate realistic, multi-speaker conversational audio from text for podcasts, audiobooks, or long-form narrated content.
Not ideal if you require precise control over background music or sound effects, as these can appear spontaneously based on the input text and voice prompts.
Stars: 27
Forks: 27
Language: Python
License: MIT
Category:
Last pushed: Sep 05, 2025
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/voice-ai/shijincai/VibeVoice"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
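The same request can be made from Python. The sketch below uses only the standard library and mirrors the curl command above. The helper names `build_quality_url` and `fetch_quality` are hypothetical, and the code assumes the endpoint returns a JSON body, which this page does not confirm. The path layout (category, owner, repository) is inferred from the single example URL.

```python
import json
import urllib.parse
import urllib.request

# Base path taken from the curl example; everything after it is
# assumed to follow the pattern <category>/<owner>/<repo>.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def build_quality_url(category: str, owner: str, repo: str) -> str:
    """Build the quality-data URL for one repository (hypothetical helper)."""
    parts = [urllib.parse.quote(p, safe="") for p in (category, owner, repo)]
    return BASE_URL + "/" + "/".join(parts)


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0):
    """Fetch and decode the response, assuming it is JSON (not confirmed)."""
    url = build_quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


if __name__ == "__main__":
    # Prints the same URL used in the curl example; no network call is made.
    print(build_quality_url("voice-ai", "shijincai", "VibeVoice"))
```

Calling `fetch_quality("voice-ai", "shijincai", "VibeVoice")` performs the actual request and counts against the daily request limit, so it is worth caching results when polling many repositories.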
Higher-rated alternatives
BoltzmannEntropy/MimikaStudio
MimikaStudio - A local-first application for macOS (Apple Silicon) + Agentic MCP Support
aahl/qwen-asr2api
🎤 Qwen 3 ASR to OpenAI API, a free STT speech recognition model
gabriele-mastrapasqua/qwen3-tts
Pure C inference engine for Qwen3-TTS text-to-speech. No Python, no PyTorch — just C and BLAS....
zhao-kun/VibeVoiceFusion
VibeVoiceFusion is a full-stack, multi-speaker voice generation web system featuring LoRA...
talin190/Qwen3-TTS-Daggr-UI
🎤 Create dynamic voice experiences with Qwen3-TTS-Daggr-UI, a Gradio app for voice design,...