shijincai/VibeVoice

Archive of the official Microsoft VibeVoice repository (7B & 1.5B). Backup of the deleted source code for the open-source TTS models, including the removed 7B version.

44 / 100

Emerging

This project creates expressive, long-form conversational audio from text, such as podcasts or multi-speaker dialogues. You provide written text, and it generates natural-sounding speech with up to four distinct speakers and up to 90 minutes of audio. It suits content creators, podcasters, and anyone who needs to turn written content into high-quality spoken audio.

No commits in the last 6 months.

Use this if you need to generate realistic, multi-speaker conversational audio from text for podcasts, audiobooks, or long-form narrated content.

Not ideal if you require precise control over background music or sound effects, as these can appear spontaneously based on the input text and voice prompts.

Tags: podcasting, audiobook-creation, content-creation, speech-synthesis, voice-over

Stale (6 months), no package, no dependents

Maintenance 2 / 25

Adoption 7 / 25

Maturity 15 / 25

Community 20 / 25

Language: Python

License: MIT

Higher-rated alternatives

BoltzmannEntropy/MimikaStudio

MimikaStudio - A local-first application for macOS (Apple Silicon) + Agentic MCP Support

aahl/qwen-asr2api

🎤 Qwen 3 ASR to OpenAI API, a free STT speech recognition model

gabriele-mastrapasqua/qwen3-tts

Pure C inference engine for Qwen3-TTS text-to-speech. No Python, no PyTorch — just C and BLAS....

zhao-kun/VibeVoiceFusion

VibeVoiceFusion is a full-stack, multi-speaker voice generation web system featuring LoRA...

talin190/Qwen3-TTS-Daggr-UI

🎤 Create dynamic voice experiences with Qwen3-TTS-Daggr-UI, a Gradio app for voice design,...
