zhao-kun/VibeVoiceFusion

VibeVoiceFusion is a full-stack, multi-speaker voice generation web system featuring LoRA fine-tuning, batch generation, and VRAM optimization. It is based on Microsoft's VibeVoice (AR + diffusion architecture).

/ 100

Emerging

This web application helps content creators, educators, or marketers generate high-quality, natural-sounding synthetic speech from text. You input written scripts and reference voice samples, and it outputs custom audio files with distinct voices, supporting multiple speakers for dialogues or single narration. It's designed for anyone needing professional voiceovers without hiring voice actors.

453 stars.

Use this if you need to quickly create synthetic speech, clone voices, or generate multi-speaker dialogues for various content types, even with limited GPU resources.

Not ideal if you need to create voices from scratch without any reference audio or if your projects demand extremely short audio segments where latency is critical.

voiceover-production content-creation audiobook-narration e-learning marketing-materials

No License · No Package · No Dependents

Maintenance 10 / 25

Adoption 10 / 25

Maturity 7 / 25

Community 18 / 25

Stars: 453

Forks

Language: Python

License: —

Higher-rated alternatives

BoltzmannEntropy/MimikaStudio

MimikaStudio - A local-first application for macOS (Apple Silicon) + Agentic MCP Support

aahl/qwen-asr2api

🎤 Qwen 3 ASR to OpenAI API, a free STT speech recognition model

gabriele-mastrapasqua/qwen3-tts

Pure C inference engine for Qwen3-TTS text-to-speech. No Python, no PyTorch — just C and BLAS....

shijincai/VibeVoice

Archive of the official Microsoft VibeVoice repository (7B & 1.5B). Backup of the deleted source...

talin190/Qwen3-TTS-Daggr-UI

🎤 Create dynamic voice experiences with Qwen3-TTS-Daggr-UI, a Gradio app for voice design,...
