FunAudioLLM/CosyVoice

Multilingual large voice generation model with full-stack support for inference, training, and deployment.

/ 100

Established

This project generates high-quality, natural-sounding voiceovers from written text across many languages and dialects. You provide text and it returns realistic spoken audio, with control over emotion, speed, and volume. It suits content creators, educators, and businesses that need automated voice production.

19,991 stars. Actively maintained with 6 commits in the last 30 days.

Use this if you need to transform written content into spoken audio with high naturalness and speaker consistency across multiple languages and Chinese dialects, including zero-shot voice cloning.

Not ideal if you require only basic text-to-speech for a single language without advanced customization or high-fidelity output.

voice-generation content-creation e-learning localization audio-production

No Package

No Dependents

Maintenance 17 / 25

Adoption 10 / 25

Maturity 16 / 25

Community 21 / 25
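
Each of the four category scores above is out of 25. As a minimal sketch, assuming the overall score out of 100 is simply their sum (the listing does not confirm its method, and the names `category_scores` and `overall_score` are hypothetical), the tally could be computed like this:

```python
# Category scores as listed above; each is out of 25.
category_scores = {
    "Maintenance": 17,
    "Adoption": 10,
    "Maturity": 16,
    "Community": 21,
}


def overall_score(scores, per_category_max=25):
    """Return (total, maximum) for a set of category scores.

    Assumption: the overall score is a plain, unweighted sum of the
    categories. The listing itself does not state how it is calculated.
    """
    for name, value in scores.items():
        if not 0 <= value <= per_category_max:
            raise ValueError(f"{name} score {value} is outside 0..{per_category_max}")
    return sum(scores.values()), per_category_max * len(scores)


total, maximum = overall_score(category_scores)
print(f"{total} / {maximum}")  # prints "64 / 100" under the plain-sum assumption
```

If the site weights categories differently, only the body of `overall_score` would need to change.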


Stars: 19,991

Forks: 2,270

Language: Python

License: Apache-2.0

Featured in

Things AI Won't Tell You About Building a Voice App

Related tools

travisvn/chatterbox-tts-api

Local, OpenAI-compatible text-to-speech (TTS) API using Chatterbox, enabling users to generate...

fishaudio/Bert-VITS2

vits2 backbone with multilingual-bert

sfortis/openai_tts

Custom TTS component for Home Assistant. Utilizes the OpenAI speech engine or any compatible...

OpenMOSS/MOSS-TTSD

MOSS-TTSD is a spoken dialogue generation model designed for expressive multi-speaker synthesis....

OpenMOSS/MOSS-TTS

MOSS‑TTS Family is an open‑source speech and sound generation model family from MOSI.AI and the...
