ScottishFold007/Cosyvoice_DPO_NOTES

CosyVoice_DPO_NOTES: Supercharge Your CosyVoice Model with Cutting-Edge DPO Fine-Tuning!

/ 100

Emerging

This project helps speech synthesis practitioners and researchers refine existing CosyVoice models to produce more natural, human-like voices across multiple languages. It takes a pre-trained CosyVoice model and fine-tunes it with Direct Preference Optimization (DPO), aiming to improve speaker similarity, pronunciation, and responsiveness for applications such as virtual assistants and audiobooks. It is intended for anyone working with Text-to-Speech (TTS) who wants higher-quality voice generation from CosyVoice.
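For readers unfamiliar with DPO, the following is a minimal sketch of the standard DPO objective for a single preference pair, written in plain Python. It is not taken from the repository: the function name `dpo_loss`, its parameter names, and the default `beta=0.1` are illustrative assumptions, and how CosyVoice computes log-probabilities over its speech tokens is not covered here.

```python
import math


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Standard DPO loss for one (chosen, rejected) pair.

    Each argument is the summed log-probability of a full output sequence
    under either the model being trained (policy) or the frozen reference
    model. All names here are hypothetical, chosen for illustration only.
    """
    # How much more the policy likes each sample than the reference does.
    chosen_margin = policy_chosen_logp - ref_chosen_logp
    rejected_margin = policy_rejected_logp - ref_rejected_logp

    # Scaled preference logit: positive when the policy favors the chosen sample.
    logits = beta * (chosen_margin - rejected_margin)

    # -log(sigmoid(logits)) == softplus(-logits), computed in a stable form.
    x = -logits
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


if __name__ == "__main__":
    # Policy identical to the reference: the loss equals log(2).
    print(dpo_loss(-10.0, -10.0, -10.0, -10.0))
    # Policy already prefers the chosen sample: the loss is lower.
    print(dpo_loss(-8.0, -12.0, -10.0, -10.0))
```

In a TTS setting, the "chosen" and "rejected" samples would typically be two synthesized outputs for the same text, ranked by a human or an automatic metric; that pairing strategy is an assumption here, not something this listing documents.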

121 stars. No commits in the last 6 months.

Use this if you are a TTS researcher, voice cloning enthusiast, or real-time TTS application developer looking to enhance the quality and naturalness of synthetic speech from CosyVoice models, especially for multilingual or low-latency scenarios.

Not ideal if you are looking for a completely ready-to-use, off-the-shelf voice generation service without any technical involvement in model fine-tuning.

Speech Synthesis, Voice Cloning, Text-to-Speech, Multilingual Audio, Real-time Audio

No License, Stale 6m, No Package, No Dependents

Maintenance 2 / 25

Adoption 10 / 25

Maturity 8 / 25

Community 17 / 25


Stars: 121

Forks:

Language: Python

License: —

Higher-rated alternatives

herimor/voxtream

VoXtream is a Full-Stream Zero-shot TTS model with Extremely Low Latency and Speaking rate Control

EveryVoiceTTS/EveryVoice

The EveryVoice TTS Toolkit - Text To Speech for your language

thorstenMueller/Thorsten-Voice

Thorsten-Voice: A free to use, offline working, high quality german TTS voice should be...

daswer123/xtts-webui

Webui for using XTTS and for finetuning it

kadirnar/VoiceHub

VoiceHub: A Unified Inference Interface for TTS Models
