ScottishFold007/Cosyvoice_DPO_NOTES

CosyVoice_DPO_NOTES: Supercharge Your CosyVoice Model with Cutting-Edge DPO Fine-Tuning!

37 / 100 (Emerging)

This project helps speech synthesis practitioners and researchers refine existing CosyVoice models so that they produce more natural, human-like voices across multiple languages. It takes pre-trained CosyVoice models and fine-tunes them with Direct Preference Optimization (DPO), aiming to improve speaker similarity, pronunciation, and responsiveness for applications such as virtual assistants or audiobooks. It is intended for anyone working with Text-to-Speech (TTS) who wants higher-quality voice generation.
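For readers unfamiliar with DPO, the standard objective can be sketched as follows. This is a minimal, generic illustration and not the repository's training code: the function name `dpo_loss`, its parameters, and the default `beta=0.1` are assumptions chosen for the example. The inputs are assumed to be summed sequence log-probabilities of a preferred ("chosen") and a dispreferred ("rejected") output, under both the model being tuned and a frozen reference model.

```python
import math


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Generic DPO loss for a single preference pair (hypothetical helper)."""
    # Implicit reward margin: how much more the policy favours the chosen
    # sample over the rejected one, measured relative to the reference model.
    margin = ((policy_chosen_logp - ref_chosen_logp)
              - (policy_rejected_logp - ref_rejected_logp))
    x = beta * margin
    # -log(sigmoid(x)) equals softplus(-x); this form avoids overflow in exp().
    return max(-x, 0.0) + math.log1p(math.exp(-abs(x)))


if __name__ == "__main__":
    # Policy identical to the reference: margin is 0, loss is log(2).
    print(dpo_loss(-10.0, -10.0, -10.0, -10.0))
    # Policy has shifted probability toward the chosen sample: loss drops.
    print(dpo_loss(-8.0, -12.0, -10.0, -10.0))
```

The loss falls as the tuned model assigns relatively more probability to the preferred sample than the reference does, which is the mechanism by which preference pairs steer the model.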

121 stars. No commits in the last 6 months.

Use this if you are a TTS researcher, voice cloning enthusiast, or real-time TTS application developer looking to enhance the quality and naturalness of synthetic speech from CosyVoice models, especially for multilingual or low-latency scenarios.

Not ideal if you are looking for a completely ready-to-use, off-the-shelf voice generation service without any technical involvement in model fine-tuning.

Speech Synthesis, Voice Cloning, Text-to-Speech, Multilingual Audio, Real-time Audio
No License, Stale 6m, No Package, No Dependents
Maintenance 2 / 25
Adoption 10 / 25
Maturity 8 / 25
Community 17 / 25

Stars: 121
Forks: 19
Language: Python
License: None
Last pushed: Aug 08, 2025
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/voice-ai/ScottishFold007/Cosyvoice_DPO_NOTES"

Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
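The same request can be made from Python with only the standard library. This is a rough sketch: the helper names `quality_url` and `fetch_quality` are hypothetical, and the code assumes the endpoint returns a JSON body, which the listing does not state.

```python
import json
import urllib.parse
import urllib.request

# Base path taken from the curl example above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(category, owner, repo):
    """Build the endpoint URL for one repository (hypothetical helper)."""
    parts = [urllib.parse.quote(p, safe="") for p in (category, owner, repo)]
    return BASE_URL + "/" + "/".join(parts)


def fetch_quality(category, owner, repo, timeout=10.0):
    """Fetch the quality record; assumes the response body is JSON."""
    url = quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


if __name__ == "__main__":
    data = fetch_quality("voice-ai", "ScottishFold007", "Cosyvoice_DPO_NOTES")
    print(json.dumps(data, indent=2))
```

With the keyless limit of 100 requests/day, it is worth caching the result locally rather than calling the endpoint on every run.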