ScottishFold007/Cosyvoice_DPO_NOTES
CosyVoice_DPO_NOTES: Supercharge Your CosyVoice Model with DPO Fine-Tuning
This project helps speech synthesis practitioners and researchers refine existing CosyVoice models so that they produce more natural, human-like voices across multiple languages. It takes a pre-trained CosyVoice model and fine-tunes it with Direct Preference Optimization (DPO), which trains the model to prefer the better of two candidate outputs, with the aim of improving speaker similarity, pronunciation, and responsiveness in applications such as virtual assistants and audiobooks. It is intended for anyone working with Text-to-Speech (TTS) who wants higher-quality voice generation from CosyVoice.
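For readers unfamiliar with DPO, the following is a minimal sketch of the standard DPO loss for a single preference pair. It is not taken from this repository: the function name `dpo_loss`, its parameter names, and the default `beta=0.1` are illustrative assumptions. The inputs are assumed to be summed log-probabilities of a preferred ("chosen") and a dispreferred ("rejected") output under the model being trained and under a frozen reference model.

```python
import math


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Standard DPO loss for one preference pair (illustrative sketch).

    All arguments except beta are sequence log-probabilities.
    The name, signature, and beta default are assumptions for illustration.
    """
    # Implicit rewards: how much more likely each output is under the
    # policy than under the frozen reference model.
    chosen_reward = policy_chosen_logp - ref_chosen_logp
    rejected_reward = policy_rejected_logp - ref_rejected_logp

    # Scaled margin between the chosen and rejected implicit rewards.
    margin = beta * (chosen_reward - rejected_reward)

    # Loss is -log(sigmoid(margin)), written in a numerically stable form.
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))


# When the policy equals the reference, the loss is log(2).
print(dpo_loss(-1.0, -1.0, -1.0, -1.0))
```

The loss falls below log(2) as the policy assigns relatively more probability to the chosen output than the reference does, which is the signal that drives the fine-tuning.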
121 stars. No commits in the last 6 months.
Use this if you are a TTS researcher, voice cloning enthusiast, or real-time TTS application developer looking to enhance the quality and naturalness of synthetic speech from CosyVoice models, especially for multilingual or low-latency scenarios.
Not ideal if you are looking for a completely ready-to-use, off-the-shelf voice generation service without any technical involvement in model fine-tuning.
Stars: 121
Forks: 19
Language: Python
License: not specified
Category: not listed
Last pushed: Aug 08, 2025
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/voice-ai/ScottishFold007/Cosyvoice_DPO_NOTES"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
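The same request can be sketched in Python using only the standard library. This is a hedged example: the helper names `build_quality_url` and `fetch_quality` are hypothetical, the reading of the URL path as category/owner/repository is inferred from the curl command above, and the assumption that the endpoint returns JSON is not confirmed by the page.

```python
import json
import urllib.request
from urllib.parse import quote

# Base path taken from the curl command shown on this page.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def build_quality_url(category, owner, repo):
    """Build the request URL (hypothetical helper).

    Assumes the path segments are category/owner/repo, as the
    curl example suggests.
    """
    parts = [quote(part, safe="") for part in (category, owner, repo)]
    return BASE_URL + "/" + "/".join(parts)


def fetch_quality(category, owner, repo, timeout=10.0):
    """Fetch the data for one repository (hypothetical helper).

    Assumes the response body is JSON; the schema is not documented here,
    so the parsed result is returned without further interpretation.
    """
    url = build_quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


print(build_quality_url("voice-ai", "ScottishFold007", "Cosyvoice_DPO_NOTES"))
```

Calling `fetch_quality("voice-ai", "ScottishFold007", "Cosyvoice_DPO_NOTES")` performs the network request; without a key it counts against the 100 requests/day limit mentioned above.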
Higher-rated alternatives
herimor/voxtream
VoXtream is a Full-Stream Zero-shot TTS model with Extremely Low Latency and Speaking rate Control
EveryVoiceTTS/EveryVoice
The EveryVoice TTS Toolkit - Text To Speech for your language
thorstenMueller/Thorsten-Voice
Thorsten-Voice: A free-to-use, offline-working, high-quality German TTS voice should be...
daswer123/xtts-webui
Webui for using XTTS and for finetuning it
kadirnar/VoiceHub
VoiceHub: A Unified Inference Interface for TTS Models