seahore/PPG-GradVC
A diffusion-based cross-lingual voice conversion model, developed as my bachelor's thesis.
This project converts the voice in a recording so that it sounds like a different speaker while keeping the original speech content. You provide an audio recording and it generates new audio in the target speaker's voice; because the model is cross-lingual, the source and target speakers may speak different languages. The spoken words themselves are not translated. This is useful for anyone working with audio content who needs to change the speaker's identity without re-recording.
No commits in the last 6 months.
Use this if you need to transform the voice in an audio file to sound like someone else, possibly across different languages, without altering the spoken words.
Not ideal if you are looking for a tool to transcribe audio to text, translate spoken content, or synthesize speech from text.
Stars: 44
Forks: 6
Language: Python
License: not specified
Category: not shown
Last pushed: Jul 24, 2023
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/diffusion/seahore/PPG-GradVC"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
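The same request can be made from Python. The following is a minimal sketch using only the standard library. It assumes the endpoint returns JSON and that the path follows a category/owner/repo pattern, which is inferred from the single curl example above; neither is confirmed here. The names `quality_url` and `fetch_quality` are hypothetical helpers, not part of any published client.

```python
import json
import urllib.request
from urllib.parse import quote

# Base path copied from the curl example above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(category: str, owner: str, repo: str) -> str:
    """Build the endpoint URL (hypothetical helper).

    The category/owner/repo layout is an assumption inferred from
    the one example URL shown on this page.
    """
    parts = [quote(p, safe="") for p in (category, owner, repo)]
    return BASE_URL + "/" + "/".join(parts)


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0):
    """GET the endpoint and decode the body.

    Assumes the response body is JSON; the schema is not documented here,
    so the decoded object is returned without further interpretation.
    """
    url = quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


if __name__ == "__main__":
    # Performs a real network request against the public endpoint.
    print(fetch_quality("diffusion", "seahore", "PPG-GradVC"))
```

Unauthenticated calls count against the 100 requests/day limit mentioned above, so it may be worth caching the result locally when polling repeatedly.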
Higher-rated alternatives
PrunaAI/pruna
Pruna is a model optimization framework built for developers, enabling you to deliver faster,...
bytedance/LatentSync
Taming Stable Diffusion for Lip Sync!
haoheliu/AudioLDM-training-finetuning
AudioLDM training, finetuning, evaluation and inference.
Text-to-Audio/Make-An-Audio
PyTorch Implementation of Make-An-Audio (ICML'23) with a Text-to-Audio Generative Model
teticio/audio-diffusion
Apply diffusion models using the new Hugging Face diffusers package to synthesize music instead...