hayeong0/DDDM-VC
Official PyTorch implementation of "DDDM-VC: Decoupled Denoising Diffusion Models with Disentangled Representation and Prior Mixup for Verified Robust Voice Conversion" (AAAI 2024)
This project helps you change the voice of a spoken audio recording while keeping the original words and meaning intact. You provide an audio file of someone speaking and a target voice, and it generates a new audio file where the original speech is spoken in the target voice. This is useful for anyone working with synthetic speech or audio content creation.
243 stars. No commits in the last 6 months.
Use this if you need to convert speech from one voice to another, for example, to create consistent voiceovers or personalize audio content.
Not ideal if you need to generate speech from text, as this tool requires an existing audio input.
Stars: 243
Forks: 24
Language: Python
License: —
Category:
Last pushed: Jul 31, 2024
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/diffusion/hayeong0/DDDM-VC"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
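The same request can be made from Python. This is a minimal sketch using only the standard library. The function names `quality_url` and `fetch_quality` are hypothetical, the split of the path into a category segment and a repository segment is inferred from the curl example, and the JSON response format is an assumption that this page does not confirm.

```python
import json
import urllib.request

# Base path taken from the curl example above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(category: str, repo: str) -> str:
    """Build the endpoint URL (hypothetical helper; path layout inferred from the curl example)."""
    return f"{BASE_URL}/{category}/{repo}"


def fetch_quality(category: str, repo: str, timeout: float = 10.0):
    """Fetch the record for a repository.

    Assumes the endpoint returns JSON; adjust the parsing if it does not.
    """
    with urllib.request.urlopen(quality_url(category, repo), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


# Example call (makes a network request and counts against the daily limit):
# data = fetch_quality("diffusion", "hayeong0/DDDM-VC")
# print(data)
```

The example call is left commented out so that importing or running the block does not use any of the 100 keyless requests per day.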
Higher-rated alternatives
huggingface/diffusers
🤗 Diffusers: State-of-the-art diffusion models for image, video, and audio generation in PyTorch.
bghira/SimpleTuner
A general fine-tuning kit geared toward image/video/audio diffusion models.
mcmonkeyprojects/SwarmUI
SwarmUI (formerly StableSwarmUI), A Modular Stable Diffusion Web-User-Interface, with an...
nateraw/stable-diffusion-videos
Create 🔥 videos with Stable Diffusion by exploring the latent space and morphing between text prompts
TheDesignFounder/DreamLayer
Benchmark diffusion models faster. Automate evals, seeds, and metrics for reproducible results.