Speech Synthesis Diffusion Models
Diffusion models for speech and audio generation including TTS, voice conversion, singing synthesis, and vocoding. Does NOT include general image diffusion, music generation without speech focus, or non-diffusion audio processing.
There are 55 speech synthesis diffusion models tracked. 2 score above 50 (established tier). The highest-rated is PrunaAI/pruna at 63/100 with 1,142 stars. 1 of the top 10 is actively maintained.
Get all 55 projects as JSON
curl "https://pt-edge.onrender.com/api/v1/datasets/quality?domain=diffusion&subcategory=speech-synthesis-diffusion&limit=20"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
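The same request can be made from Python. The following is a minimal sketch using only the standard library; the helper names `build_url` and `fetch_projects` are hypothetical, the query parameters are copied from the curl example above, and the shape of the returned JSON is not documented here, so the code only decodes it without assuming any fields. Whether the API accepts a `limit` larger than 20 is also an assumption.

```python
import json
import urllib.parse
import urllib.request

# Endpoint taken from the curl example above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/datasets/quality"


def build_url(limit=20):
    """Build the query URL with the same parameters as the curl example.

    `limit` defaults to 20 as in the example; larger values are an
    untested assumption about the API.
    """
    params = {
        "domain": "diffusion",
        "subcategory": "speech-synthesis-diffusion",
        "limit": limit,
    }
    return f"{BASE_URL}?{urllib.parse.urlencode(params)}"


def fetch_projects(limit=20, timeout=30):
    """Send the GET request and decode the JSON body.

    The response schema is not specified in this listing, so the decoded
    object is returned as-is for the caller to inspect.
    """
    with urllib.request.urlopen(build_url(limit), timeout=timeout) as resp:
        return json.load(resp)
```

Calling `fetch_projects()` performs the network request and counts against the daily quota; `build_url()` can be used on its own to check the URL before sending anything.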
| # | Model | Score | Tier |
|---|---|---|---|
| 1 | PrunaAI/pruna: Pruna is a model optimization framework built for developers, enabling you... | 63 | Established |
| 2 | bytedance/LatentSync: Taming Stable Diffusion for Lip Sync! | | Established |
| 3 | haoheliu/AudioLDM-training-finetuning: AudioLDM training, finetuning, evaluation and inference. | | Emerging |
| 4 | Text-to-Audio/Make-An-Audio: PyTorch Implementation of Make-An-Audio (ICML'23) with a Text-to-Audio... | | Emerging |
| 5 | teticio/audio-diffusion: Apply diffusion models using the new Hugging Face diffusers package to... | | Emerging |
| 6 | ivanvovk/WaveGrad: Implementation of WaveGrad high-fidelity vocoder from Google Brain in PyTorch. | | Emerging |
| 7 | Rongjiehuang/ProDiff: PyTorch Implementation of ProDiff (ACM-MM'22) with an Extremely-Fast... | | Emerging |
| 8 | keonlee9420/DiffSinger: PyTorch implementation of DiffSinger: Singing Voice Synthesis via Shallow... | | Emerging |
| 9 | keonlee9420/DiffGAN-TTS: PyTorch Implementation of DiffGAN-TTS: High-Fidelity and Efficient... | | Emerging |
| 10 | sayakpaul/diffusers-torchao: End-to-end recipes for optimizing diffusion models with torchao and... | | Emerging |
| 11 | Aratako/Irodori-TTS: A Flow Matching-based Text-to-Speech Model with Emoji-driven Style Control | | Emerging |
| 12 | yochaiye/LipVoicer: Official Code implementation for the ICLR paper "LipVoicer: Generating... | | Emerging |
| 13 | segmind/distill-sd: Segmind Distilled diffusion | | Emerging |
| 14 | zhenye234/CoMoSpeech: ACM MM 2023 CoMoSpeech: One-Step Speech and Singing Voice Synthesis via... | | Emerging |
| 15 | huggingface/diffusion-fast: Faster generation with text-to-image diffusion models. | | Emerging |
| 16 | sony/soundctm: Pytorch implementation of SoundCTM | | Emerging |
| 17 | trinhtuanvubk/Diff-VC: Diffusion Model for Voice Conversion | | Emerging |
| 18 | G-U-N/Phased-Consistency-Model: [NeurIPS 2024] Boosting the performance of consistency models with PCM! | | Emerging |
| 19 | junhsss/consistency-models: A Toolkit for OpenAI's Consistency Models. | | Emerging |
| 20 | xandergos/sCM-mnist: Unofficial implementation of "Simplifying, Stabilizing & Scaling... | | Emerging |
| 21 | mazumdarsoumya/TempoSyncDiff: Few-step diffusion for audio-driven talking head generation making diffusion... | | Emerging |
| 22 | TencentARC/AudioStory: AudioStory: Generating Long-Form Narrative Audio with Large Language Models | | Emerging |
| 23 | FireRedTeam/Target-Driven-Distillation: Consistency Distillation with Target Timestep Selection and Decoupled Guidance | | Emerging |
| 24 | koichi-saito-sony/soundctm_dit_iclr: Pytorch implementation of SoundCTM-DiT | | Emerging |
| 25 | JiauZhang/binary-latent-diffusion: Implementation of Binary Latent Diffusion | | Emerging |
| 26 | hayeong0/Diff-HierVC: Official Pytorch Implementation of "Diff-HierVC: Diffusion-based... | | Emerging |
| 27 | 0x7o/DeepMozart: Audio generation using diffusion models | | Emerging |
| 28 | mbreuss/consistency_models_toy_task: Unofficial minimal implementation of consistency models (CM) proposed by... | | Emerging |
| 29 | MirageML/MirageStock: Open-Source Implementations of Multi-Modal Diffusion Models Optimized for... | | Emerging |
| 30 | ashutosh1919/consistency-models: Ready to run PyTorch implementation of Consistency Models: One-Step Image... | | Emerging |
| 31 | OpenGVLab/LORIS: [ICML2023] Long-Term Rhythmic Video Soundtracker | | Experimental |
| 32 | seahore/PPG-GradVC: A diffusion-based cross-lingual voice conversion model, as my bachelor's thesis | | Experimental |
| 33 | drakyanerlanggarizkiwardhana/Diffusers: 🤗 Diffusers: State-of-the-art diffusion models for image and audio... | | Experimental |
| 34 | jabir-zheng/TCD: Official Repository of the paper "Trajectory Consistency Distillation" | | Experimental |
| 35 | smsharma/consistency-models: Implementation of Consistency Models (Song et al 2023) for few-step image... | | Experimental |
| 36 | Consistency-TTA/consistency-tta.github.io: Accelerating Diffusion-Based Text-to-Audio Generation with Consistency Distillation | | Experimental |
| 37 | AxiumCrisis61/StableSVC: StableSVC: Latent Diffusion Model for Singing Voice Conversion (originally... | | Experimental |
| 38 | testzer0/GradTTS-unoffical: My unofficial implementation of Grad-TTS (ICML 2021) | | Experimental |
| 39 | Bai-YT/ConsistencyTTA: ConsistencyTTA: Accelerating Diffusion-Based Text-to-Audio Generation with... | | Experimental |
| 40 | romanycc/Audio-Diffusion: Audio Diffusion | | Experimental |
| 41 | LiangXu123/Robust-One-step-Speech-Enhancement-via-Consistency-Distillation-ROSE-CD-: Robust One-step Speech Enhancement via Consistency Distillation... | | Experimental |
| 42 | mbreuss/consistency_trajectory_models_toy_task: Minimal unofficial implementation of Consistency Trajectory models on a 1D toy task. | | Experimental |
| 43 | juanalonso/diffusion-audio: List of models and applications based on diffusion | | Experimental |
| 44 | slegroux/nimrod: minimal deep learning framework | | Experimental |
| 45 | quickgrid/distill-sd: Experiment with latent diffusion models. | | Experimental |
| 46 | minyoungpark1/Speech-Enhancement: Unofficial implementation of SCP-GAN | | Experimental |
| 47 | jwliao1209/DiffMusic: 🎼 DiffMusic: A Training-Free Diffusion Framework for Music Inverse Problem | | Experimental |
| 48 | instill-ai/model-diffusion-dvc: ⚗️ Diffusion model repository based on HuggingFace Diffusion 2.1 managed by DVC | | Experimental |
| 49 | michalsvento/UnNAFx: Supplementary code for paper submitted to DAFx 2025 | | Experimental |
| 50 | Jason-cs18/HetServe-Foundation: An Overview of Efficiently Serving Foundation Models across Edge Devices | | Experimental |
| 51 | Shiying-Zhang/-diffusion-model-genealogy: 🧬 Diffusion Model Genealogy - Mapping the family relationships between... | | Experimental |
| 52 | XinleiNIU/SoundMorpher: This is implementation code for "SoundMorpher: Perceptually-Uniform Sound... | | Experimental |
| 53 | 7-4-7/BirdGen: Implementation of classifier guided diffusion model on a procedurally... | | Experimental |
| 54 | manthan89-py/OpenSource-Diffusion-Models-Experiment: This repo analyzes Open Source Diffusion models for generating... | | Experimental |
| 55 | VladimirZelenokor1/ML-Project---Voice-Conversion-with-Diffusion-Models: Project on real time voice conversion with diffusion models | | Experimental |