happylittlecat2333/Auffusion
Official code and models for the paper "Auffusion: Leveraging the Power of Diffusion and Large Language Models for Text-to-Audio Generation"
Auffusion helps you create realistic audio — including human, animal, natural, and artificial sounds, plus sound effects — from simple text descriptions. You provide a text prompt like "Birds singing sweetly in a blooming garden," and it generates a corresponding audio file. This tool is ideal for content creators, game designers, or anyone needing custom sound design without specialized audio recording equipment or expertise.
193 stars. No commits in the last 6 months.
Use this if you need to generate specific sound effects or ambient audio for creative projects using only text descriptions.
Not ideal if you need to generate music or speech, or if you require precise control over nuanced musical or vocal performances.
Stars: 193
Forks: 13
Language: Jupyter Notebook
License: not listed
Category:
Last pushed: Mar 25, 2024
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/diffusion/happylittlecat2333/Auffusion"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
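For scripted access, the curl request above can be sketched in Python using only the standard library. This is a minimal sketch under assumptions: the helper names `build_quality_url` and `fetch_quality` are hypothetical, the reading of the path as category/owner/repo is inferred from the single example URL, and the response is assumed (not confirmed here) to be JSON.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen

# Base path copied from the curl example above.
API_BASE = "https://pt-edge.onrender.com/api/v1/quality"


def build_quality_url(category: str, owner: str, repo: str) -> str:
    """Build the endpoint URL.

    Assumption: the path is <category>/<owner>/<repo>, as inferred from the
    one example URL. Each segment is percent-encoded defensively.
    """
    parts = (category, owner, repo)
    return "/".join([API_BASE, *(quote(p, safe="") for p in parts)])


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0):
    """GET the endpoint and decode the body.

    Assumption: the body is UTF-8 JSON. json.loads raises ValueError if it
    is not, and urlopen raises HTTPError on a non-2xx status.
    """
    url = build_quality_url(category, owner, repo)
    with urlopen(url, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Calling `fetch_quality("diffusion", "happylittlecat2333", "Auffusion")` would request the same URL as the curl command. Without a key, the 100 requests/day limit stated above applies, so caching results locally is a reasonable choice for repeated lookups.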
Higher-rated alternatives
ljleb/sd-mecha
Executable State Dict Recipes
SJTU-DENG-Lab/Discrete-Diffusion-Forcing
Discrete Diffusion Forcing (D2F): dLLMs Can Do Faster-Than-AR Inference
declare-lab/tango
A family of diffusion models for text-to-audio generation.
Li-Jinsong/DAEDAL
[ICLR 2026] Official repository of "Beyond Fixed: Training-Free Variable-Length Denoising for...
SalesforceAIResearch/CoDA
Salesforce AI Research's open diffusion language model