haidog-yaqub/EzAudio
High-quality Text-to-Audio Generation with Efficient Diffusion Transformer
This project helps sound designers, content creators, and educators quickly generate high-quality audio from text descriptions. You provide a text prompt describing the desired sound, and it outputs a corresponding audio file. It also supports editing existing audio, such as replacing a section based on new text, and generating audio that matches a reference.
330 stars.
Use this if you need to create realistic sound effects, background audio, or short audio clips from text without recording or extensive sound design software.
Not ideal if you require precise musical composition, human speech generation, or extremely long-form audio content.
Stars: 330
Forks: 25
Language: Python
License: MIT
Category:
Last pushed: Dec 17, 2025
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/generative-ai/haidog-yaqub/EzAudio"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
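The same request can be made from a script. The following is a minimal Python sketch using only the standard library. The URL pattern is taken from the curl command above; the helper names `build_quality_url` and `fetch_quality` are hypothetical, and the assumption that the endpoint returns a JSON body is not confirmed by this page, so inspect the raw response before relying on any field.

```python
import json
import urllib.request
from urllib.parse import quote

# Base path copied from the curl example on this page.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def build_quality_url(category: str, repo: str) -> str:
    """Build the endpoint URL for a category and an "owner/name" repo (hypothetical helper)."""
    # Escape each part, but keep the "/" between owner and repository name.
    return f"{BASE_URL}/{quote(category, safe='')}/{quote(repo, safe='/')}"


def fetch_quality(category: str, repo: str, timeout: float = 10.0):
    """Fetch the record without an API key (assumption: the body is JSON)."""
    request = urllib.request.Request(
        build_quality_url(category, repo),
        headers={"Accept": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


if __name__ == "__main__":
    # Keyless access is limited to 100 requests/day, so avoid tight loops.
    data = fetch_quality("generative-ai", "haidog-yaqub/EzAudio")
    print(json.dumps(data, indent=2))
```

Since the keyless tier is rate limited, caching the result locally is a reasonable choice when the same repository is queried repeatedly.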
Higher-rated alternatives
open-mmlab/mmagic
OpenMMLab Multimodal Advanced, Generative, and Intelligent Creation Toolbox. Unlock the magic 🪄:...
jdh-algo/JoyVASA
Diffusion-based Portrait and Animal Animation
404-Repo/404-gen-blender-add-on
Blender add-on for 404-GEN 3D generator running on Bittensor
linzhiqiu/t2v_metrics
Evaluating text-to-image/video/3D models with VQAScore
TIGER-AI-Lab/AnyV2V
Code and data for "AnyV2V: A Tuning-Free Framework For Any Video-to-Video Editing Tasks" [TMLR 2024]