Bai-YT/ConsistencyTTA
ConsistencyTTA: Accelerating Diffusion-Based Text-to-Audio Generation with Consistency Distillation
This project helps audio creators and content producers generate high-quality audio clips from simple text descriptions. You input a text prompt like "Food sizzling with some knocking and banging followed by a dog barking" and get a 10-second audio file. This tool is ideal for sound designers, game developers, or marketers who need unique sound effects quickly without extensive audio engineering.
No commits in the last 6 months.
Use this if you need to rapidly create specific sound effects or ambient audio from text descriptions with significantly reduced generation time.
Not ideal if you require precise musical composition or human-like speech synthesis, as this tool focuses on sound effects and environmental audio.
Stars: 39
Forks: —
Language: Python
License: MIT
Category: diffusion
Last pushed: Nov 20, 2024
Commits (30d): 0
Get this data via API:

curl "https://pt-edge.onrender.com/api/v1/quality/diffusion/Bai-YT/ConsistencyTTA"

Open to everyone: 100 requests/day with no key needed. Get a free API key for 1,000 requests/day.
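The curl command maps directly to a programmatic call. A minimal Python sketch of the URL construction is below; the response schema is not documented on this page, so the actual fetch is shown only as a commented outline, and the `quality_url` helper is a hypothetical name introduced here for illustration.

```python
# Sketch of building the quality-API URL from the curl example above.
# Path segments (category/owner/repo) are taken from that example.
BASE = "https://pt-edge.onrender.com/api/v1/quality"

def quality_url(category: str, owner: str, repo: str) -> str:
    """Assemble the per-repository quality endpoint URL."""
    return f"{BASE}/{category}/{owner}/{repo}"

url = quality_url("diffusion", "Bai-YT", "ConsistencyTTA")
print(url)

# To fetch within the 100 requests/day anonymous limit
# (response fields are not documented here):
# import json, urllib.request
# with urllib.request.urlopen(url) as resp:
#     data = json.load(resp)
```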
Higher-rated alternatives
PrunaAI/pruna
Pruna is a model optimization framework built for developers, enabling you to deliver faster,...
bytedance/LatentSync
Taming Stable Diffusion for Lip Sync!
haoheliu/AudioLDM-training-finetuning
AudioLDM training, finetuning, evaluation and inference.
Text-to-Audio/Make-An-Audio
PyTorch Implementation of Make-An-Audio (ICML'23) with a Text-to-Audio Generative Model
teticio/audio-diffusion
Apply diffusion models using the new Hugging Face diffusers package to synthesize music instead...