sony/soundctm
PyTorch implementation of SoundCTM
This project helps audio engineers, sound designers, and content creators generate realistic, full-band sound effects or audio clips from simple text descriptions. You provide a textual prompt (e.g., "a dog barking in a park") and it outputs a corresponding audio file. It's for anyone needing to create custom soundscapes or effects without needing to record or synthesize them manually.
101 stars. No commits in the last 6 months.
Use this if you need to quickly generate specific sound effects or ambient audio based on text descriptions for multimedia projects, games, or virtual environments.
Not ideal if you need fine-grained control over musical composition, voice synthesis, or extremely long, complex audio narratives.
Stars: 101
Forks: 10
Language: Python
License: MIT
Category: Diffusion
Last pushed: Mar 31, 2025
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/diffusion/sony/soundctm"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
Higher-rated alternatives
PrunaAI/pruna
Pruna is a model optimization framework built for developers, enabling you to deliver faster,...
bytedance/LatentSync
Taming Stable Diffusion for Lip Sync!
haoheliu/AudioLDM-training-finetuning
AudioLDM training, finetuning, evaluation and inference.
Text-to-Audio/Make-An-Audio
PyTorch Implementation of Make-An-Audio (ICML'23) with a Text-to-Audio Generative Model
teticio/audio-diffusion
Apply diffusion models using the new Hugging Face diffusers package to synthesize music instead...