sony/soundctm
PyTorch implementation of SoundCTM
This project helps audio engineers, sound designers, and content creators generate realistic, full-band sound effects or audio clips from simple text descriptions. You provide a textual prompt (e.g., "a dog barking in a park") and it outputs a corresponding audio file. It's for anyone needing to create custom soundscapes or effects without needing to record or synthesize them manually.
101 stars. No commits in the last 6 months.
Use this if you need to quickly generate specific sound effects or ambient audio based on text descriptions for multimedia projects, games, or virtual environments.
Not ideal if you need fine-grained control over musical composition, voice synthesis, or extremely long, complex audio narratives.
Stars: 101
Forks: 10
Language: Python
License: MIT
Category: Diffusion
Last pushed: Mar 31, 2025
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/diffusion/sony/soundctm"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
Higher-rated alternatives
PrunaAI/pruna
Pruna is a model optimization framework built for developers, enabling you to deliver faster,...
bytedance/LatentSync
Taming Stable Diffusion for Lip Sync!
haoheliu/AudioLDM-training-finetuning
AudioLDM training, finetuning, evaluation and inference.
Text-to-Audio/Make-An-Audio
PyTorch Implementation of Make-An-Audio (ICML'23) with a Text-to-Audio Generative Model
teticio/audio-diffusion
Apply diffusion models using the new Hugging Face diffusers package to synthesize music instead...