haoheliu/AudioLDM-training-finetuning

AudioLDM training, finetuning, evaluation and inference.

/ 100

Emerging

This project helps machine learning researchers and audio engineers train and customize models that generate audio from text descriptions. You provide text prompts and an audio dataset, and it produces a trained model capable of generating new audio, along with evaluations of its performance. It is aimed at users who want to build custom audio generation capabilities for specialized applications.

297 stars. No commits in the last 6 months.

Use this if you are a machine learning researcher or audio AI developer who needs to train or fine-tune generative models that convert text to audio, using your own datasets or existing models.

Not ideal if you simply want to generate audio from text without diving into model training or customization, as dedicated tools like AudioLDM and AudioLDM2 exist for direct inference.

audio-generation machine-learning-research audio-ai-development sound-synthesis generative-ai

Stale 6m No Package No Dependents

Maintenance 0 / 25

Adoption 10 / 25

Maturity 16 / 25

Community 22 / 25


Stars

297

Forks

Language

Python

License

MIT

Higher-rated alternatives

PrunaAI/pruna

Pruna is a model optimization framework built for developers, enabling you to deliver faster,...

bytedance/LatentSync

Taming Stable Diffusion for Lip Sync!

Text-to-Audio/Make-An-Audio

PyTorch Implementation of Make-An-Audio (ICML'23) with a Text-to-Audio Generative Model

teticio/audio-diffusion

Apply diffusion models using the new Hugging Face diffusers package to synthesize music instead...

ivanvovk/WaveGrad

Implementation of WaveGrad high-fidelity vocoder from Google Brain in PyTorch.
