guyyariv/AudioToken
This repo contains the official PyTorch implementation of AudioToken: Adaptation of Text-Conditioned Diffusion Models for Audio-to-Image Generation
This project lets content creators and researchers generate images directly from audio recordings: you provide an audio clip, and the system produces a corresponding image. It suits artists, marketers, and researchers exploring new ways to visualize soundscapes or to create multimedia content without writing descriptive text.
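Per the paper title, the core idea is to adapt a text-conditioned diffusion model by mapping an audio embedding into the text encoder's token space, so the audio acts like a learned token in the prompt. Below is a minimal, hypothetical sketch of such a projection; the dimensions, pooling scheme, and module names are assumptions for illustration, not the repo's actual API.

```python
import torch
import torch.nn as nn

# Hypothetical AudioToken-style projector: pool a sequence of features
# from a pretrained audio encoder into one vector, then project it into
# the text encoder's token-embedding space. All sizes are assumed.
class AudioProjector(nn.Module):
    def __init__(self, audio_dim: int = 768, token_dim: int = 768):
        super().__init__()
        self.attn_score = nn.Linear(audio_dim, 1)  # temporal attention pooling
        self.proj = nn.Linear(audio_dim, token_dim)

    def forward(self, audio_feats: torch.Tensor) -> torch.Tensor:
        # audio_feats: (batch, time, audio_dim)
        weights = torch.softmax(self.attn_score(audio_feats), dim=1)  # (B, T, 1)
        pooled = (weights * audio_feats).sum(dim=1)                   # (B, audio_dim)
        return self.proj(pooled)                                      # (B, token_dim)

feats = torch.randn(2, 50, 768)   # stand-in for audio-encoder output
token = AudioProjector()(feats)
print(token.shape)                # torch.Size([2, 768])
```

The resulting vector would be inserted where a prompt token embedding normally sits, conditioning the (frozen) diffusion model on the audio.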
No commits in the last 6 months.
Use this if you need to generate visual content from sound, such as creating album art from music, visualizing sound events for research, or producing unique imagery for marketing campaigns based on audio clips.
Not ideal if you need precise control over image details or require images that are not conceptually linked to audio.
Stars
88
Forks
6
Language
Python
License
MIT
Category
diffusion
Last pushed
Jun 18, 2024
Commits (30d)
0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/diffusion/guyyariv/AudioToken"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
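The same call can be made from Python. This sketch builds the URL from the path segments in the curl example above (category/owner/repo); the response schema is not documented here, so it only reports the HTTP status.

```python
import urllib.request

# Base path taken from the curl example above.
BASE = "https://pt-edge.onrender.com/api/v1/quality"

def quality_url(category: str, owner: str, repo: str) -> str:
    # Build the per-repository quality endpoint URL.
    return f"{BASE}/{category}/{owner}/{repo}"

url = quality_url("diffusion", "guyyariv", "AudioToken")

# Anonymous access: 100 requests/day; a free key raises this to 1,000/day.
try:
    with urllib.request.urlopen(url, timeout=10) as resp:
        print(resp.status)
except OSError as exc:
    print("request failed:", exc)
```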
Higher-rated alternatives
huggingface/diffusers
🤗 Diffusers: State-of-the-art diffusion models for image, video, and audio generation in PyTorch.
bghira/SimpleTuner
A general fine-tuning kit geared toward image/video/audio diffusion models.
mcmonkeyprojects/SwarmUI
SwarmUI (formerly StableSwarmUI), A Modular Stable Diffusion Web-User-Interface, with an...
nateraw/stable-diffusion-videos
Create 🔥 videos with Stable Diffusion by exploring the latent space and morphing between text prompts
TheDesignFounder/DreamLayer
Benchmark diffusion models faster. Automate evals, seeds, and metrics for reproducible results.