xu-shitong/diffusion-image-captioning
Implementation of the paper https://arxiv.org/abs/2210.04559
This project offers AI and machine-learning researchers a novel approach to image captioning: it takes an image as input and generates a descriptive text caption using diffusion models, which are more commonly applied to image generation. The primary audience is researchers interested in state-of-the-art text generation techniques for vision tasks.
Use this if you are experimenting with diffusion models for text generation or looking for alternatives to traditional autoregressive image-captioning models.
Not ideal if you need a production-ready captioning system, or if you are not comfortable setting up research-grade code and datasets.
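As a rough orientation before reading the notebooks, the core idea can be sketched as continuous diffusion over caption token embeddings conditioned on an image feature. This is an illustrative toy, not the repository's actual code: all names, dimensions, and the noise schedule are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

T = 1000                                # number of diffusion steps (assumed)
betas = np.linspace(1e-4, 0.02, T)      # linear noise schedule (assumed)
alpha_bar = np.cumprod(1.0 - betas)     # cumulative signal-retention product

def q_sample(x0, t, eps):
    """Closed-form forward process: x_t = sqrt(a_bar_t)*x0 + sqrt(1-a_bar_t)*eps."""
    return np.sqrt(alpha_bar[t]) * x0 + np.sqrt(1.0 - alpha_bar[t]) * eps

seq_len, embed_dim = 16, 32
x0 = rng.standard_normal((seq_len, embed_dim))   # clean caption embeddings
img_feat = rng.standard_normal(embed_dim)        # image conditioning vector
eps = rng.standard_normal(x0.shape)

x_t = q_sample(x0, t=500, eps=eps)
# A denoising network would take (x_t, t, img_feat) and predict eps
# (or x0 directly); training minimises e.g. ||eps_hat - eps||^2.
print(x_t.shape)
```

At small `t` the sample stays close to the clean embeddings; near `T` it is almost pure Gaussian noise, which is what sampling starts from at generation time.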
Stars
57
Forks
14
Language
Jupyter Notebook
License
—
Category
diffusion
Last pushed
Nov 26, 2025
Commits (30d)
0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/diffusion/xu-shitong/diffusion-image-captioning"
Open to everyone: 100 requests/day with no key. A free key raises the limit to 1,000/day.
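The same endpoint can be called from Python instead of curl. A minimal sketch, assuming the URL follows the `category/owner/repo` path shown above and the response is JSON (the response schema is not documented here):

```python
import json
import urllib.request

BASE = "https://pt-edge.onrender.com/api/v1/quality"

def quality_url(category: str, owner: str, repo: str) -> str:
    """Build the quality-endpoint URL for a repository (path layout assumed)."""
    return f"{BASE}/{category}/{owner}/{repo}"

url = quality_url("diffusion", "xu-shitong", "diffusion-image-captioning")

if __name__ == "__main__":
    # Performs a real network request; schema of the payload is an assumption.
    with urllib.request.urlopen(url) as resp:
        print(json.dumps(json.load(resp), indent=2))
```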
Higher-rated alternatives
huggingface/diffusers
🤗 Diffusers: State-of-the-art diffusion models for image, video, and audio generation in PyTorch.
bghira/SimpleTuner
A general fine-tuning kit geared toward image/video/audio diffusion models.
mcmonkeyprojects/SwarmUI
SwarmUI (formerly StableSwarmUI), A Modular Stable Diffusion Web-User-Interface, with an...
nateraw/stable-diffusion-videos
Create 🔥 videos with Stable Diffusion by exploring the latent space and morphing between text prompts
TheDesignFounder/DreamLayer
Benchmark diffusion models faster. Automate evals, seeds, and metrics for reproducible results.