NVlabs/ODISE
Official PyTorch implementation of ODISE: Open-Vocabulary Panoptic Segmentation with Text-to-Image Diffusion Models [CVPR 2023 Highlight]
This project helps researchers and engineers analyze images by automatically outlining and classifying every object and region, even for categories it hasn't seen before. You provide an image and a text description of what you want to find, and it outputs a detailed, segmented image. It's designed for computer vision scientists and AI practitioners who need to perform advanced image analysis.
934 stars. No commits in the last 6 months.
Use this if you need to precisely segment and identify objects within images using flexible text descriptions, especially for categories not present in standard training datasets.
Not ideal if you're looking for a simple, off-the-shelf image classification tool for a fixed set of categories, or if you don't have expertise in machine learning development.
Stars: 934
Forks: 56
Language: Python
License: —
Category:
Last pushed: Jul 06, 2024
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/diffusion/NVlabs/ODISE"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
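The same request can be made from a script. The following is a minimal sketch using only the Python standard library. It assumes the endpoint returns JSON and that the path follows the pattern category/owner/repo seen in the curl example; neither is confirmed by this page. The helper names quality_url and fetch_quality are hypothetical, chosen for illustration.

```python
import json
import urllib.request

# Base path copied from the curl example above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(category: str, owner: str, repo: str) -> str:
    """Build the endpoint URL.

    Assumption: the path is <category>/<owner>/<repo>, inferred from the
    single curl example on this page.
    """
    return f"{BASE_URL}/{category}/{owner}/{repo}"


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0):
    """Fetch the listing data for one repository.

    Assumption: the response body is JSON. If it is not, json.loads
    raises a ValueError and the raw body should be inspected instead.
    """
    url = quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Calling fetch_quality("diffusion", "NVlabs", "ODISE") issues the same request as the curl command. Each call counts against the daily request limit described above, so it may be worth caching the result locally.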
Higher-rated alternatives
huggingface/diffusers
🤗 Diffusers: State-of-the-art diffusion models for image, video, and audio generation in PyTorch.
bghira/SimpleTuner
A general fine-tuning kit geared toward image/video/audio diffusion models.
mcmonkeyprojects/SwarmUI
SwarmUI (formerly StableSwarmUI), A Modular Stable Diffusion Web-User-Interface, with an...
nateraw/stable-diffusion-videos
Create 🔥 videos with Stable Diffusion by exploring the latent space and morphing between text prompts
TheDesignFounder/DreamLayer
Benchmark diffusion models faster. Automate evals, seeds, and metrics for reproducible results.