linsun449/cliper.code
This repo is the official PyTorch implementation of the paper "CLIPer: Hierarchically Improving Spatial Representation of CLIP for Open-Vocabulary Semantic Segmentation".
This tool helps researchers and computer vision engineers automatically identify and outline objects in images, even for categories the model was never explicitly trained on. You provide an image, and it outputs a detailed segmentation mask that precisely delineates elements such as cars, people, or specific textures within the scene. It's designed for those working with large image datasets who need flexible, precise object recognition.
No commits in the last 6 months.
Use this if you need to precisely segment a wide variety of objects and regions within images without having to retrain your models for every new category.
Not ideal if you primarily work with a fixed, small set of object categories and require extremely fast processing for real-time applications where every millisecond counts.
Stars
40
Forks
—
Language
Python
License
—
Category
diffusion
Last pushed
Sep 10, 2025
Commits (30d)
0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/diffusion/linsun449/cliper.code"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
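For scripted access, the same endpoint can be queried from Python. This is a minimal sketch: the URL path is taken from the curl example above, but the JSON response schema is an assumption, so the payload is printed as-is rather than parsed into named fields.

```python
import json
import urllib.request

# Base path from the curl example above.
API_BASE = "https://pt-edge.onrender.com/api/v1/quality"

def quality_url(category: str, repo: str) -> str:
    """Build the API URL for a repo, e.g. category='diffusion'."""
    return f"{API_BASE}/{category}/{repo}"

def fetch_quality(category: str, repo: str) -> dict:
    """Fetch and decode the JSON payload (requires network access)."""
    with urllib.request.urlopen(quality_url(category, repo)) as resp:
        return json.load(resp)  # assumes the endpoint returns JSON

# Example usage (network required; subject to the 100 requests/day limit):
# print(json.dumps(fetch_quality("diffusion", "linsun449/cliper.code"), indent=2))
```

Note the free tier needs no API key; requests beyond 100/day require registering for a key.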
Higher-rated alternatives
NVlabs/Sana
SANA: Efficient High-Resolution Image Synthesis with Linear Diffusion Transformer
FoundationVision/VAR
[NeurIPS 2024 Best Paper Award][GPT beats diffusion🔥] [scaling laws in visual generation📈]...
nerdyrodent/VQGAN-CLIP
Just playing with getting VQGAN+CLIP running locally, rather than having to use colab.
huggingface/finetrainers
Scalable and memory-optimized training of diffusion models
AssemblyAI-Community/MinImagen
MinImagen: A minimal implementation of the Imagen text-to-image model