HaoyuanYang-2023/ImagineFSL

Official implementation of "ImagineFSL: Self-Supervised Pretraining Matters on Imagined Base Set for VLM-based Few-shot Learning" [CVPR 2025 Highlight]

17 / 100

Experimental

ImagineFSL helps computer vision researchers and practitioners efficiently train image classification models, especially when only a few real-world examples are available. It takes descriptive text (captions) and a small set of real images as input, then generates a large synthetic dataset and uses it to pre-train a vision model. The output is a robust image classification model capable of accurately categorizing images even with limited real-world data.
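As a rough illustration of why extra synthetic examples can help when real images are scarce, the sketch below pools a few real feature vectors with synthetic ones and classifies by nearest class prototype. This is not ImagineFSL's code or API: the listing describes pre-training on a synthetic set, whereas this swaps in a much simpler prototype classifier. The function names (`build_prototypes`, `classify`) and the toy 2-D features are hypothetical and stand in for embeddings from a real vision model.

```python
# Illustrative sketch only: a nearest-prototype few-shot classifier.
# It is NOT the ImagineFSL method; names and data are hypothetical.
from math import sqrt


def mean_vector(vectors):
    """Element-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def cosine(a, b):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def build_prototypes(real_feats, synthetic_feats):
    """Average real and synthetic features per class into one prototype."""
    prototypes = {}
    for label, feats in real_feats.items():
        pooled = feats + synthetic_feats.get(label, [])
        prototypes[label] = mean_vector(pooled)
    return prototypes


def classify(feature, prototypes):
    """Return the label whose prototype is most similar to the feature."""
    return max(prototypes, key=lambda label: cosine(feature, prototypes[label]))


# Toy features: one real example per class plus a few synthetic ones.
real = {"cat": [[1.0, 0.1]], "dog": [[0.1, 1.0]]}
synthetic = {"cat": [[0.9, 0.2], [0.8, 0.0]], "dog": [[0.2, 0.9]]}

prototypes = build_prototypes(real, synthetic)
prediction = classify([0.7, 0.3], prototypes)
```

In a realistic setting the feature vectors would come from a pretrained image encoder, and the synthetic features from generated images of each class; both are assumptions here rather than details taken from the project.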

No commits in the last 6 months.

Use this if you need to build accurate image classification models for new categories or tasks but have very few real training images and want to leverage large language models and image generation for data augmentation.

Not ideal if you already have extensive real-world datasets for your classification task or if your primary goal is not image classification.

computer-vision image-classification data-synthesis model-training few-shot-learning

No License · Stale 6m · No Package · No Dependents

Maintenance 2 / 25

Adoption 7 / 25

Maturity 8 / 25

Community 0 / 25


Stars

Forks

—

Language

Python

License

—

Higher-rated alternatives

Vchitect/VBench

[CVPR2024 Highlight] VBench - We Evaluate Video Generation

VectorSpaceLab/OmniGen

OmniGen: Unified Image Generation. https://arxiv.org/pdf/2409.11340

EndlessSora/focal-frequency-loss

[ICCV 2021] Focal Frequency Loss for Image Reconstruction and Synthesis

JIA-Lab-research/DreamOmni2

This project is the official implementation of 'DreamOmni2: Multimodal Instruction-based Editing...

SkyworkAI/UniPic

Open-source SOTA multi-image editing model
