HaoyuanYang-2023/ImagineFSL
Official implementation of "ImagineFSL: Self-Supervised Pretraining Matters on Imagined Base Set for VLM-based Few-shot Learning" [CVPR 2025 Highlight]
ImagineFSL helps computer vision researchers and practitioners train image classification models when only a few real-world examples are available. It takes descriptive text (captions) and a small set of real images as input, generates a large synthetic dataset, and uses that dataset to pre-train a vision model. The output is an image classification model that can categorize images accurately even with limited real-world data.
No commits in the last 6 months.
Use this if you need to build accurate image classification models for new categories or tasks but have very few real training images and want to leverage large language models and image generation for data augmentation.
Not ideal if you already have extensive real-world datasets for your classification task or if your primary goal is not image classification.
Stars: 26
Forks: —
Language: Python
License: —
Category:
Last pushed: Sep 01, 2025
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/diffusion/HaoyuanYang-2023/ImagineFSL"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
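The same request can be made from Python. The following is a minimal sketch using only the standard library, not an official client. It assumes the endpoint returns JSON, and it assumes the three trailing path segments of the URL are a category, a repository owner, and a repository name; neither assumption is confirmed by this page. The names `build_url` and `fetch_quality` are hypothetical helpers introduced for illustration.

```python
import json
import urllib.request

# Base endpoint copied from the curl command above.
API_BASE = "https://pt-edge.onrender.com/api/v1/quality"


def build_url(category: str, owner: str, repo: str) -> str:
    """Assemble the endpoint URL in the same shape as the curl example.

    Assumption: the path is <category>/<owner>/<repo>.
    """
    return f"{API_BASE}/{category}/{owner}/{repo}"


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0):
    """Send a GET request and decode the body as JSON.

    Assumption: the response body is JSON. Its fields are not
    documented on this page, so the parsed value is returned unchanged.
    """
    url = build_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Calling `fetch_quality("diffusion", "HaoyuanYang-2023", "ImagineFSL")` requests the same URL as the curl command. Keyless use is limited to 100 requests per day, so a script that polls many repositories may want to cache responses.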
Higher-rated alternatives
Vchitect/VBench
[CVPR2024 Highlight] VBench - We Evaluate Video Generation
VectorSpaceLab/OmniGen
OmniGen: Unified Image Generation. https://arxiv.org/pdf/2409.11340
EndlessSora/focal-frequency-loss
[ICCV 2021] Focal Frequency Loss for Image Reconstruction and Synthesis
JIA-Lab-research/DreamOmni2
This project is the official implementation of 'DreamOmni2: Multimodal Instruction-based Editing...
SkyworkAI/UniPic
Open-source SOTA multi-image editing model