yuhui-zh15/drml
Official Code Release for "Diagnosing and Rectifying Vision Models using Language" (ICLR 2023)
This project helps machine learning engineers and researchers understand why their computer vision models make mistakes. It takes an existing vision model and uses natural language descriptions to pinpoint specific types of images or situations where the model fails. The output identifies problematic data categories and suggests ways to fix these errors, all without needing to collect or label more visual data.
No commits in the last 6 months.
Use this if you need to quickly diagnose and understand the failure modes of your image classification models using natural language, rather than manually sifting through images.
Not ideal if your primary goal is to train a new vision model from scratch or if you don't have access to multi-modal language-vision embeddings.
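The idea described above, probing a classifier with text instead of images, can be illustrated with a toy sketch. It assumes the classifier sits on top of a shared image-text embedding space, so the embedding of a description can stand in for an image of that description. Everything here is hypothetical and invented for illustration: the two-dimensional vectors, the weights, the class names, and the function names. None of it is taken from the repository, and a real setup would get embeddings from a multi-modal encoder.

```python
from typing import List, Sequence, Tuple

# Hypothetical linear classifier head over a 2-D "embedding" space.
# Dimension 0 stands for the bird type, dimension 1 for the background.
# The large weight on dimension 1 mimics a model that leans on background.
WEIGHTS: Tuple[float, float] = (0.4, 1.0)
BIAS: float = 0.0

Probe = Tuple[str, Tuple[float, float], str]  # (description, embedding, true label)

# Made-up text embeddings for four descriptions; a real pipeline would
# obtain these from the text encoder of a vision-language model.
PROBES: List[Probe] = [
    ("a waterbird on water", (1.0, 1.0), "waterbird"),
    ("a waterbird on land", (1.0, -1.0), "waterbird"),
    ("a landbird on land", (-1.0, -1.0), "landbird"),
    ("a landbird on water", (-1.0, 1.0), "landbird"),
]


def predict(embedding: Sequence[float]) -> str:
    """Classify an embedding with the fixed linear head."""
    score = sum(w * x for w, x in zip(WEIGHTS, embedding)) + BIAS
    return "waterbird" if score > 0 else "landbird"


def find_failures(probes: Sequence[Probe]) -> List[str]:
    """Return the descriptions whose embeddings the classifier gets wrong."""
    return [text for text, emb, label in probes if predict(emb) != label]


failures = find_failures(PROBES)
print(failures)  # ['a waterbird on land', 'a landbird on water']
```

In this toy case the misclassified descriptions are exactly those where the background disagrees with the bird type, which is the kind of failure category a language-based diagnosis is meant to surface without collecting new images.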
Stars: 34
Forks: —
Language: Jupyter Notebook
License: —
Category:
Last pushed: Jun 08, 2023
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/ml-frameworks/yuhui-zh15/drml"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
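The same request can be made from Python. This is a minimal sketch using only the standard library. The URL is the one from the curl command above; the `category/owner/repo` path pattern is inferred from that single URL, and the assumption that the endpoint returns JSON is not confirmed by this page. The names `quality_url` and `fetch_quality` are hypothetical helpers.

```python
import json
import urllib.request

# Base path taken from the curl example above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(category: str, owner: str, repo: str) -> str:
    """Build the endpoint URL (path pattern inferred from the one example)."""
    return f"{BASE_URL}/{category}/{owner}/{repo}"


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0):
    """Fetch the endpoint and parse the body, assuming it is JSON."""
    url = quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


# Example call (performs a network request, so it is left commented out):
# data = fetch_quality("ml-frameworks", "yuhui-zh15", "drml")
# print(data)
```

With no key, the limit stated above is 100 requests/day, so results are worth caching rather than re-fetching on every run.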
Higher-rated alternatives
open-mmlab/mmpretrain
OpenMMLab Pre-training Toolbox and Benchmark
facebookresearch/mmf
A modular framework for vision & language multimodal research from Facebook AI Research (FAIR)
HuaizhengZhang/Awsome-Deep-Learning-for-Video-Analysis
Papers, code and datasets about deep learning and multi-modal learning for video analysis
KaiyangZhou/pytorch-vsumm-reinforce
Unsupervised video summarization with deep reinforcement learning (AAAI'18)
adambielski/siamese-triplet
Siamese and triplet networks with online pair/triplet mining in PyTorch