shashankvkt/DoRA_ICLR24

This repo contains the official implementation of the ICLR 2024 paper "Is ImageNet worth 1 video? Learning strong image encoders from 1 long unlabelled video".


Emerging

This project helps machine learning engineers and researchers train strong image encoders without massive, manually labelled image datasets such as ImageNet. It takes a long, unlabelled video as input and produces an image encoder model that can recognize objects and patterns. This is ideal for computer vision work where collecting and labelling a large image dataset is impractical.
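
As a rough illustration of what "a long, unlabelled video as input" can mean in practice, the sketch below cuts one long video into short, fixed-length clips by frame index. It is not taken from the repository: the function name `sample_clip_indices` and its parameters (`clip_len`, `frame_stride`, `clip_step`) are hypothetical, and the repo's actual data pipeline may differ.

```python
from typing import List


def sample_clip_indices(
    total_frames: int, clip_len: int, frame_stride: int, clip_step: int
) -> List[List[int]]:
    """Return frame indices for fixed-length clips cut from one long video.

    Hypothetical helper for illustration only; not part of the repository.
    """
    if min(total_frames, clip_len, frame_stride, clip_step) <= 0:
        raise ValueError("all arguments must be positive")

    # Number of consecutive video frames spanned by a single clip.
    span = (clip_len - 1) * frame_stride + 1

    clips = []
    # Slide a window over the video; stop once a full clip no longer fits.
    for start in range(0, total_frames - span + 1, clip_step):
        clips.append([start + i * frame_stride for i in range(clip_len)])
    return clips


if __name__ == "__main__":
    # A toy 20-frame "video": 4 frames per clip, every 2nd frame, new clip every 5 frames.
    for clip in sample_clip_indices(total_frames=20, clip_len=4, frame_stride=2, clip_step=5):
        print(clip)
```

Each returned list could then be decoded with any video reader and fed to a self-supervised training loop; no labels are needed at any point.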

No commits in the last 6 months.

Use this if you need to train a robust image recognition model but only have access to large amounts of unlabelled video footage, rather than pre-classified image datasets.

Not ideal if you already have a well-curated, labelled image dataset or if your primary goal is real-time object tracking rather than general image understanding.

computer-vision, machine-learning-engineering, video-analytics, unsupervised-learning, representation-learning

No License, Stale 6m, No Package, No Dependents

Maintenance 0 / 25

Adoption 9 / 25

Maturity 8 / 25

Community 14 / 25


Language: Python

License: None

Higher-rated alternatives

Jittor/jittor

Jittor is a high-performance deep learning framework based on JIT compiling and meta-operators.

berniwal/swin-transformer-pytorch

Implementation of the Swin Transformer in PyTorch.

zhanghang1989/ResNeSt

ResNeSt: Split-Attention Networks

NVlabs/FasterViT

[ICLR 2024] Official PyTorch implementation of FasterViT: Fast Vision Transformers with...

ViTAE-Transformer/ViTPose

The official repo for [NeurIPS'22] "ViTPose: Simple Vision Transformer Baselines for Human Pose...
