SMSD75/Timetuning

Time Does Tell: Self-Supervised Time-Tuning of Dense Image Representations (ICCV 2023)

Experimental

This project helps computer vision researchers and practitioners improve how well their Vision Transformers localize and segment objects in images. It fine-tunes a pre-trained Vision Transformer on short video clips of objects in motion, which improves the model's ability to identify and segment those objects in new images and videos. The input is a pre-trained Vision Transformer plus videos of the target objects; the output is a more robust Vision Transformer that produces semantically rich image patch embeddings for object segmentation.
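To make "patch embeddings for object segmentation" concrete, here is a minimal sketch of how per-patch embeddings could be turned into a coarse segmentation map by assigning each patch to its most similar class prototype. It is not taken from the repository: the helper names `cosine` and `segment_patches`, the 2-D toy embeddings, and the two prototypes are all hypothetical, and real Vision Transformer embeddings have hundreds of dimensions.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def segment_patches(patch_grid, prototypes):
    """Label every patch with the index of its most similar prototype.

    patch_grid: rows x cols grid of embedding vectors (hypothetical ViT patch outputs).
    prototypes: one reference embedding per class (hypothetical).
    """
    return [
        [max(range(len(prototypes)), key=lambda k: cosine(patch, prototypes[k]))
         for patch in row]
        for row in patch_grid
    ]

# Toy 2x3 grid of 2-D patch embeddings (illustrative values only).
patch_grid = [
    [[1.0, 0.1], [0.9, 0.2], [0.1, 1.0]],
    [[0.8, 0.0], [0.2, 0.9], [0.0, 1.0]],
]
prototypes = [[1.0, 0.0], [0.0, 1.0]]  # class 0 and class 1

mask = segment_patches(patch_grid, prototypes)
print(mask)
```

The better the embeddings separate objects from their surroundings, the cleaner a mask like this becomes, which is the property the tuned model is meant to improve.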

No commits in the last 6 months.

Use this if you need to improve the spatial understanding and segmentation capabilities of your Vision Transformer models by leveraging temporal information from videos.

Not ideal if your primary task is something other than image or video object segmentation, or if you do not have video data available for training.

object-segmentation video-analysis image-processing computer-vision machine-learning-research

Stale 6m · No Package · No Dependents

Maintenance 0 / 25

Adoption 7 / 25

Maturity 16 / 25

Community 6 / 25

Language

Python

License

—

Higher-rated alternatives

AdaptiveMotorControlLab/CEBRA

Learnable latent embeddings for joint behavioral and neural analysis - Official implementation of CEBRA

theolepage/sslsv

Toolkit for training and evaluating Self-Supervised Learning (SSL) frameworks for Speaker...

PaddlePaddle/PASSL

PASSL includes image self-supervised learning algorithms such as SimCLR, MoCo v1/v2, BYOL, CLIP, PixPro, SimSiam, SwAV, BEiT, and MAE, as well as Vision...

YGZWQZD/LAMDA-SSL

30 Semi-Supervised Learning Algorithms

ModSSC/ModSSC

ModSSC: A Modular Framework for Semi Supervised Classification
