kyegomez/VisionLLaMA
Implementation of VisionLLaMA from the paper: "VisionLLaMA: A Unified LLaMA Interface for Vision Tasks" in PyTorch and Zeta
This project helps machine learning engineers and researchers explore and implement advanced vision models. It takes image data as input and processes it through a LLaMA-like architecture to produce outputs for various computer vision tasks, such as image classification. This is primarily used by AI/ML practitioners focused on cutting-edge model development.
No commits in the last 6 months.
Use this if you are an AI/ML engineer or researcher working with PyTorch and Zeta, and you want to experiment with a unified LLaMA interface for vision tasks as described in the VisionLLaMA paper.
Not ideal if you are looking for a plug-and-play solution for common computer vision problems without deep involvement in model architecture or framework-level development.
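The paper's core idea, running LLaMA-style transformer blocks (RMSNorm, SwiGLU MLP) over flattened image patches, can be sketched in a few lines of NumPy. This is a toy single-block, single-head illustration under assumed shapes, not the repository's actual API; the paper's 2D rotary position embeddings are omitted for brevity, and all weight names here are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def patchify(img, p):
    """Split an (H, W, C) image into flattened p x p patches."""
    H, W, C = img.shape
    x = img.reshape(H // p, p, W // p, p, C)
    return x.transpose(0, 2, 1, 3, 4).reshape(-1, p * p * C)

def rms_norm(x, eps=1e-6):
    """LLaMA-style RMSNorm: scale by root-mean-square, no mean subtraction."""
    return x / np.sqrt(np.mean(x**2, axis=-1, keepdims=True) + eps)

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def silu(z):
    return z / (1.0 + np.exp(-z))

def llama_block(x, Wq, Wk, Wv, Wo, Wg, Wu, Wd):
    """Pre-norm self-attention plus SwiGLU MLP, as in a LLaMA block."""
    h = rms_norm(x)
    q, k, v = h @ Wq, h @ Wk, h @ Wv
    attn = softmax(q @ k.T / np.sqrt(q.shape[-1])) @ v
    x = x + attn @ Wo                      # residual around attention
    h = rms_norm(x)
    return x + (silu(h @ Wg) * (h @ Wu)) @ Wd  # residual around SwiGLU MLP

# Toy forward pass: a 32x32 RGB image, 8x8 patches, model dim 64.
d, p = 64, 8
img = rng.standard_normal((32, 32, 3))
tokens = patchify(img, p) @ (rng.standard_normal((p * p * 3, d)) * 0.02)
W = [rng.standard_normal((d, d)) * 0.02 for _ in range(7)]
out = llama_block(tokens, *W)
print(out.shape)  # (16, 64): one token per patch
```

The point of the sketch is structural: unlike a standard ViT, there is no LayerNorm or GELU MLP; normalization and the feed-forward path follow the LLaMA recipe.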
Stars: 16
Forks: —
Language: Python
License: MIT
Category: ml-frameworks
Last pushed: Nov 11, 2024
Commits (last 30 days): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/ml-frameworks/kyegomez/VisionLLaMA"
Open to everyone: 100 requests/day with no key. A free key raises the limit to 1,000 requests/day.
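The same record can be fetched programmatically; here is a minimal sketch using only the Python standard library. The response is assumed to be JSON, and the `Authorization: Bearer` header for keyed access is an assumption, so check the API's own documentation before relying on either.

```python
import json
import urllib.request

BASE = "https://pt-edge.onrender.com/api/v1/quality"

def quality_url(category, owner, repo):
    """Build the quality-endpoint URL for a repository."""
    return f"{BASE}/{category}/{owner}/{repo}"

def fetch_quality(category, owner, repo, api_key=None):
    """Fetch the quality record as a dict.

    Passing an API key uses a Bearer header; the header scheme is an
    assumption, not documented behavior.
    """
    req = urllib.request.Request(quality_url(category, owner, repo))
    if api_key:
        req.add_header("Authorization", f"Bearer {api_key}")  # assumed scheme
    with urllib.request.urlopen(req, timeout=10) as resp:
        return json.load(resp)

# Build the URL for this repository (matches the curl example above).
print(quality_url("ml-frameworks", "kyegomez", "VisionLLaMA"))
```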
Higher-rated alternatives
open-mmlab/mmpretrain
OpenMMLab Pre-training Toolbox and Benchmark
facebookresearch/mmf
A modular framework for vision & language multimodal research from Facebook AI Research (FAIR)
HuaizhengZhang/Awsome-Deep-Learning-for-Video-Analysis
Papers, code and datasets about deep learning and multi-modal learning for video analysis
KaiyangZhou/pytorch-vsumm-reinforce
Unsupervised video summarization with deep reinforcement learning (AAAI'18)
adambielski/siamese-triplet
Siamese and triplet networks with online pair/triplet mining in PyTorch