ParadoxZW/LLaVA-UHD-Better

A bug-free and improved implementation of LLaVA-UHD, based on the code from the official repo

Score: 31 / 100 (Emerging)

This project is aimed at AI researchers and developers working with multimodal large language models (LMMs) that process high-resolution images. It provides a bug-fixed version of the official LLaVA-UHD training code, so models trained from raw image and text data come out more robust and accurately trained, without the errors present in the original implementation. It is best suited to those actively training and refining LMMs for image-understanding tasks.

No commits in the last 6 months.

Use this if you are a researcher or developer actively training LLaVA-UHD models and need a reliable, bug-fixed implementation for better performance and accurate results.

Not ideal if you are looking for a pre-trained, ready-to-use LMM for inference without needing to engage in model training or architectural modifications.

Tags: AI research, multimodal models, large language models, image recognition, deep learning, development
Badges: Stale (6m), No Package, No Dependents
Maintenance: 0 / 25
Adoption: 7 / 25
Maturity: 16 / 25
Community: 8 / 25


Stars: 35
Forks: 3
Language: Python
License: Apache-2.0
Last pushed: Aug 12, 2024
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/transformers/ParadoxZW/LLaVA-UHD-Better"

Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
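The same endpoint can also be queried programmatically. A minimal Python sketch using only the standard library is shown below; the base URL and path segments come from the curl example above, while the helper names and the assumption that the response is JSON are illustrative, not part of any documented client.

```python
# Minimal sketch of querying the quality API from Python.
# The URL structure is taken from the curl example above; the
# JSON response shape is an assumption and may differ.
import json
from urllib.request import urlopen

API_BASE = "https://pt-edge.onrender.com/api/v1/quality"

def quality_url(registry: str, owner: str, repo: str) -> str:
    """Build the endpoint URL for a given registry and repository."""
    return f"{API_BASE}/{registry}/{owner}/{repo}"

def fetch_quality(registry: str, owner: str, repo: str) -> dict:
    """Fetch and decode one quality record. No API key is needed
    within the anonymous limit of 100 requests/day."""
    with urlopen(quality_url(registry, owner, repo)) as resp:
        return json.load(resp)
```

For example, `fetch_quality("transformers", "ParadoxZW", "LLaVA-UHD-Better")` requests the same URL as the curl command above.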