OpenGVLab/InternVL

[CVPR 2024 Oral] InternVL Family: A Pioneering Open-Source Alternative to GPT-4o. An open-source multimodal dialogue model approaching GPT-4o's performance.

Quality score: 47 / 100 (Emerging)

This project offers a family of open-source multimodal large language models (MLLMs) that can understand and respond to both images and text. You can input various types of visual content, like photos or diagrams, along with your text questions or commands, and receive detailed, intelligent textual responses. It's designed for AI developers, researchers, and engineers who build applications requiring advanced visual and linguistic comprehension.

9,879 stars. No commits in the last 6 months.

Use this if you are developing AI applications that need to process and reason about visual information and text together, and you require high-performing, open-source models comparable to leading commercial alternatives.

Not ideal if you are looking for a pre-built, consumer-ready application rather than a foundational model suite for development.

Tags: AI development, multimodal AI, large language models, computer vision, natural language processing

Stale (6 months) · No package · No dependents
Maintenance: 2 / 25
Adoption: 10 / 25
Maturity: 16 / 25
Community: 19 / 25


Stars: 9,879
Forks: 764
Language: Python
License: MIT
Last pushed: Sep 22, 2025
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/llm-tools/OpenGVLab/InternVL"

Open to everyone: 100 requests/day with no key needed. Get a free key for 1,000 requests/day.
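If you query this endpoint for several repositories, building the URL programmatically is convenient. The helper below is a hypothetical sketch: it assumes the `/api/v1/quality/llm-tools/{owner}/{repo}` pattern shown in the curl example above generalizes to other repositories, which the page does not explicitly confirm.

```python
# Hypothetical helper: builds a quality-score API URL for a given repo.
# Assumption: the /api/v1/quality/llm-tools/{owner}/{repo} URL pattern
# from the curl example above holds for repositories other than InternVL.
BASE = "https://pt-edge.onrender.com/api/v1/quality/llm-tools"


def quality_url(owner: str, repo: str) -> str:
    """Return the quality-score endpoint URL for owner/repo."""
    return f"{BASE}/{owner}/{repo}"


print(quality_url("OpenGVLab", "InternVL"))
```

You could then fetch the URL with any HTTP client (e.g. `curl` or Python's `urllib.request`), keeping in mind the 100 requests/day keyless limit noted above.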