SkyworkAI/Skywork-R1V

Skywork-R1V is an advanced multimodal AI model series developed by Skywork AI, specializing in vision-language reasoning.

Established

This project offers vision-language models that understand and reason about images together with accompanying text. You provide an image and a question or instruction, and the model returns a detailed answer or carries out a more complex task such as code execution or deep research. It suits analysts, researchers, and anyone who needs to extract insights from visual information and analyze it in depth.

3,158 stars.

Use this if you need to analyze images and text together and perform complex reasoning, code execution, or in-depth research based on visual inputs.

Not ideal if your primary need is simple image recognition or if you require an offline, on-device solution without API access.

image-analysis visual-reasoning research-automation data-extraction content-understanding

No package. No dependents.

Maintenance 6 / 25

Adoption 10 / 25

Maturity 16 / 25

Community 19 / 25

Stars: 3,158
Forks: 279
Language: Python
License: MIT

Related tools

jingyaogong/minimind-v

🚀 Train a 26M-parameter vision-language model (VLM) from scratch in just 1 hour!

roboflow/vision-ai-checkup

Take your LLM to the optometrist.

zai-org/GLM-TTS

GLM-TTS: Controllable & Emotion-Expressive Zero-shot TTS with Multi-Reward Reinforcement Learning

NExT-GPT/NExT-GPT

Code and models for ICML 2024 paper, NExT-GPT: Any-to-Any Multimodal Large Language Model

EvolvingLMMs-Lab/NEO

NEO Series: Native Vision-Language Models from First Principles
