baaivision/EVE
EVE Series: Encoder-Free Vision-Language Models from BAAI
EVE advances research in Vision-Language Models (VLMs) by exploring encoder-free designs that remove the traditional vision encoder. It accepts images and text as input and aims to produce capable multimodal models. The project is aimed at AI researchers and practitioners developing or studying next-generation vision-language AI.
368 stars. No commits in the last 6 months.
Use this if you are an AI researcher investigating novel architectures for multimodal large language models, specifically interested in encoder-free designs.
Not ideal if you are a business user looking for a ready-to-deploy tool or an application developer seeking a stable API for immediate integration into a product.
Stars: 368
Forks: 12
Language: Python
License: MIT
Category:
Last pushed: Jul 24, 2025
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/transformers/baaivision/EVE"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
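If you prefer calling the endpoint from code rather than curl, here is a minimal Python sketch using only the standard library. The JSON response shape is not documented on this page, so the record is returned as-is; `build_url` and `fetch_quality` are illustrative helper names, not part of any official client.

```python
import json
import urllib.request

# Base endpoint for the pt-edge quality API (taken from the curl example above).
API_BASE = "https://pt-edge.onrender.com/api/v1/quality"

def build_url(ecosystem: str, repo: str) -> str:
    """Build the quality-endpoint URL for one repository."""
    return f"{API_BASE}/{ecosystem}/{repo}"

def fetch_quality(ecosystem: str, repo: str) -> dict:
    """Fetch and parse one repository's quality record.

    No API key is needed for up to 100 requests/day; the response
    schema is assumed to be JSON and is returned unmodified.
    """
    with urllib.request.urlopen(build_url(ecosystem, repo), timeout=10) as resp:
        return json.load(resp)

# Example usage (performs a live network request):
#   data = fetch_quality("transformers", "baaivision/EVE")
#   print(json.dumps(data, indent=2))
```

For higher request volumes, obtain a free key as noted above; how the key is passed (header or query parameter) is not specified on this page.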
Higher-rated alternatives
KimMeen/Time-LLM
[ICLR 2024] Official implementation of "🦙 Time-LLM: Time Series Forecasting by Reprogramming...
om-ai-lab/VLM-R1
Solve Visual Understanding with Reinforced VLMs
bytedance/SALMONN
SALMONN family: A suite of advanced multi-modal LLMs
NVlabs/OmniVinci
OmniVinci is an omni-modal LLM for joint understanding of vision, audio, and language.
fixie-ai/ultravox
A fast multimodal LLM for real-time voice