worldbench/DriveBench
[ICCV 2025] Are VLMs Ready for Autonomous Driving? An Empirical Study from the Reliability, Data, and Metric Perspectives
DriveBench is a benchmark dataset for evaluating how well Vision-Language Models (VLMs) understand complex driving scenarios. It pairs images with text-based questions about driving situations; a VLM's answers to these questions reveal whether the model truly understands the visual context, especially under challenging conditions. Autonomous driving researchers and engineers can use it to rigorously test and improve the reliability of their VLM-powered systems.
Use this if you are developing or evaluating AI systems for autonomous vehicles and need to test how reliably Vision-Language Models interpret driving scenes under various conditions, including degraded visual input.
Not ideal if you are looking for a dataset to train general-purpose Vision-Language Models outside of the autonomous driving domain.
Stars: 232
Forks: 15
Language: Python
License: Apache-2.0
Category:
Last pushed: Dec 12, 2025
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/llm-tools/worldbench/DriveBench"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
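For programmatic use, the endpoint above can be queried from Python instead of curl. A minimal sketch is below; note that the response field names (`stars`, `forks`, `license`) and the Bearer-token auth scheme for keyed access are assumptions, not documented API behavior.

```python
import json
import urllib.request
from typing import Optional

API_URL = "https://pt-edge.onrender.com/api/v1/quality/llm-tools/worldbench/DriveBench"


def fetch_repo_quality(url: str = API_URL, api_key: Optional[str] = None) -> dict:
    """Fetch quality data for a repository as parsed JSON.

    Without a key the API allows 100 requests/day; a free key raises
    the limit to 1,000/day. The auth header below is an assumption.
    """
    headers = {"Accept": "application/json"}
    if api_key:
        headers["Authorization"] = f"Bearer {api_key}"  # assumed auth scheme
    req = urllib.request.Request(url, headers=headers)
    with urllib.request.urlopen(req, timeout=10) as resp:
        return json.load(resp)


def summarize(payload: dict) -> str:
    """One-line summary from an assumed response shape."""
    stars = payload.get("stars", "?")
    forks = payload.get("forks", "?")
    license_name = payload.get("license", "unknown license")
    return f"{stars} stars, {forks} forks ({license_name})"
```

A quick local check with a sample payload matching the stats shown on this page: `summarize({"stars": 232, "forks": 15, "license": "Apache-2.0"})` yields `"232 stars, 15 forks (Apache-2.0)"`.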
Higher-rated alternatives
- chrisliu298/awesome-llm-unlearning — A resource repository for machine unlearning in large language models
- worldbench/awesome-vla-for-ad — 🌐 Vision-Language-Action Models for Autonomous Driving: Past, Present, and Future
- hijkzzz/Awesome-LLM-Strawberry — A collection of LLM papers, blogs, and projects, with a focus on OpenAI o1 🍓 and reasoning techniques
- zjukg/KG-MM-Survey — Knowledge Graphs Meet Multi-Modal Learning: A Comprehensive Survey
- worldbench/awesome-spatial-intelligence — 🌐 Forging Spatial Intelligence: A Roadmap of Multi-Modal Data Pre-Training for Autonomous Systems