EvolvingLMMs-Lab/lmms-eval

One-for-All Multimodal Evaluation Toolkit Across Text, Image, Video, and Audio Tasks

Quality score: 78 / 100 · Verified

This tool helps researchers and AI practitioners reliably compare how well different multimodal AI models understand and respond to various types of real-world information. You provide an AI model and a set of diverse tasks involving text, images, video, and audio, and it outputs consistent, trustworthy performance metrics. Anyone who builds, deploys, or studies large multimodal models will find this useful for understanding model capabilities.

3,883 stars. Used by 1 other package. Actively maintained with 25 commits in the last 30 days. Available on PyPI.

Use this if you need to rigorously and reproducibly evaluate the performance of multimodal AI models across a wide range of tasks involving different data types.

Not ideal if you are looking for a simple, single-metric benchmark for a single data type or if you are not working with advanced AI models.
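
To make that workflow concrete, here is a minimal sketch of launching one evaluation run, assuming lmms-eval has been installed from PyPI (`pip install lmms-eval`). The model name, checkpoint, task, and flag spellings below are illustrative assumptions that follow lm-evaluation-harness conventions, which this toolkit mirrors; check `python -m lmms_eval --help` for the options your installed version actually accepts.

```python
import subprocess

# Minimal sketch: launch one evaluation run through the lmms-eval CLI.
# Model name, checkpoint, task, and flag spellings are assumptions based on
# lm-evaluation-harness conventions; verify with `python -m lmms_eval --help`.
cmd = [
    "python", "-m", "lmms_eval",
    "--model", "llava",                                      # assumed model wrapper name
    "--model_args", "pretrained=liuhaotian/llava-v1.5-7b",   # assumed checkpoint identifier
    "--tasks", "mme",                                        # assumed task name
    "--batch_size", "1",
    "--output_path", "./logs",                               # where metrics and logs are written
]
subprocess.run(cmd, check=True)  # raises CalledProcessError if the run fails
```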

Tags: AI model evaluation · multimodal AI · machine learning research · AI development · model benchmarking
Score breakdown:
Maintenance: 20 / 25
Adoption: 11 / 25
Maturity: 25 / 25
Community: 22 / 25


Stars: 3,883
Forks: 539
Language: Python
License: (none listed)
Last pushed: Mar 11, 2026
Commits (30d): 25
Dependencies: 52
Reverse dependents: 1

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/llm-tools/EvolvingLMMs-Lab/lmms-eval"

Open to everyone: 100 requests per day with no key required. A free API key raises the limit to 1,000 requests per day.
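
For programmatic use, the same record can be fetched from Python with only the standard library. This is a sketch: the endpoint returns JSON, but the response schema is not documented in this listing, so anything beyond dumping the raw payload (such as the "stars" field below) is an assumption.

```python
import json
from urllib.request import urlopen

URL = (
    "https://pt-edge.onrender.com/api/v1/quality/"
    "llm-tools/EvolvingLMMs-Lab/lmms-eval"
)

# Fetch the quality record; the endpoint returns JSON.
with urlopen(URL, timeout=30) as resp:
    record = json.load(resp)

# Inspect the full payload first; specific field names (e.g. "stars")
# are assumptions about the schema, so use .get() to avoid a KeyError.
print(json.dumps(record, indent=2))
print("stars:", record.get("stars"))
```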