princeton-nlp/CharXiv

[NeurIPS 2024] CharXiv: Charting Gaps in Realistic Chart Understanding in Multimodal LLMs

Score: 41 / 100 (Emerging)

This project provides an evaluation suite for researchers and developers working with Multimodal Large Language Models (MLLMs). It assesses how well an MLLM can understand and answer questions about charts drawn from scientific papers: given the benchmark's charts and question sets, it scores a model's answers, revealing strengths and weaknesses in chart comprehension.

142 stars. No commits in the last 6 months.

Use this if you are a researcher or developer who wants to rigorously benchmark and improve the chart understanding capabilities of your Multimodal Large Language Models (MLLMs).

Not ideal if you are looking for a tool to generate charts, interpret chart data for business insights, or simply extract raw data points from images.

Multimodal AI · LLM evaluation · Chart analysis · Scientific document processing · Model benchmarking
Stale (6m) · No Package · No Dependents
Maintenance 2 / 25
Adoption 10 / 25
Maturity 16 / 25
Community 13 / 25

How are scores calculated?
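The four subscores above are each out of 25 and, as displayed on this page, add up to the overall score. A minimal sketch of that arithmetic, assuming the overall score is a plain sum of the subscores (the exact weighting is not documented here):

```python
# Subscores as shown on this page; each category is scored out of 25.
# Assumption: the overall score is their unweighted sum.
subscores = {"Maintenance": 2, "Adoption": 10, "Maturity": 16, "Community": 13}

overall = sum(subscores.values())
print(f"{overall} / 100")  # 41 / 100
```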

Stars: 142
Forks: 15
Language: Python
License: Apache-2.0
Last pushed: Apr 22, 2025
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/princeton-nlp/CharXiv"

Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
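For scripted use, the same endpoint can be called from Python. This is a sketch, assuming the path follows `quality/{owner}/{repo}` as in the curl example; the `score` and `breakdown` response fields below are hypothetical and only mirror the numbers shown on this page, since the API's actual response schema is not documented here:

```python
from urllib.parse import quote

# Base endpoint taken from the curl example above. The quality/{owner}/{repo}
# path shape is an assumption inferred from that example.
BASE = "https://pt-edge.onrender.com/api/v1/quality"

def quality_url(owner: str, repo: str) -> str:
    """Build the quality-score URL for a GitHub repository."""
    return f"{BASE}/{quote(owner)}/{quote(repo)}"

def summarize(payload: dict) -> str:
    """Render an (assumed) JSON payload into a one-line summary."""
    parts = ", ".join(f"{k} {v}/25" for k, v in payload["breakdown"].items())
    return f"{payload['score']}/100 ({parts})"

# Hand-written payload mirroring this page's numbers (field names are assumptions):
sample = {
    "score": 41,
    "breakdown": {"Maintenance": 2, "Adoption": 10, "Maturity": 16, "Community": 13},
}
print(quality_url("princeton-nlp", "CharXiv"))
print(summarize(sample))  # 41/100 (Maintenance 2/25, Adoption 10/25, Maturity 16/25, Community 13/25)
```

In real use you would fetch the URL (e.g. with `urllib.request.urlopen` or `requests`) and pass the decoded JSON to a summarizer like this one, adjusting the field names to whatever the API actually returns.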