princeton-nlp/CharXiv

[NeurIPS 2024] CharXiv: Charting Gaps in Realistic Chart Understanding in Multimodal LLMs

Score: 41 / 100 (Emerging)

This project provides an evaluation suite for researchers and developers working with Multimodal Large Language Models (MLLMs). It assesses how well an MLLM can understand and answer questions about charts drawn from scientific papers: given the benchmark's charts and question sets, it scores a model's answers, revealing strengths and weaknesses in chart comprehension.

142 stars. No commits in the last 6 months.

Use this if you are a researcher or developer who wants to rigorously benchmark and improve the chart understanding capabilities of your Multimodal Large Language Models (MLLMs).

Not ideal if you are looking for a tool to generate charts, interpret chart data for business insights, or simply extract raw data points from images.

Multimodal AI · LLM evaluation · Chart analysis · Scientific document processing · Model benchmarking
Stale (6m) · No Package · No Dependents
Maintenance 2 / 25
Adoption 10 / 25
Maturity 16 / 25
Community 13 / 25

How are scores calculated?
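The four subscores above are each out of 25 and, as displayed on this page, add up to the overall score. A minimal sketch of that arithmetic, assuming the overall score is a plain sum of the subscores (the exact weighting is not documented here):

```python
# Subscores as shown on this page; each category is scored out of 25.
# Assumption: the overall score is their unweighted sum.
subscores = {"Maintenance": 2, "Adoption": 10, "Maturity": 16, "Community": 13}

overall = sum(subscores.values())
print(f"{overall} / 100")  # 41 / 100
```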

Stars: 142
Forks: 15
Language: Python
License: Apache-2.0
Last pushed: Apr 22, 2025
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/princeton-nlp/CharXiv"

Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
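For scripted use, the same endpoint can be called from Python. This is a sketch, assuming the path follows `quality/{owner}/{repo}` as in the curl example; the `score` and `breakdown` response fields below are hypothetical and only mirror the numbers shown on this page, since the API's actual response schema is not documented here:

```python
from urllib.parse import quote

# Base endpoint taken from the curl example above. The quality/{owner}/{repo}
# path shape is an assumption inferred from that example.
BASE = "https://pt-edge.onrender.com/api/v1/quality"

def quality_url(owner: str, repo: str) -> str:
    """Build the quality-score URL for a GitHub repository."""
    return f"{BASE}/{quote(owner)}/{quote(repo)}"

def summarize(payload: dict) -> str:
    """Render an (assumed) JSON payload into a one-line summary."""
    parts = ", ".join(f"{k} {v}/25" for k, v in payload["breakdown"].items())
    return f"{payload['score']}/100 ({parts})"

# Hand-written payload mirroring this page's numbers (field names are assumptions):
sample = {
    "score": 41,
    "breakdown": {"Maintenance": 2, "Adoption": 10, "Maturity": 16, "Community": 13},
}
print(quality_url("princeton-nlp", "CharXiv"))
print(summarize(sample))  # 41/100 (Maintenance 2/25, Adoption 10/25, Maturity 16/25, Community 13/25)
```

In real use you would fetch the URL (e.g. with `urllib.request.urlopen` or `requests`) and pass the decoded JSON to a summarizer like this one, adjusting the field names to whatever the API actually returns.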