chuanyangjin/MMToM-QA
[🏆Outstanding Paper Award at ACL 2024] MMToM-QA: Multimodal Theory of Mind Question Answering
This project helps researchers evaluate an AI model's ability to reason about human mental states, a capacity known as Theory of Mind. It pairs videos and text descriptions of everyday household scenarios with questions that test whether a model can correctly infer an agent's goals and beliefs from its actions. Researchers in AI or cognitive science who focus on machine intelligence and human-computer interaction would use it to benchmark advanced multimodal models.
Use this if you are an AI researcher or cognitive scientist developing or testing AI models that need to understand and predict human intentions and beliefs in complex, real-world interactions.
Not ideal if you need a pre-trained, ready-to-deploy AI for practical applications like customer service bots or predictive analytics, as this is a research benchmark.
Stars: 154
Forks: 19
Language: Python
License: MIT
Category:
Last pushed: Jan 02, 2026
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/transformers/chuanyangjin/MMToM-QA"
Open to everyone: 100 requests/day with no key required; a free key raises the limit to 1,000/day.
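For scripted use, the same endpoint can be queried from Python. A minimal sketch, assuming only the URL shown in the curl command above; the response schema is not documented here, so the code prints the raw JSON rather than assuming any fields, and the helper names are illustrative:

```python
# Minimal sketch: fetch repo-quality JSON from the pt-edge API shown above.
# The endpoint URL comes from the curl example; the JSON schema is undocumented
# here, so we decode and print the body without assuming specific fields.
import json
import urllib.request

API_BASE = "https://pt-edge.onrender.com/api/v1/quality/transformers"


def quality_url(repo: str) -> str:
    """Build the endpoint URL for an 'owner/name' repo slug."""
    return f"{API_BASE}/{repo}"


def fetch_quality(repo: str) -> dict:
    """GET the endpoint and decode the JSON body (100 requests/day keyless)."""
    with urllib.request.urlopen(quality_url(repo)) as resp:
        return json.load(resp)


if __name__ == "__main__":
    print(json.dumps(fetch_quality("chuanyangjin/MMToM-QA"), indent=2))
```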
Higher-rated alternatives
kyegomez/RT-X
Pytorch implementation of the models RT-1-X and RT-2-X from the paper: "Open X-Embodiment:...
kyegomez/PALI3
Implementation of PALI3 from the paper "PaLI-3 Vision Language Models: Smaller, Faster, Stronger"
lyuchenyang/Macaw-LLM
Macaw-LLM: Multi-Modal Language Modeling with Image, Video, Audio, and Text Integration
Muennighoff/vilio
🥶Vilio: State-of-the-art VL models in PyTorch & PaddlePaddle
kyegomez/PALM-E
Implementation of "PaLM-E: An Embodied Multimodal Language Model"