JingbiaoMei/ATM-Bench

ATM-Bench: A benchmark for long-term personalized memory QA spanning ~4 years of multimodal data (images, videos, emails). Features referential queries, evidence-grounded answering, and multi-source reasoning. Paper: "According to Me: Long-Term Personalized Referential Memory QA"

Experimental

This project is a benchmark for evaluating how well AI systems recall specific, personalized details from a person's digital history. It combines personal images, videos, and emails spanning roughly four years, and poses complex questions that require the system to find and connect information across these sources. It is designed for researchers and developers working on AI agents that need a long-term, multimodal understanding of an individual's past interactions and experiences.
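As a rough illustration of what evidence-grounded answering over mixed sources can look like, the sketch below models a memory item, a question with gold evidence, and a simple precision/recall check on the evidence a system retrieves. Every name here (`MemoryItem`, `QAExample`, `evidence_scores`, the field names, and the sample data) is hypothetical and chosen for illustration; none of it is taken from the ATM-Bench code or data format, and the benchmark's actual metrics may differ.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MemoryItem:
    # Hypothetical record for one piece of personal data.
    item_id: str
    modality: str   # "image", "video", or "email"
    timestamp: str  # ISO 8601 date string
    text: str       # caption, transcript, or email body


@dataclass(frozen=True)
class QAExample:
    # Hypothetical question with its gold answer and supporting evidence.
    question: str
    answer: str
    evidence_ids: frozenset


def evidence_scores(predicted_ids, gold_ids):
    """Return (precision, recall) of retrieved evidence ids against gold ids."""
    predicted, gold = set(predicted_ids), set(gold_ids)
    hits = len(predicted & gold)
    precision = hits / len(predicted) if predicted else 0.0
    recall = hits / len(gold) if gold else 0.0
    return precision, recall


# Made-up sample data: the answer needs both an email and an image.
memories = [
    MemoryItem("email_01", "email", "2022-03-14", "Booking confirmed: Lisbon hotel"),
    MemoryItem("img_07", "image", "2022-03-20", "Photo of a tram in Lisbon"),
    MemoryItem("vid_02", "video", "2023-08-02", "Clip from a birthday dinner"),
]
example = QAExample(
    question="Which city did I visit right after booking that hotel?",
    answer="Lisbon",
    evidence_ids=frozenset({"email_01", "img_07"}),
)

# A system that retrieved one correct and one irrelevant item.
precision, recall = evidence_scores(["email_01", "vid_02"], example.evidence_ids)
print(f"evidence precision={precision:.2f}, recall={recall:.2f}")
```

Scoring evidence separately from the final answer is one way to distinguish a system that answers correctly from the right memories from one that guesses.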

Use this if you are developing or evaluating AI systems that need to answer personalized questions based on a large, diverse collection of a user's past memories and digital content.

Not ideal if your AI application doesn't require multimodal data, long-term memory recall, or referential questioning based on personal history.

AI-memory-research personal-assistant-development multimodal-AI knowledge-retrieval personalized-AI

No Package, No Dependents

Maintenance 10 / 25

Adoption 5 / 25

Maturity 11 / 25

Community 0 / 25

Language: Python

License: MIT

Featured in

Agent Memory in 2026: What Actually Works for Persistent AI

We Audited crewAI's AI Dependencies: Here's What the Data Says

Higher-rated alternatives

MemoriLabs/Memori

SQL Native Memory Layer for LLMs, AI Agents & Multi-Agent Systems

volcengine/OpenViking

OpenViking is an open-source context database designed specifically for AI Agents(such as...

mem0ai/mem0

Universal memory layer for AI Agents

zjunlp/LightMem

[ICLR 2026] LightMem: Lightweight and Efficient Memory-Augmented Generation

MemTensor/MemOS

AI memory OS for LLM and Agent systems(moltbot,clawdbot,openclaw), enabling persistent Skill...
