agentic-learning-ai-lab/lifelong-memory
Code for LifelongMemory: Leveraging LLMs for Answering Queries in Long-form Egocentric Videos
This tool helps you quickly find specific moments and answer questions about actions captured in long, first-person video recordings. You provide egocentric video footage (such as from a head-mounted or body-worn camera) and questions in natural language, and it returns precise answers or timestamps for the relevant events. It's designed for researchers and analysts who need to review extensive first-person video data efficiently.
Use this if you need to extract specific information or answer questions from many hours of first-person video content.
Not ideal if your videos are not egocentric (first-person perspective) or if you need to process short-form, general video content.
Stars: 28
Forks: —
Language: Python
License: MIT
Category: —
Last pushed: Oct 27, 2025
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/transformers/agentic-learning-ai-lab/lifelong-memory"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
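If you'd rather call the endpoint from a script than the command line, a minimal Python sketch using only the standard library might look like the following. The response schema isn't documented on this page, so the code prints the raw JSON rather than assuming any particular field names.

```python
import json
import urllib.request

# Endpoint from the curl example above; no API key is needed
# for up to 100 requests/day.
URL = (
    "https://pt-edge.onrender.com/api/v1/quality/"
    "transformers/agentic-learning-ai-lab/lifelong-memory"
)

with urllib.request.urlopen(URL, timeout=10) as resp:
    data = json.load(resp)

# The response fields are undocumented here, so dump the full
# payload instead of assuming a specific key exists.
print(json.dumps(data, indent=2))
```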
Higher-rated alternatives
KimMeen/Time-LLM
[ICLR 2024] Official implementation of "🦙 Time-LLM: Time Series Forecasting by Reprogramming...
om-ai-lab/VLM-R1
Solve Visual Understanding with Reinforced VLMs
bytedance/SALMONN
SALMONN family: A suite of advanced multi-modal LLMs
NVlabs/OmniVinci
OmniVinci is an omni-modal LLM for joint understanding of vision, audio, and language.
fixie-ai/ultravox
A fast multimodal LLM for real-time voice