westlake-repl/MicroLens
A Large Short-video Recommendation Dataset with Raw Text/Audio/Image/Videos (Talk Invited by DeepMind).
This project provides a large collection of short-video data: raw video, audio, text captions, and user interaction records such as views and likes. It is designed for researchers and data scientists who build or evaluate recommendation systems, especially for short-form video platforms. You use the dataset to train and test recommendation algorithms and to measure how well your models predict user engagement.
257 stars. No commits in the last 6 months.
Use this if you are a researcher or data scientist focused on developing, benchmarking, or improving recommendation algorithms for short-video platforms and need a large, multimodal dataset with real user interaction data.
Not ideal if you are looking for a pre-built, production-ready recommendation system or if your primary interest is not in content-driven short-video recommendations.
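To make the train-and-test workflow described above concrete, here is a minimal sketch of a leave-one-out evaluation with a popularity baseline and Hit Rate@K. Everything in it is an assumption for illustration: the toy `interactions` list, the `(user, item, timestamp)` layout, and the helper names are hypothetical and do not reflect MicroLens's actual file format or the authors' benchmark code.

```python
from collections import Counter, defaultdict

# Toy (user, item, timestamp) interactions. Hypothetical data, not the
# real MicroLens schema; swap in your own loader for the actual dataset.
interactions = [
    ("u1", "v1", 1), ("u1", "v2", 2), ("u1", "v3", 3),
    ("u2", "v3", 1), ("u2", "v1", 2),
    ("u3", "v1", 1), ("u3", "v3", 2), ("u3", "v2", 3),
    ("u4", "v3", 1), ("u4", "v2", 2),
]


def leave_one_out(rows):
    """Hold out each user's most recent interaction as the test item."""
    by_user = defaultdict(list)
    for user, item, ts in rows:
        by_user[user].append((ts, item))
    train, test = [], {}
    for user, events in by_user.items():
        events.sort()  # oldest first
        *history, (_, last_item) = events
        train.extend((user, item) for _, item in history)
        test[user] = last_item
    return train, test


def popularity_top_k(train, user, k):
    """Recommend the k most popular training items the user has not seen."""
    counts = Counter(item for _, item in train)
    seen = {item for u, item in train if u == user}
    ranked = [item for item, _ in counts.most_common() if item not in seen]
    return ranked[:k]


def hit_rate_at_k(train, test, k):
    """Fraction of users whose held-out item appears in their top-k list."""
    hits = sum(test[u] in popularity_top_k(train, u, k) for u in test)
    return hits / len(test)


train, test = leave_one_out(interactions)
print(hit_rate_at_k(train, test, k=1))
print(hit_rate_at_k(train, test, k=2))
```

A popularity baseline ignores content entirely, so it serves only as a floor against which a content-driven model (one that uses the video, audio, or text features) can be compared.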
Stars: 257
Forks: 19
Language: Python
License: none listed
Category:
Last pushed: Jan 27, 2025
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/llm-tools/westlake-repl/MicroLens"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000/day.
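The same request can be made from Python. This is a minimal sketch using only the standard library. The endpoint is the one shown in the curl command, but the helper names `build_url` and `fetch_quality` are hypothetical, and the assumption that the response body is JSON is not confirmed by this page. How an API key is passed is also not documented here, so the sketch covers only the keyless call.

```python
import json
import urllib.request
from urllib.parse import quote

# Base path taken from the curl example above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality/llm-tools"


def build_url(owner, repo):
    """Build the per-repository endpoint URL, escaping each path segment."""
    return f"{BASE_URL}/{quote(owner, safe='')}/{quote(repo, safe='')}"


def fetch_quality(owner, repo, timeout=10.0):
    """Fetch and decode the response (assumes a JSON body; unverified)."""
    request = urllib.request.Request(
        build_url(owner, repo),
        headers={"Accept": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Example call (performs a network request, counts against the daily quota):
# data = fetch_quality("westlake-repl", "MicroLens")
```

Because the keyless tier allows 100 requests/day, it is worth caching the result locally rather than calling the endpoint in a loop.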
Higher-rated alternatives
chrisliu298/awesome-llm-unlearning
A resource repository for machine unlearning in large language models
worldbench/awesome-vla-for-ad
🌐 Vision-Language-Action Models for Autonomous Driving: Past, Present, and Future
hijkzzz/Awesome-LLM-Strawberry
A collection of LLM papers, blogs, and projects, with a focus on OpenAI o1 🍓 and reasoning techniques.
zjukg/KG-MM-Survey
Knowledge Graphs Meet Multi-Modal Learning: A Comprehensive Survey
worldbench/awesome-spatial-intelligence
🌐 Forging Spatial Intelligence: A Roadmap of Multi-Modal Data Pre-Training for Autonomous Systems