westlake-repl/MicroLens

A Large Short-video Recommendation Dataset with Raw Text/Audio/Image/Videos (Talk Invited by DeepMind).

31 / 100 (Emerging)

This project provides a large collection of short-video data, including raw video, audio, text captions, and user interaction details such as views and likes. It is designed for researchers and data scientists who build or evaluate recommendation systems, especially for platforms featuring short-form video content. You train and test recommendation algorithms on the dataset, then measure how well your models predict user engagement.

257 stars. No commits in the last 6 months.

Use this if you are a researcher or data scientist focused on developing, benchmarking, or improving recommendation algorithms for short-video platforms and need a large, multimodal dataset with real user interaction data.

Not ideal if you are looking for a pre-built, production-ready recommendation system, or if your work does not center on content-driven short-video recommendation.

short-video-recommendation recommender-systems-research user-behavior-analysis multimedia-content-analysis large-scale-datasets
No License, Stale 6m, No Package, No Dependents
Maintenance 0 / 25
Adoption 10 / 25
Maturity 8 / 25
Community 13 / 25


Stars: 257
Forks: 19
Language: Python
License: None
Last pushed: Jan 27, 2025
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/llm-tools/westlake-repl/MicroLens"

Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000/day.
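
The same request can be made from Python. The following is a minimal sketch using only the standard library. It assumes the endpoint returns a JSON body, which this page does not confirm, and the helper names `quality_url` and `fetch_quality` are hypothetical, introduced here for illustration only.

```python
import json
import urllib.request

# Base path taken from the curl example above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality/llm-tools"


def quality_url(owner: str, repo: str) -> str:
    """Build the endpoint URL for a repository (hypothetical helper)."""
    return f"{BASE_URL}/{owner}/{repo}"


def fetch_quality(owner: str, repo: str, timeout: float = 10.0):
    """Fetch the record for a repository without an API key.

    Assumption: the response body is JSON. If it is not,
    json.loads raises a ValueError.
    """
    with urllib.request.urlopen(quality_url(owner, repo), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Calling `fetch_quality("westlake-repl", "MicroLens")` issues the same GET request as the curl command. Keyless requests count against the 100 requests/day limit, so it is worth caching the result rather than re-fetching it.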