benedekrozemberczki/datasets

A repository of pretty cool datasets that I collected for network science and machine learning research.

Score: 51 / 100 (Established)

This collection provides social network datasets derived from platforms such as Twitch, LastFM, Deezer, and GitHub. The data includes user connections and, for some datasets, user attributes. You can use it to analyze social structures and to predict user attributes and behaviors such as language, gender, or churn. It is aimed at data scientists, machine learning researchers, and social scientists working with graph-based analysis.

651 stars.

Use this if you need pre-collected, real-world social network graphs for tasks like predicting user characteristics, understanding community structures, or evaluating graph-based machine learning models.
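
As a minimal sketch of the kind of graph analysis described above, the following builds an undirected adjacency map from a CSV edge list and counts each node's degree, using only the Python standard library. The two-column layout, the `id_1`/`id_2` header, and the toy data are assumptions made for illustration; the actual files in the repository may be organized differently.

```python
import csv
import io
from collections import defaultdict

# Hypothetical toy edge list. Real files in the repository may use
# different column names and layouts.
TOY_EDGES_CSV = """id_1,id_2
0,1
0,2
1,2
2,3
"""


def load_edges(handle):
    """Read (source, target) integer pairs from a CSV with a header row."""
    reader = csv.reader(handle)
    next(reader)  # skip the header row
    return [(int(row[0]), int(row[1])) for row in reader if row]


def build_adjacency(edges):
    """Build an undirected adjacency map: node -> set of neighbours."""
    adjacency = defaultdict(set)
    for u, v in edges:
        adjacency[u].add(v)
        adjacency[v].add(u)
    return adjacency


edges = load_edges(io.StringIO(TOY_EDGES_CSV))
adjacency = build_adjacency(edges)

# Degree = number of distinct neighbours of each node.
degrees = {node: len(neighbours) for node, neighbours in adjacency.items()}
print(degrees)
```

To try this on a downloaded file, replace `io.StringIO(TOY_EDGES_CSV)` with an open file handle, adjusting the column handling to match that file.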

Not ideal if you need continuously updated, real-time data or highly specialized datasets not focused on social network interactions.

Tags: social-network-analysis, user-behavior-prediction, community-detection, graph-modeling, data-science-research
No package, no dependents
Maintenance 6 / 25
Adoption 10 / 25
Maturity 16 / 25
Community 19 / 25

Stars: 651
Forks: 83
Language: (none listed)
License: MIT
Last pushed: Dec 20, 2025
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/ml-frameworks/benedekrozemberczki/datasets"

Open to everyone: 100 requests/day with no key needed, or 1,000 requests/day with a free key.
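
The same request can be sketched in Python with the standard library. The helper names `quality_url` and `fetch_quality` are hypothetical, and the response format is not documented here, so this sketch returns the raw response body as text instead of assuming a particular schema.

```python
from urllib.parse import quote
from urllib.request import urlopen

# Base path taken from the curl command above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(owner, repo, category="ml-frameworks"):
    """Build the endpoint URL for one repository (hypothetical helper)."""
    parts = [quote(part, safe="") for part in (category, owner, repo)]
    return BASE_URL + "/" + "/".join(parts)


def fetch_quality(owner, repo, category="ml-frameworks", timeout=10):
    """Fetch the endpoint and return the response body as text.

    The response schema is an unknown here, so nothing is parsed.
    """
    with urlopen(quality_url(owner, repo, category), timeout=timeout) as resp:
        return resp.read().decode("utf-8")


url = quality_url("benedekrozemberczki", "datasets")
print(url)
# To perform the request (counts against the daily limit):
# body = fetch_quality("benedekrozemberczki", "datasets")
```

The network call is left commented out so that running the snippet does not consume any of the unauthenticated daily request allowance.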