Ahren09/SciEvo

A longitudinal dataset of academic literature, including papers, metadata, and citation graphs. Also available on 🤗 HuggingFace and Kaggle.

Quality score: 29 / 100 (Experimental)

This dataset helps researchers analyze the evolution of scientific knowledge over 30 years. It contains more than two million academic papers with their titles, abstracts, publication dates, authors, and detailed citation networks. Researchers in fields such as scientometrics and library science can use it to study long-term trends, citation practices, and knowledge exchange across disciplines.

No commits in the last 6 months.

Use this if you need a comprehensive, pre-processed dataset of academic literature from arXiv, complete with rich metadata and citation graphs, to study trends in research fields.

Not ideal if you only need data from a very specific, niche academic journal not covered by arXiv, or if you require real-time updates on newly published papers.

scientometrics academic-research citation-analysis research-trends knowledge-evolution
Stale 6m, No Package, No Dependents
Maintenance 2 / 25
Adoption 6 / 25
Maturity 16 / 25
Community 5 / 25
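
The four subscores above add up to the headline score of 29 / 100. The following is a minimal sketch of that arithmetic; it assumes the total is a plain sum of the four components, which the listed numbers support but this page does not state outright. The variable names are illustrative only.

```python
# Subscores as listed above: (points earned, points available).
subscores = {
    "Maintenance": (2, 25),
    "Adoption": (6, 25),
    "Maturity": (16, 25),
    "Community": (5, 25),
}

# Assumption: the headline score is the unweighted sum of the components.
total = sum(earned for earned, _ in subscores.values())
maximum = sum(available for _, available in subscores.values())

print(f"{total} / {maximum}")  # prints "29 / 100"
```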


Stars: 17
Forks: 1
Language: Jupyter Notebook
License: Apache-2.0
Last pushed: Sep 06, 2025
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/embeddings/Ahren09/SciEvo"

Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
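
The same request can be sketched in Python using only the standard library. The endpoint prefix is copied from the curl command above. The helper names `build_quality_url` and `fetch_quality` are hypothetical, and whether the same prefix works for other repositories is an assumption. The response format is not documented on this page, so the sketch returns the raw body as text rather than parsing it.

```python
from urllib.parse import quote
from urllib.request import urlopen

# Endpoint prefix copied from the curl command above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality/embeddings"


def build_quality_url(owner: str, repo: str) -> str:
    """Return the quality endpoint URL for an owner/repo pair (hypothetical helper)."""
    # Percent-encode each path segment so unusual characters cannot break the URL.
    return f"{BASE_URL}/{quote(owner, safe='')}/{quote(repo, safe='')}"


def fetch_quality(owner: str, repo: str, timeout: float = 10.0) -> str:
    """Fetch the raw response body as text; the format is not documented here."""
    with urlopen(build_quality_url(owner, repo), timeout=timeout) as response:
        return response.read().decode("utf-8")
```

Calling `fetch_quality("Ahren09", "SciEvo")` issues the same GET request as the curl command. With a limit of 100 keyless requests per day, caching the result locally is likely worthwhile when polling more than a handful of repositories.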