Guang000/Awesome-Dataset-Distillation

A curated list of awesome papers on dataset distillation and related applications.

Score: 67/100 (Established)

This project compiles a detailed list of research papers on dataset distillation: a technique for synthesizing a much smaller dataset on which models can be trained to perform almost as well as if they had been trained on the original, much larger dataset. The primary users are machine learning researchers and practitioners who work with large datasets and need to reduce their size for efficiency or for related applications such as continual learning and data privacy.

1,909 stars. Actively maintained with 60 commits in the last 30 days.

Use this if you are a machine learning researcher or practitioner looking for comprehensive information and the latest advancements in the field of dataset distillation.

Not ideal if you are looking for a ready-to-use tool or code to perform dataset distillation without delving into the research papers.

Topics: machine-learning-research, data-miniaturization, model-training-efficiency, continual-learning, data-privacy
No package · No dependents
Maintenance: 22/25
Adoption: 10/25
Maturity: 16/25
Community: 19/25


Stars: 1,909
Forks: 170
Language: HTML
License: MIT
Last pushed: Mar 09, 2026
Commits (30d): 60

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/ml-frameworks/Guang000/Awesome-Dataset-Distillation"

Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
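The curl command above can also be reproduced programmatically. Below is a minimal Python sketch that builds the endpoint URL for an arbitrary repository and fetches the response; it assumes the endpoint shape shown in the curl example and that the response is JSON (the field names are not documented on this page).

```python
import json
import urllib.request

# Base URL taken from the curl example on this page.
BASE = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(category: str, owner: str, repo: str) -> str:
    """Construct the quality endpoint URL for a repository.

    The category/owner/repo path shape is inferred from the curl example.
    """
    return f"{BASE}/{category}/{owner}/{repo}"


def fetch_quality(category: str, owner: str, repo: str) -> dict:
    """Fetch the quality data and parse it as JSON.

    Assumes a JSON response body; field names are not documented here.
    """
    with urllib.request.urlopen(quality_url(category, owner, repo)) as resp:
        return json.load(resp)


if __name__ == "__main__":
    # Print the URL for the repository this page describes.
    print(quality_url("ml-frameworks", "Guang000", "Awesome-Dataset-Distillation"))
```

Note that unauthenticated callers are limited to 100 requests per day, so a client making repeated lookups should cache responses or supply an API key.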