visual-layer/fastdup

fastdup is a powerful, free tool designed to rapidly generate valuable insights from image and video datasets. It helps enhance the quality of both images and labels, while significantly reducing data operation costs, all with unmatched scalability.

51 / 100 Established

This tool helps data professionals analyze large collections of images and videos and improve dataset quality. Given a dataset, it identifies duplicates, outliers, mislabeled items, and low-quality files. Machine learning engineers, data scientists, and anyone managing large visual datasets can use the output to clean and refine their data for better model performance or content management.

1,834 stars.

Use this if you need to rapidly detect and address quality issues like duplicates, mislabels, or low-quality content within massive image and video datasets.

Not ideal if you are working with small datasets or primarily text-based data, as its main strength is large-scale visual data analysis.

data-quality visual-data-management machine-learning-engineering computer-vision dataset-curation
No Package, No Dependents
Maintenance 10 / 25
Adoption 10 / 25
Maturity 16 / 25
Community 15 / 25
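
The four category scores above are each out of 25, and together they add up to the overall 51 / 100. Here is a minimal Python check of that arithmetic, assuming the overall score is the plain unweighted sum of the four categories (the page does not state the formula):

```python
# Category scores as listed on this page, each out of 25.
category_scores = {
    "Maintenance": 10,
    "Adoption": 10,
    "Maturity": 16,
    "Community": 15,
}

# Assumption: the overall score is the unweighted sum of the categories.
overall = sum(category_scores.values())
max_overall = 25 * len(category_scores)

print(f"{overall} / {max_overall}")  # prints "51 / 100"
```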


Stars: 1,834
Forks: 87
Language: Python
License: (not listed)
Last pushed: Feb 18, 2026
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/ml-frameworks/visual-layer/fastdup"

Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000/day.
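
The same request can be made from Python using only the standard library. This is a sketch under two assumptions that the page does not confirm: that the URL path follows a category/owner/repo pattern (inferred from the single curl example), and that the response body is JSON. The helper names `quality_url` and `fetch_quality` are hypothetical.

```python
import json
import urllib.request

# Base path taken from the curl example on this page.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(category: str, owner: str, repo: str) -> str:
    """Build the endpoint URL (assumed pattern: category/owner/repo)."""
    return f"{BASE_URL}/{category}/{owner}/{repo}"


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0):
    """GET the endpoint and decode the body as JSON (assumed response format)."""
    url = quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Calling `fetch_quality("ml-frameworks", "visual-layer", "fastdup")` requests the same URL as the curl command. With no key, each call counts against the 100 requests/day limit, so caching the result locally is worth considering.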