visual-layer/fastdup
fastdup is a free tool for quickly extracting insights from image and video datasets. It is designed to improve the quality of both images and labels, reduce data operation costs, and scale to very large datasets.
Given a dataset of images and videos, fastdup identifies duplicates, outliers, mislabeled items, and low-quality files. Machine learning engineers, data scientists, and anyone managing large visual datasets can use the output to clean and refine their data for better model performance or content management.
1,834 stars.
Use this if you need to rapidly detect and address quality issues like duplicates, mislabels, or low-quality content within massive image and video datasets.
Not ideal if you are working with small datasets or primarily text-based data, as its main strength is large-scale visual data analysis.
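To make one of the issue types above concrete, here is a minimal sketch of exact-duplicate detection. It does not use fastdup and makes no claim about how fastdup works internally: it only groups byte-identical files by their SHA-256 digest, so resized or re-encoded copies will not be caught. The function name `find_exact_duplicates` is hypothetical.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def find_exact_duplicates(root):
    """Return groups of files under `root` whose bytes are identical.

    Illustrative only: this is a plain content-hash comparison,
    not fastdup's API or algorithm.
    """
    by_digest = defaultdict(list)
    # Sort paths so the order inside each group is deterministic.
    for path in sorted(Path(root).rglob("*")):
        if path.is_file():
            digest = hashlib.sha256(path.read_bytes()).hexdigest()
            by_digest[digest].append(path)
    # Keep only digests shared by more than one file.
    return [paths for paths in by_digest.values() if len(paths) > 1]
```

Each returned group is a list of `Path` objects with identical content. Reading whole files into memory is acceptable for a sketch but would need chunked hashing for large video files.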
Stars: 1,834
Forks: 87
Language: Python
License: —
Category: (none given)
Last pushed: Feb 18, 2026
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/ml-frameworks/visual-layer/fastdup"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
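The same request can be made from Python. The sketch below uses only the standard library and assumes the endpoint returns a JSON body, which this page does not state. The function name `fetch_quality_record` and its `opener` parameter are illustrative and not part of any published client.

```python
import json
import urllib.request

API_URL = (
    "https://pt-edge.onrender.com/api/v1/quality/"
    "ml-frameworks/visual-layer/fastdup"
)


def fetch_quality_record(url=API_URL, opener=urllib.request.urlopen, timeout=10):
    """Fetch the endpoint and decode the body as JSON.

    Assumption: the response is UTF-8 encoded JSON. `opener` can be
    swapped out so the function is testable without network access.
    """
    request = urllib.request.Request(url, headers={"Accept": "application/json"})
    with opener(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `fetch_quality_record()` with no arguments performs the real request and counts against the daily limit. The fields of the returned object are not documented here, so inspect the result before relying on any particular key.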
Related frameworks
Cloud-CV/EvalAI
Evaluating state of the art in AI
fireindark707/Python-Schema-Matching
A Python tool that uses XGBoost and sentence-transformers to perform schema matching on tables.
graphbookai/graphbook
Visual AI development framework for training and inference of ML models, scaling pipelines, and...
github/CodeSearchNet
Datasets, tools, and benchmarks for representation learning of code.
tthtlc/awesome-source-analysis
Source code understanding via Machine Learning techniques