visual-layer/fastdup

fastdup is a powerful, free tool designed to rapidly generate valuable insights from image and video datasets. It helps enhance the quality of both images and labels, while significantly reducing data operation costs, all with unmatched scalability.

/ 100

Established

This tool helps data professionals quickly analyze large collections of images and videos to improve their quality. It takes your dataset of images and videos and identifies duplicates, outliers, mislabeled items, and low-quality files. The output helps machine learning engineers, data scientists, and anyone managing large visual datasets to clean and refine their data for better model performance or content management.

1,834 stars.

Use this if you need to rapidly detect and address quality issues like duplicates, mislabels, or low-quality content within massive image and video datasets.

Not ideal if you are working with small datasets or primarily text-based data, as its main strength is large-scale visual data analysis.

data-quality visual-data-management machine-learning-engineering computer-vision dataset-curation

No Package No Dependents

Maintenance 10 / 25

Adoption 10 / 25

Maturity 16 / 25

Community 15 / 25

How are scores calculated?

Stars

1,834

Forks

Language

Python

License

—

Related frameworks

Cloud-CV/EvalAI

:cloud: :rocket: :bar_chart: :chart_with_upwards_trend: Evaluating state of the art in AI

fireindark707/Python-Schema-Matching

A python tool using XGboost and sentence-transformers to perform schema matching task on tables.

graphbookai/graphbook

Visual AI development framework for training and inference of ML models, scaling pipelines, and...

github/CodeSearchNet

Datasets, tools, and benchmarks for representation learning of code.

tthtlc/awesome-source-analysis

Source code understanding via Machine Learning techniques

Explore ML Frameworks

All categories Trending ML Framework directory Insights