snorkel-team/snorkel
A system for quickly generating training data with weak supervision
Snorkel helps machine learning practitioners create labeled training data efficiently, a step that is often the bottleneck in developing AI applications. You write labeling functions, small programs that take raw, unlabeled data as input and assign labels to it, and Snorkel uses their output to produce training data with predicted labels. It suits data scientists and ML engineers who need large training datasets quickly, without manual annotation.
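The labeling-function idea described above can be sketched in plain Python. This is a minimal illustration, not Snorkel's own API: the function names, the spam/ham task, and the sample messages are all hypothetical, and a simple majority vote stands in for whatever method Snorkel uses to combine labeling-function outputs.

```python
from collections import Counter

# Hypothetical label values for an illustrative spam-detection task.
ABSTAIN = -1
HAM = 0
SPAM = 1


def lf_contains_link(text: str) -> int:
    """Vote SPAM when the message contains a URL, otherwise abstain."""
    return SPAM if "http" in text.lower() else ABSTAIN


def lf_short_message(text: str) -> int:
    """Vote HAM for very short messages, otherwise abstain."""
    return HAM if len(text.split()) < 5 else ABSTAIN


def lf_mentions_prize(text: str) -> int:
    """Vote SPAM when the message mentions a prize, otherwise abstain."""
    return SPAM if "prize" in text.lower() else ABSTAIN


LABELING_FUNCTIONS = [lf_contains_link, lf_short_message, lf_mentions_prize]


def majority_vote(text: str) -> int:
    """Combine labeling-function votes; abstain on no votes or a tie."""
    votes = [lf(text) for lf in LABELING_FUNCTIONS]
    votes = [v for v in votes if v != ABSTAIN]
    if not votes:
        return ABSTAIN
    counts = Counter(votes).most_common()
    if len(counts) > 1 and counts[0][1] == counts[1][1]:
        return ABSTAIN
    return counts[0][0]


unlabeled = [
    "Claim your prize now at http://example.com",
    "See you at lunch",
    "The quarterly report is attached for your review today",
]
labels = [majority_vote(text) for text in unlabeled]
print(labels)
```

Each function encodes one heuristic and may abstain, so no single rule has to cover every example. Messages where every function abstains stay unlabeled rather than receiving a guess.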
5,940 stars. Used by 1 other package. No commits in the last 6 months. Available on PyPI.
Use this if you need to rapidly label a large dataset for machine learning model training, especially when manual labeling is too time-consuming or expensive.
Not ideal if you have a small dataset that can be easily labeled manually or if you are looking for an end-to-end ML platform that includes model development and deployment.
Stars: 5,940
Forks: 855
Language: Python
License: Apache-2.0
Category:
Last pushed: May 02, 2024
Commits (30d): 0
Dependencies: 10
Reverse dependents: 1
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/ml-frameworks/snorkel-team/snorkel"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
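The same request can be made from Python with only the standard library. This is a sketch under stated assumptions: the category/owner/repo path layout is inferred from the single URL shown above, the JSON response format is assumed rather than documented here, and the helper names build_quality_url and fetch_quality are hypothetical.

```python
import json
from typing import Any
from urllib.parse import quote
from urllib.request import urlopen

# Base path taken from the curl example; everything after it is assumed
# to follow a <category>/<owner>/<repo> pattern.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def build_quality_url(category: str, owner: str, repo: str) -> str:
    """Build the request URL, percent-encoding each path segment."""
    parts = [quote(part, safe="") for part in (category, owner, repo)]
    return f"{BASE_URL}/{'/'.join(parts)}"


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0) -> Any:
    """Fetch and decode the response, assuming the body is JSON."""
    url = build_quality_url(category, owner, repo)
    with urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


url = build_quality_url("ml-frameworks", "snorkel-team", "snorkel")
print(url)
# To perform the request (needs network access):
# data = fetch_quality("ml-frameworks", "snorkel-team", "snorkel")
```

With the keyless limit of 100 requests/day, it is worth caching responses locally rather than calling fetch_quality on every run.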
Related frameworks
cvat-ai/cvat
Annotate better with CVAT, the industry-leading data engine for machine learning. Used and...
HumanSignal/label-studio
Label Studio is a multi-type data labeling and annotation tool with standardized output format
wkentaro/labelme
Image annotation with Python. Supports polygon, rectangle, circle, line, point, and AI-assisted...
CVHub520/X-AnyLabeling
Effortless data labeling with AI support from Segment Anything and other awesome models.
doccano/doccano
Open source annotation tool for machine learning practitioners.