code-kern-ai/refinery
The data scientist's open-source choice to scale, assess and maintain natural language data. Treat training data like a software artifact.
This tool helps data scientists prepare, manage, and improve the quality of text-based training data for Natural Language Processing (NLP) models. You feed in raw, unstructured text (such as customer feedback or articles) and get back clean, structured, labeled datasets. It is aimed at people building and refining NLP models who need their training data to be high quality and consistently maintained.
1,470 stars. No commits in the last 6 months.
Use this if you need to efficiently label, assess, and maintain natural language training data to build or improve your NLP models, especially if your current data is unstructured or its quality is uncertain.
Not ideal if your project doesn't involve natural language processing or if you already have perfectly clean, perfectly labeled datasets that require no further management.
Stars: 1,470
Forks: 74
Language: Python
License: Apache-2.0
Category:
Last pushed: Dec 09, 2024
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/nlp/code-kern-ai/refinery"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
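The same request can be made from Python. This is a minimal sketch using only the standard library. It assumes the endpoint returns JSON and that the path follows the pattern category/owner/repo seen in the curl command above; neither is confirmed by this page, and the function names build_url and fetch_quality are hypothetical helpers, not part of any published client.

```python
import json
import urllib.parse
import urllib.request

# Base path taken from the curl command above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def build_url(category, owner, repo):
    """Build the request URL.

    Assumption: the three path segments are category, owner, and repo,
    inferred from the single example URL on this page.
    """
    parts = [urllib.parse.quote(p, safe="") for p in (category, owner, repo)]
    return BASE_URL + "/" + "/".join(parts)


def fetch_quality(category, owner, repo, timeout=10.0):
    """Fetch the record for one repository.

    Assumption: the response body is JSON. Its fields are not documented
    here, so the parsed value is returned as-is.
    """
    request = urllib.request.Request(
        build_url(category, owner, repo),
        headers={"Accept": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

Calling fetch_quality("nlp", "code-kern-ai", "refinery") performs a live network request, so each call counts toward the 100 requests/day keyless limit.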
Related tools
nus-cs3244-ml-singapore-7/sg-parliament-hansard-nlp-demo
Singapore Hansard NLP Demo
jaychampaneri14/ai-essay-grader
Automated essay scoring with BERT and linguistic features
ininando/AI-Answer-Evaluation-System
Evaluate and grade student answers in text and audio formats using advanced NLP for meaningful...