chakki-works/chariot

Deliver the ready-to-train data to your NLP model.

Score: 46 / 100 (Emerging)

This tool helps data scientists and machine learning engineers prepare text data for training natural language processing (NLP) models. You provide raw text data, and it outputs cleaned, tokenized, and formatted numerical representations ready for model consumption. It's designed for individuals building and training NLP models who need to efficiently transform messy text into structured input.
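The raw-text-to-numbers flow described above can be sketched in plain Python. This is a minimal illustration of the general technique (clean, tokenize, build a vocabulary, encode and pad), not chariot's own API; the function names `clean`, `tokenize`, `build_vocab`, and `encode`, the `<pad>`/`<unk>` tokens, and the sample sentences are all assumptions made for the example.

```python
import re
from collections import Counter

# Hypothetical special tokens; real pipelines may name or order these differently.
PAD, UNK = "<pad>", "<unk>"


def clean(text):
    """Lowercase, remove URLs, and collapse runs of whitespace."""
    text = re.sub(r"https?://\S+", " ", text.lower())
    return re.sub(r"\s+", " ", text).strip()


def tokenize(text):
    """Split cleaned text into word tokens, dropping punctuation."""
    return re.findall(r"[a-z0-9']+", text)


def build_vocab(token_lists, min_freq=1):
    """Map each token to an integer id, most frequent first (ties alphabetical)."""
    counts = Counter(tok for toks in token_lists for tok in toks)
    vocab = {PAD: 0, UNK: 1}
    for tok, freq in sorted(counts.items(), key=lambda kv: (-kv[1], kv[0])):
        if freq >= min_freq:
            vocab[tok] = len(vocab)
    return vocab


def encode(tokens, vocab, length):
    """Convert tokens to ids, truncating or right-padding to a fixed length."""
    ids = [vocab.get(t, vocab[UNK]) for t in tokens][:length]
    return ids + [vocab[PAD]] * (length - len(ids))


raw = ["Check  https://example.com NOW!", "Now check the data."]
tokens = [tokenize(clean(t)) for t in raw]
vocab = build_vocab(tokens)
batch = [encode(t, vocab, length=5) for t in tokens]
print(batch)
```

Each row of `batch` has the same length, so the result can be handed to a model as a rectangular array. Unknown tokens seen later map to the `<unk>` id rather than raising an error.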

122 stars. No commits in the last 6 months. Available on PyPI.

Use this if you are an NLP practitioner or data scientist who regularly prepares diverse text datasets for machine learning models and needs a streamlined, repeatable workflow.

Not ideal if you are looking for a pre-trained NLP model or a no-code solution for text analysis, as this tool focuses on the data preparation pipeline.

Tags: natural-language-processing, machine-learning-engineering, text-preprocessing, data-preparation, model-training
Status: Stale (6 months), No Dependents
Maintenance: 0 / 25
Adoption: 10 / 25
Maturity: 25 / 25
Community: 11 / 25


Stars: 122
Forks: 9
Language: Jupyter Notebook
License: Apache-2.0
Last pushed: Jul 15, 2022
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/nlp/chakki-works/chariot"

Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
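The same request can be made from Python with only the standard library. This is a rough sketch under two assumptions that the page does not confirm: that the URL path follows a category/owner/repo pattern (inferred from the single curl example above), and that the endpoint returns JSON. The helper names `quality_url` and `fetch_quality` are hypothetical.

```python
import json
import urllib.request

# Base path taken from the curl example; the path pattern beyond it is an assumption.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(category, owner, repo):
    """Build the endpoint URL for one repository."""
    return f"{BASE_URL}/{category}/{owner}/{repo}"


def fetch_quality(category, owner, repo, timeout=10):
    """Fetch the quality record, assuming the response body is JSON."""
    url = quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Calling `fetch_quality("nlp", "chakki-works", "chariot")` issues the same GET request as the curl command and counts against the daily request limit.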