chakki-works/chariot

Deliver the ready-to-train data to your NLP model.

/ 100

Emerging

This tool helps data scientists and machine learning engineers prepare text data for training natural language processing (NLP) models. Given raw text, it outputs cleaned, tokenized, and numerically encoded data ready to feed to a model. It is aimed at people who build and train NLP models and need an efficient way to turn messy text into structured input.
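The raw-text-to-numbers transformation described above can be sketched in plain Python. This is a minimal illustration using only the standard library, not chariot's own API. The function names (`clean`, `tokenize`, `build_vocab`, `encode`), the `<pad>`/`<unk>` tokens, and the sample sentences are all assumptions chosen for the example.

```python
import re
import unicodedata
from collections import Counter

# Hypothetical special tokens for padding and out-of-vocabulary words.
PAD, UNK = "<pad>", "<unk>"


def clean(text):
    """Normalize Unicode, lowercase, and collapse runs of whitespace."""
    text = unicodedata.normalize("NFKC", text).lower()
    return re.sub(r"\s+", " ", text).strip()


def tokenize(text):
    """Split into word tokens and individual punctuation marks."""
    return re.findall(r"\w+|[^\w\s]", text)


def build_vocab(token_lists, min_count=1):
    """Map each token to an integer id, most frequent tokens first."""
    counts = Counter(tok for toks in token_lists for tok in toks)
    vocab = {PAD: 0, UNK: 1}
    # Sort by descending count, then alphabetically, so ids are deterministic.
    for tok, n in sorted(counts.items(), key=lambda kv: (-kv[1], kv[0])):
        if n >= min_count:
            vocab[tok] = len(vocab)
    return vocab


def encode(tokens, vocab, length):
    """Convert tokens to ids, then truncate or pad to a fixed length."""
    ids = [vocab.get(t, vocab[UNK]) for t in tokens][:length]
    return ids + [vocab[PAD]] * (length - len(ids))


raw = ["Hello,   World!", "Hello again."]
token_lists = [tokenize(clean(t)) for t in raw]
vocab = build_vocab(token_lists)
batch = [encode(toks, vocab, length=5) for toks in token_lists]
```

Each row of `batch` is a fixed-length list of integer ids, which is the kind of structured input a model can consume. Keeping the steps as separate functions makes the pipeline repeatable: the vocabulary built on training data can be reused to encode validation or test data.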

122 stars. No commits in the last 6 months. Available on PyPI.

Use this if you are an NLP practitioner or data scientist who regularly prepares diverse text datasets for machine learning models and needs a streamlined, repeatable workflow.

Not ideal if you are looking for a pre-trained NLP model or a no-code solution for text analysis, as this tool focuses on the data preparation pipeline.

natural-language-processing machine-learning-engineering text-preprocessing data-preparation model-training

Stale (6 months), no dependents

Maintenance 0 / 25

Adoption 10 / 25

Maturity 25 / 25

Community 11 / 25


Stars: 122

Forks

Language: Jupyter Notebook

License: Apache-2.0

Higher-rated alternatives

natasha/ipymarkup

NER, syntax markup visualizations

neomatrix369/nlp_profiler

A simple NLP library allows profiling datasets with one or more text columns. When given a...

thepushkarp/nalcos

Search Git commits in natural language

lyeoni/nlp-tutorial

A list of NLP (Natural Language Processing) tutorials

NirantK/NLP_Quickbook

NLP in Python with Deep Learning
