Ankur3107/nlp_preprocessing

Text preprocessing package that includes cleaning, tokenization, dataset preparation, etc.

30 / 100 (Emerging)

This tool helps data scientists and NLP practitioners prepare raw text for analysis. It takes unstructured text data, cleans it by removing noise and standardizing formats, and then structures it into datasets ready for machine learning models. The output is cleaned text, tokenized sequences, and processed datasets suitable for training.
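The clean, tokenize, and structure flow described above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: it does not use or reproduce the package's own API, and the names `clean_text`, `tokenize`, and `build_dataset` are hypothetical.

```python
import re
from typing import List, Sequence, Tuple


def clean_text(text: str) -> str:
    """Lowercase the text, drop URLs and punctuation, collapse whitespace."""
    text = text.lower()
    text = re.sub(r"https?://\S+", " ", text)  # remove URLs
    text = re.sub(r"[^a-z0-9\s]", " ", text)   # remove punctuation and symbols
    return re.sub(r"\s+", " ", text).strip()   # normalize whitespace


def tokenize(text: str) -> List[str]:
    """Split cleaned text into whitespace-delimited tokens."""
    return clean_text(text).split()


def build_dataset(
    texts: Sequence[str], labels: Sequence[int]
) -> List[Tuple[List[str], int]]:
    """Pair each tokenized text with its label (hypothetical dataset shape)."""
    return [(tokenize(t), y) for t, y in zip(texts, labels)]
```

A real pipeline would typically add steps such as stop-word handling or vocabulary building; the regular expressions here are deliberately simple and ASCII-only.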

No commits in the last 6 months.

Use this if you are a data scientist or NLP engineer who needs to clean and prepare text data for machine learning algorithms or further linguistic analysis.

Not ideal if you are looking for a complete end-to-end machine learning solution or require advanced, domain-specific NLP models out-of-the-box.

text-analysis data-preparation natural-language-processing machine-learning-engineering data-cleaning
No License, Stale 6m, No Package, No Dependents
Maintenance 0 / 25
Adoption 6 / 25
Maturity 8 / 25
Community 16 / 25


Stars: 18
Forks: 7
Language: JavaScript
License: None
Last pushed: Aug 16, 2020
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/nlp/Ankur3107/nlp_preprocessing"

Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
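The same request can be made from Python using only the standard library. This is a minimal sketch, not an official client: it assumes the endpoint returns JSON and that the path segments follow a category/owner/repo pattern, as the curl example suggests. The helper names `quality_url` and `fetch_quality` are hypothetical.

```python
import json
import urllib.request
from urllib.parse import quote

# Base path taken from the curl example above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(category: str, owner: str, repo: str) -> str:
    """Build the request URL (assumed pattern: category/owner/repo)."""
    return f"{BASE_URL}/{quote(category)}/{quote(owner)}/{quote(repo)}"


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0):
    """Fetch and decode the response, assuming the body is JSON."""
    url = quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.load(resp)


# Example call (performs a network request):
# data = fetch_quality("nlp", "Ankur3107", "nlp_preprocessing")
# print(json.dumps(data, indent=2))
```

Because the response schema is not documented here, the sketch returns the decoded payload as-is rather than reading specific fields.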