Shubha23/Text-processing-NLP

This notebook contains an entire text preprocessing pipeline for NLP problems. The ready-to-use functions require the NLTK and scikit-learn packages to be installed. It also contains some prominent text classification models.

Emerging

This project helps data scientists and NLP practitioners quickly prepare text data for analysis. It takes raw text datasets, like those from customer feedback or social media, and transforms them through a series of preprocessing steps. The output is clean, structured text ready for building machine learning models for tasks such as classification.
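The kind of preprocessing described above can be sketched roughly as follows. This is a minimal illustration, not the notebook's own code: the function name `preprocess`, the `STOP_WORDS` set, and the sample reviews are all hypothetical. It uses only the Python standard library so it runs anywhere; the notebook itself relies on NLTK, whose stop-word corpus would normally replace the tiny hand-written set here.

```python
import re
import string

# Hypothetical, deliberately tiny stop-word set for illustration only.
# NLTK's stopwords corpus would normally supply a fuller list.
STOP_WORDS = {"a", "an", "and", "the", "is", "it", "this", "to", "of", "was", "i"}

# Matches http(s) URLs and bare www. links.
URL_RE = re.compile(r"https?://\S+|www\.\S+")

# Translation table that deletes every ASCII punctuation character.
PUNCT_TABLE = str.maketrans("", "", string.punctuation)


def preprocess(text: str) -> list:
    """Lowercase, strip URLs and punctuation, tokenize on whitespace, drop stop words."""
    text = text.lower()
    text = URL_RE.sub(" ", text)
    text = text.translate(PUNCT_TABLE)
    tokens = text.split()
    return [tok for tok in tokens if tok not in STOP_WORDS]


# Made-up sample inputs in the style of customer feedback.
reviews = [
    "This phone is GREAT!!! See https://example.com for details.",
    "The battery died, and support was slow to respond...",
]
cleaned = [preprocess(r) for r in reviews]

for tokens in cleaned:
    print(tokens)
```

Each review comes out as a list of lowercase tokens, which could then be passed to a vectorizer and a classifier. Steps such as stemming or lemmatization are left out of this sketch and would be added with NLTK in a fuller pipeline.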

Use this if you need a pre-built, standardized pipeline to clean and prepare text data for NLP applications without writing all the boilerplate code from scratch.

Not ideal if your dataset is purely numerical, or if you are working on non-text-based classification, regression, or clustering problems.

text-preprocessing NLP-pipeline data-cleaning text-classification machine-learning-preparation

No License No Package No Dependents

Maintenance 6 / 25

Adoption 6 / 25

Maturity 8 / 25

Community 17 / 25

Language

Jupyter Notebook

License

—

Higher-rated alternatives

chartbeat-labs/textacy

NLP, before and after spaCy

nltk/nltk_data

NLTK Data

brightertiger/pygarble

Python Package to detect garbled, gibberish text for EN

jfilter/clean-text

🧹 Python package for text cleaning

prasanthg3/cleantext

An open-source package for python to clean raw text data
