Shubha23/Text-processing-NLP
This notebook contains an entire text preprocessing pipeline for NLP problems. The ready-to-use functions require the NLTK and scikit-learn packages to be installed. It also contains some prominent text classification models.
This project helps data scientists and NLP practitioners quickly prepare text data for analysis. It takes raw text datasets, like those from customer feedback or social media, and transforms them through a series of preprocessing steps. The output is clean, structured text ready for building machine learning models for tasks such as classification.
Use this if you need a pre-built, standardized pipeline to clean and prepare text data for NLP applications without writing all the boilerplate code from scratch.
Not ideal if your dataset is purely numerical, or if you are working on non-text-based classification, regression, or clustering problems.
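To make the "series of preprocessing steps" concrete, here is a minimal sketch of what such a cleaning function might look like. It is not taken from the notebook: the function name `preprocess`, the chosen steps (lowercasing, URL removal, punctuation stripping, stopword filtering), and the tiny hard-coded stopword set are all assumptions made for illustration. The notebook itself relies on NLTK, which ships a much larger stopword list; the standard library is used here only so the sketch runs without extra installs.

```python
import re
import string

# Hypothetical, deliberately small stopword set for illustration only.
# The notebook depends on NLTK, whose stopwords corpus is far larger.
STOPWORDS = {"a", "an", "the", "is", "it", "and", "or", "to", "of",
             "in", "at", "this", "was"}

# Matches http(s) URLs and bare www. links.
URL_RE = re.compile(r"https?://\S+|www\.\S+")

# Maps every ASCII punctuation character to a space.
PUNCT_TABLE = str.maketrans(string.punctuation, " " * len(string.punctuation))


def preprocess(text: str) -> list[str]:
    """Turn raw text into a list of cleaned tokens."""
    text = text.lower()                 # normalize case
    text = URL_RE.sub(" ", text)        # drop URLs before punctuation is stripped
    text = text.translate(PUNCT_TABLE)  # replace punctuation with spaces
    tokens = text.split()               # whitespace tokenization
    # Remove stopwords and purely numeric tokens.
    return [t for t in tokens if t not in STOPWORDS and not t.isdigit()]


raw = "This phone is GREAT!!! Details at https://example.com and it was cheap."
print(preprocess(raw))
```

The token lists produced this way could then be joined back into strings and passed to a vectorizer before training a classifier. URLs are removed before punctuation so that a link does not break apart into stray tokens such as "https" and "com".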
Stars: 15
Forks: 8
Language: Jupyter Notebook
License: —
Category:
Last pushed: Dec 20, 2025
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/nlp/Shubha23/Text-processing-NLP"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
Higher-rated alternatives
chartbeat-labs/textacy
NLP, before and after spaCy
nltk/nltk_data
NLTK Data
brightertiger/pygarble
Python Package to detect garbled, gibberish text for EN
jfilter/clean-text
🧹 Python package for text cleaning
prasanthg3/cleantext
An open-source package for python to clean raw text data