jfilter/clean-text
🧹 Python package for text cleaning
This tool cleans up messy user-generated content, such as social media posts or scraped web data. It takes raw, potentially garbled input with stray characters, broken formatting, and unwanted elements, and returns normalized, readable text. It suits data analysts, researchers, and content managers who need consistent text for further analysis or presentation.
1,004 stars.
Use this if you need to reliably clean and standardize unstructured text data from various online sources before analysis or processing.
Not ideal if your text data is already perfectly clean or if you need highly specialized, domain-specific linguistic parsing beyond general normalization.
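The kind of general normalization described above can be sketched with the Python standard library alone. This is an illustration of the idea, not the clean-text package's own code or API: the function name `normalize_text`, the `<URL>` placeholder, and the choice and order of steps are all assumptions made for this example.

```python
import re
import unicodedata

# Hypothetical helper for illustration only; not part of the clean-text package.
URL_RE = re.compile(r"https?://\S+|www\.\S+")
WHITESPACE_RE = re.compile(r"\s+")


def normalize_text(text: str, url_token: str = "<URL>") -> str:
    """Apply a few generic cleanup steps to raw, messy text."""
    # NFKC folds compatibility characters (full-width letters, ligatures,
    # non-breaking spaces) into their plain equivalents.
    text = unicodedata.normalize("NFKC", text)
    # Drop control and format characters (e.g. zero-width spaces),
    # but keep ordinary newlines, tabs, and spaces for the next steps.
    text = "".join(
        ch for ch in text
        if ch in "\n\t " or unicodedata.category(ch) not in ("Cc", "Cf")
    )
    # Lowercase before inserting the placeholder so the token keeps its case.
    text = text.lower()
    # Replace URLs with a placeholder token.
    text = URL_RE.sub(url_token, text)
    # Collapse runs of whitespace into single spaces and trim the ends.
    return WHITESPACE_RE.sub(" ", text).strip()
```

Calling `normalize_text` on a scraped string folds odd Unicode forms, removes invisible characters, masks links, and leaves single-spaced lowercase text. A dedicated package would normally cover many more cases (emails, phone numbers, currency symbols, and so on) than this sketch does.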
Stars: 1,004
Forks: 81
Language: Python
License: not listed
Category: not listed
Last pushed: Jan 28, 2026
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/nlp/jfilter/clean-text"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000/day.
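The same request can be made from Python with only the standard library. The sketch below assumes the URL path follows the pattern category/owner/repo, inferred from the single curl example above, and that the response body is UTF-8 text. The function names `build_quality_url` and `fetch_quality` are hypothetical, and the response format is not documented here, so the body is returned as a raw string.

```python
import urllib.request
from urllib.parse import quote

# Base path taken from the curl example; the category/owner/repo layout
# is an assumption inferred from that one URL.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def build_quality_url(category: str, owner: str, repo: str) -> str:
    """Build the endpoint URL, percent-encoding each path segment."""
    parts = (category, owner, repo)
    return BASE_URL + "/" + "/".join(quote(part, safe="") for part in parts)


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0) -> str:
    """Fetch the raw response body (assumed to be UTF-8 text)."""
    url = build_quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace")


# Example call (performs a network request):
# print(fetch_quality("nlp", "jfilter", "clean-text"))
```

Keeping URL construction separate from the network call makes it easy to check the URL without spending any of the daily request quota.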
Related tools
chartbeat-labs/textacy
NLP, before and after spaCy
nltk/nltk_data
NLTK Data
brightertiger/pygarble
Python Package to detect garbled, gibberish text for EN
prasanthg3/cleantext
An open-source package for python to clean raw text data
alinapetukhova/textcl
Text preprocessing package for use in NLP tasks https://pypi.org/project/textcl/