lyeoni/prenlp

Preprocessing Library for Natural Language Processing

47 / 100 (Emerging)

This tool helps data scientists and NLP practitioners prepare raw text for analysis or machine learning. It takes uncleaned text data (like social media posts, articles, or reviews) and converts it into a standardized, tokenized format that's ready for tasks like sentiment analysis or language modeling. It also includes popular English and Korean datasets for common NLP benchmarks.
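As a rough illustration of the normalize-then-tokenize flow described above, here is a minimal standard-library sketch. It does not use prenlp's own API; the function names, the `[URL]` and `[EMAIL]` placeholder tokens, and the regular expressions are assumptions chosen only for this example.

```python
import re
from typing import List

# Hypothetical patterns and placeholders for illustration; not taken from prenlp.
URL_RE = re.compile(r"https?://\S+")
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")
# A token is a placeholder like [URL], a word (optionally with an apostrophe), or one punctuation mark.
TOKEN_RE = re.compile(r"\[[A-Z]+\]|\w+(?:'\w+)?|[^\w\s]")


def normalize(text: str) -> str:
    """Replace URLs and e-mail addresses with placeholders and collapse whitespace."""
    text = URL_RE.sub("[URL]", text)
    text = EMAIL_RE.sub("[EMAIL]", text)
    return re.sub(r"\s+", " ", text).strip()


def tokenize(text: str) -> List[str]:
    """Split normalized text into placeholder, word, and punctuation tokens."""
    return TOKEN_RE.findall(text)


raw = "Great   phone!! See https://example.com/x or mail me@example.com"
clean = normalize(raw)
tokens = tokenize(clean)
```

A real pipeline would likely swap the regex tokenizer for a trained or language-aware one, particularly for Korean text, where whitespace splitting is usually not enough.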

164 stars. No commits in the last 6 months. Available on PyPI.

Use this if you need to quickly clean and tokenize text data for natural language processing tasks, especially if you're working with English or Korean content.

Not ideal if your primary need is advanced linguistic analysis or if your data requires highly specialized, domain-specific preprocessing rules not covered by common normalization.

text-mining sentiment-analysis language-modeling korean-nlp data-preparation
Stale 6m
Maintenance 0 / 25
Adoption 10 / 25
Maturity 25 / 25
Community 12 / 25

Stars: 164
Forks: 12
Language: Python
License: Apache-2.0
Last pushed: Dec 06, 2022
Commits (30d): 0
Dependencies: 5

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/nlp/lyeoni/prenlp"

Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000/day.
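The same request can be made from Python. The sketch below uses only the standard library. It assumes the endpoint returns a JSON body (the response schema is not documented here) and that the path follows a category/owner/repo pattern, which is inferred from the single URL above. The helper names `build_url` and `fetch_quality` are hypothetical.

```python
import json
import urllib.request

BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def build_url(category: str, owner: str, repo: str) -> str:
    """Build the quality endpoint URL; the path pattern is an assumption."""
    return f"{BASE_URL}/{category}/{owner}/{repo}"


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0):
    """Fetch and decode the response, assuming it is JSON."""
    url = build_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Calling `fetch_quality("nlp", "lyeoni", "prenlp")` requests the same URL as the curl command. With the keyless limit of 100 requests/day, caching the result locally is a sensible precaution.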