ningchaoar/UnsupervisedTextClassification

Keyword-based unsupervised text classification. Implementation of the paper "Text Classification by Bootstrapping with Keywords, EM and Shrinkage": http://www.cs.cmu.edu/~knigam/papers/keywordcat-aclws99.pdf

Score: 40 / 100 (Emerging)

This project helps content managers, researchers, and anyone else working with large collections of unlabeled text sort documents into predefined categories. You provide a list of texts and a few seed keywords for each category, and it outputs a category label for each text. It suits users who need to understand how their text data is distributed across categories, or who want to set up rules for more precise labeling later.

No commits in the last 6 months.

Use this if you have a massive amount of unclassified text and want to quickly organize it using a few descriptive keywords per category.

Not ideal if your categories are ambiguous or overlap significantly, as it relies on distinct keywords for effective classification.
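
The workflow described above (texts plus a few keywords per category in, one label per text out) can be sketched in plain Python. This is a minimal illustration of keyword bootstrapping followed by naive Bayes with EM, assuming whitespace tokenization and single-word lowercase keywords. It omits the shrinkage step from the paper, and every function name here (`tokenize`, `keyword_labels`, `classify`, and so on) is hypothetical rather than taken from this repository's code.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: lowercase whitespace tokenization is good enough for a sketch.
    return text.lower().split()


def keyword_labels(docs_tokens, keywords):
    """Bootstrap step: label a document with the category whose keywords
    occur most often in it; leave it unlabeled (None) on no match or a tie."""
    labels = []
    for tokens in docs_tokens:
        hits = {c: sum(tokens.count(k) for k in kws) for c, kws in keywords.items()}
        best = max(hits.values())
        winners = [c for c, h in hits.items() if h == best]
        labels.append(winners[0] if best > 0 and len(winners) == 1 else None)
    return labels


def m_step(docs_tokens, resp, classes, vocab):
    """Re-estimate class priors and per-class word probabilities from the
    (soft) document-to-class assignments, with Laplace smoothing."""
    totals = {c: sum(r[c] for r in resp) for c in classes}
    grand_total = sum(totals.values())
    prior, word_prob = {}, {}
    for c in classes:
        prior[c] = (1 + totals[c]) / (len(classes) + grand_total)
        counts = Counter()
        for tokens, r in zip(docs_tokens, resp):
            for t in tokens:
                counts[t] += r[c]
        denom = len(vocab) + sum(counts.values())
        word_prob[c] = {w: (1 + counts[w]) / denom for w in vocab}
    return prior, word_prob


def e_step(docs_tokens, prior, word_prob, classes):
    """Compute P(class | document) for every document under naive Bayes."""
    resp = []
    for tokens in docs_tokens:
        logp = {
            c: math.log(prior[c]) + sum(math.log(word_prob[c][t]) for t in tokens)
            for c in classes
        }
        top = max(logp.values())  # subtract the max for numerical stability
        unnorm = {c: math.exp(lp - top) for c, lp in logp.items()}
        z = sum(unnorm.values())
        resp.append({c: unnorm[c] / z for c in classes})
    return resp


def classify(docs, keywords, iterations=5):
    """Return one category label per document."""
    classes = list(keywords)
    docs_tokens = [tokenize(d) for d in docs]
    vocab = {t for tokens in docs_tokens for t in tokens}
    # Keyword-matched documents start with a hard assignment; unmatched
    # documents contribute nothing to the first M-step.
    resp = [
        {c: 1.0 if c == label else 0.0 for c in classes}
        for label in keyword_labels(docs_tokens, keywords)
    ]
    for _ in range(iterations):
        prior, word_prob = m_step(docs_tokens, resp, classes, vocab)
        resp = e_step(docs_tokens, prior, word_prob, classes)
    return [max(r, key=r.get) for r in resp]
```

A call such as `classify(texts, {"sports": ["football"], "tech": ["phone", "laptop"]})` returns a list of category names in the same order as `texts`. Documents that contain no keyword are labeled by the words they share with keyword-matched documents, which is also why overlapping categories with shared vocabulary degrade the result.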

content-categorization text-analysis data-organization information-retrieval document-management
Stale 6m, No Package, No Dependents
Maintenance 0 / 25
Adoption 7 / 25
Maturity 16 / 25
Community 17 / 25


Stars: 28
Forks: 8
Language: Python
License: MIT
Last pushed: Jan 28, 2021
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/nlp/ningchaoar/UnsupervisedTextClassification"

Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
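
The same request can be made from Python with only the standard library. This sketch assumes the endpoint follows the pattern `/api/v1/quality/<category>/<owner>/<repo>` (inferred from the single URL above) and that the response body is JSON; neither is confirmed here, and the helper names `build_quality_url` and `fetch_quality` are hypothetical.

```python
import json
import urllib.request
from urllib.parse import quote

BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def build_quality_url(category, owner, repo):
    # Assumption: path layout inferred from the one example URL on this page.
    parts = (quote(p, safe="") for p in (category, owner, repo))
    return BASE_URL + "/" + "/".join(parts)


def fetch_quality(category, owner, repo, timeout=10):
    # Assumption: the endpoint returns a JSON document.
    url = build_quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `fetch_quality("nlp", "ningchaoar", "UnsupervisedTextClassification")` issues the same GET request as the curl command and counts against the same daily request limit.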