fmyblack/textClassify
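This text classification project is aimed mainly at machine learning beginners and at people who want to test text classification performance. It includes several classification algorithms (naive Bayes, cosine similarity, and logistic regression) along with MM and RMM word segmenters. It also ships with more than 6,000 articles across multiple categories crawled from a news site, plus a Chinese dictionary. The project makes it easy to add new classifiers and word segmenters and to combine them to test classification performance.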
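To make the two segmenters named above more concrete, here is a minimal sketch of forward maximum matching (MM) and reverse maximum matching (RMM) over a dictionary. It is an illustration only: the class name `MaxMatchTokenizer`, the method names, the `maxLen` parameter, and the tiny dictionary are all assumptions and are not taken from the repository.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Set;

// Hypothetical illustration; these are not the repository's own classes.
class MaxMatchTokenizer {

    // Forward maximum matching (MM): scan left to right, always taking the
    // longest dictionary word (up to maxLen characters) at the current position.
    static List<String> forward(String text, Set<String> dict, int maxLen) {
        List<String> tokens = new ArrayList<>();
        int i = 0;
        while (i < text.length()) {
            int end = Math.min(text.length(), i + maxLen);
            // Shrink the window until it is a dictionary word or a single character.
            while (end > i + 1 && !dict.contains(text.substring(i, end))) {
                end--;
            }
            tokens.add(text.substring(i, end));
            i = end;
        }
        return tokens;
    }

    // Reverse maximum matching (RMM): the same idea, scanning right to left.
    static List<String> reverse(String text, Set<String> dict, int maxLen) {
        List<String> tokens = new ArrayList<>();
        int j = text.length();
        while (j > 0) {
            int start = Math.max(0, j - maxLen);
            // Shrink the window from the left until it matches or is one character.
            while (start < j - 1 && !dict.contains(text.substring(start, j))) {
                start++;
            }
            tokens.add(text.substring(start, j));
            j = start;
        }
        // Tokens were collected back to front, so restore reading order.
        Collections.reverse(tokens);
        return tokens;
    }

    public static void main(String[] args) {
        // Toy dictionary, assumed for this example only.
        Set<String> dict = Set.of("研究", "研究生", "生命", "起源");
        String text = "研究生命起源";
        System.out.println(forward(text, dict, 3)); // [研究生, 命, 起源]
        System.out.println(reverse(text, dict, 3)); // [研究, 生命, 起源]
    }
}
```

The two scan directions can disagree on ambiguous input, as the example shows, which is one reason a project might offer both and compare how each affects classification accuracy.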
This project helps you experiment with different algorithms for automatically sorting news articles into categories. You provide a collection of Chinese news articles and it uses various text analysis and machine learning methods to predict their categories. This tool is designed for students learning about machine learning and practitioners who need to quickly compare how different text classification techniques perform.
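Cosine similarity is one of the methods the project description names. As a rough sketch of how it can be used for classification, the code below builds term-count vectors from already segmented text and assigns the category whose reference vector is most similar. The class `CosineSketch`, its method names, and the per-category "profile" vectors are assumptions made for illustration; the repository's actual design may differ.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration; these are not the repository's own classes.
class CosineSketch {

    // Turn a token list into a sparse term-count vector.
    static Map<String, Integer> termCounts(List<String> tokens) {
        Map<String, Integer> counts = new HashMap<>();
        for (String t : tokens) {
            counts.merge(t, 1, Integer::sum);
        }
        return counts;
    }

    // Euclidean length of a sparse vector.
    static double norm(Map<String, Integer> v) {
        double sum = 0.0;
        for (int c : v.values()) {
            sum += (double) c * c;
        }
        return Math.sqrt(sum);
    }

    // Cosine similarity: dot(a, b) / (|a| * |b|), or 0 if either vector is empty.
    static double cosine(Map<String, Integer> a, Map<String, Integer> b) {
        double dot = 0.0;
        for (Map.Entry<String, Integer> e : a.entrySet()) {
            Integer other = b.get(e.getKey());
            if (other != null) {
                dot += (double) e.getValue() * other;
            }
        }
        double na = norm(a);
        double nb = norm(b);
        return (na == 0.0 || nb == 0.0) ? 0.0 : dot / (na * nb);
    }

    // Pick the category whose profile vector is most similar to the document.
    static String classify(Map<String, Integer> doc,
                           Map<String, Map<String, Integer>> profiles) {
        String best = null;
        double bestScore = -1.0;
        for (Map.Entry<String, Map<String, Integer>> e : profiles.entrySet()) {
            double score = cosine(doc, e.getValue());
            if (score > bestScore) {
                bestScore = score;
                best = e.getKey();
            }
        }
        return best;
    }
}
```

In practice the per-category profiles would be built from the training articles, and the raw counts are often replaced by TF-IDF weights; both choices are left out here to keep the sketch short.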
No commits in the last 6 months.
Use this if you are a student or researcher wanting to understand and compare the accuracy of various text classification algorithms on a real-world Chinese news dataset without needing complex external libraries.
Not ideal if you need a production-ready, highly optimized, or graphically rich solution for classifying large volumes of text, or if you prefer using established machine learning frameworks.
Stars: 37
Forks: 17
Language: Java
License: —
Category:
Last pushed: Sep 29, 2017
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/nlp/fmyblack/textClassify"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
Higher-rated alternatives
rosette-api/java
Babel Street Analytics Client Library for Java
kermitt2/entity-fishing
A machine learning tool for fishing entities
vinhkhuc/JFastText
Java interface for fastText
CeON/CERMINE
Content ExtRactor and MINEr
vinhkhuc/jcrfsuite
Java interface for CRFsuite: http://www.chokkan.org/software/crfsuite/