p-karisani/self_pretraining
A classification model
This project classifies text documents into two categories (e.g., positive/negative sentiment, spam/not-spam) when you have very few labeled examples but many unlabeled ones. You provide a list of documents, some labeled with a category and many without, and it trains a classification model that can then categorize new documents. It is designed for data scientists and researchers who need to categorize large volumes of text with little labeling effort.
No commits in the last 6 months.
Use this if you need to classify text but have limited labeled data and a large pool of unlabeled text that could help train a better model.
Not ideal if you have abundant labeled data for your text classification task or if your documents require multi-label or multi-class classification beyond a simple binary choice.
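To make the "few labeled, many unlabeled" setup concrete, here is a minimal sketch of generic self-training (pseudo-labeling) for binary text classification. It is an illustration only and is not taken from this repository: the naive Bayes classifier, the function names (`train_nb`, `predict_proba`, `self_train`), the confidence threshold, and the toy documents are all assumptions chosen for the example.

```python
import math
from collections import Counter

def tokenize(text):
    return text.lower().split()

def train_nb(docs, labels):
    """Fit a tiny multinomial naive Bayes model (labels are 0 or 1)."""
    word_counts = {0: Counter(), 1: Counter()}
    class_counts = Counter(labels)
    for text, label in zip(docs, labels):
        word_counts[label].update(tokenize(text))
    vocab = set(word_counts[0]) | set(word_counts[1])
    return word_counts, class_counts, vocab

def predict_proba(model, text):
    """Return P(label = 1 | text) using add-one smoothing."""
    word_counts, class_counts, vocab = model
    total_docs = sum(class_counts.values())
    log_scores = {}
    for label in (0, 1):
        n_words = sum(word_counts[label].values())
        score = math.log((class_counts[label] + 1) / (total_docs + 2))
        for tok in tokenize(text):
            if tok in vocab:  # words never seen in training are ignored
                count = word_counts[label][tok]
                score += math.log((count + 1) / (n_words + len(vocab)))
        log_scores[label] = score
    top = max(log_scores.values())
    exp = {k: math.exp(v - top) for k, v in log_scores.items()}
    return exp[1] / (exp[0] + exp[1])

def self_train(data, threshold=0.75, max_rounds=5):
    """data: list of (text, label) pairs; label is 0, 1, or None (unlabeled)."""
    labeled = [(t, y) for t, y in data if y is not None]
    unlabeled = [t for t, y in data if y is None]
    for _ in range(max_rounds):
        model = train_nb(*zip(*labeled))
        confident, remaining = [], []
        for text in unlabeled:
            p = predict_proba(model, text)
            if p >= threshold:
                confident.append((text, 1))
            elif p <= 1 - threshold:
                confident.append((text, 0))
            else:
                remaining.append(text)  # too uncertain, stays unlabeled
        if not confident:
            break
        labeled.extend(confident)  # pseudo-labels join the training set
        unlabeled = remaining
    return train_nb(*zip(*labeled))

# Toy input (hypothetical): two labeled documents, three unlabeled ones.
data = [
    ("great movie loved it", 1),
    ("terrible movie hated it", 0),
    ("loved the great acting", None),
    ("hated the terrible plot", None),
    ("the movie", None),
]
model = self_train(data)
p_positive = predict_proba(model, "great acting")
p_negative = predict_proba(model, "terrible plot")
```

In this toy run the words "acting" and "plot" appear only in unlabeled documents, so the final model can score them only because the pseudo-labeled documents were added to the training set. The ambiguous document "the movie" never crosses the threshold and stays unlabeled.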
Stars: 21
Forks: 3
Language: Python
License: Apache-2.0
Category:
Last pushed: Apr 24, 2022
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/nlp/p-karisani/self_pretraining"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
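The same request can be sketched in Python with only the standard library. The helper names and the `category` parameter name are hypothetical, and the assumption that the endpoint returns JSON is not confirmed by this page, so treat the decoding step as a guess to verify against a real response.

```python
import json
import urllib.request

BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"

def build_quality_url(category, repo):
    """Build the endpoint URL, e.g. category "nlp" and repo "owner/name"."""
    return f"{BASE_URL}/{category}/{repo}"

def fetch_quality(category, repo, timeout=10):
    """Fetch the record for one repository.

    Assumption: the response body is JSON. The page does not state the format.
    """
    url = build_quality_url(category, repo)
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.load(resp)

# URL matching the curl command above; no network call is made here.
url = build_quality_url("nlp", "p-karisani/self_pretraining")
```

Calling `fetch_quality("nlp", "p-karisani/self_pretraining")` performs the actual request. Without a key, the stated limit of 100 requests/day applies, so caching the result locally is advisable when polling several repositories.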
Higher-rated alternatives
giacbrd/ShallowLearn
An experiment about re-implementing supervised learning models based on shallow neural network...
javedsha/text-classification
Machine Learning and NLP: Text Classification using Python, scikit-learn and NLTK
Wluper/edm
Python package for understanding the difficulty of text classification datasets. (in CoNLL 2018)
fendouai/Awesome-Text-Classification
Awesome-Text-Classification: projects, papers, tutorials.
chicago-justice-project/article-tagging
Natural Language Processing of Chicago news articles