yuzhimanhua/HIMECat
Hierarchical Metadata-Aware Document Categorization under Weak Supervision (WSDM'21)
This tool helps researchers, content managers, and product managers automatically categorize documents into a multi-level hierarchy, even when only a few labeled examples exist for each category. It takes documents that combine text with associated metadata (such as authors, tags, or product IDs) and outputs a hierarchical category assignment for each one. It suits anyone dealing with large text collections that need structured classification.
No commits in the last 6 months.
Use this if you need to organize a large collection of documents into a hierarchical category system but have limited manually labeled data to train a classifier.
Not ideal if your documents lack any structured metadata or if you only need a flat (non-hierarchical) categorization.
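To make the input and output described above concrete, here is a minimal Python sketch of a document with metadata and of a hierarchical (root-to-leaf) category assignment. The `Document` class, the `PARENT` tree, and `category_path` are hypothetical names chosen for illustration; they are not HIMECat's actual data format or API.

```python
from dataclasses import dataclass, field


@dataclass
class Document:
    """A document as described above: free text plus associated metadata."""
    text: str
    metadata: dict = field(default_factory=dict)  # e.g. authors, tags, product IDs


# Hypothetical category tree stored as child -> parent (None marks the root).
PARENT = {
    "computer science": None,
    "machine learning": "computer science",
    "text classification": "machine learning",
}


def category_path(leaf, parent=PARENT):
    """Return the chain of categories from the root down to `leaf`."""
    path = []
    node = leaf
    while node is not None:
        path.append(node)
        node = parent[node]
    return path[::-1]  # reverse so the root comes first


doc = Document(
    text="A study of weakly supervised classifiers.",
    metadata={"authors": ["A. Author"], "tags": ["nlp"]},
)
assignment = category_path("text classification")
```

A flat classifier would return only the leaf label; a hierarchical assignment keeps every level on the path, which is what makes the multi-level organization possible.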
Stars: 45
Forks: 2
Language: Python
License: Apache-2.0
Category:
Last pushed: Apr 02, 2024
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/nlp/yuzhimanhua/HIMECat"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000/day.
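The same request can be made from Python using only the standard library. This is a sketch under two assumptions not confirmed by this page: that the endpoint returns JSON, and that other repositories follow the same `owner/repo` URL pattern as the curl example. The helper names `quality_url` and `fetch_quality` are hypothetical.

```python
import json
import urllib.request
from urllib.parse import quote

# Base path taken from the curl example; treating it as a reusable prefix is an assumption.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality/nlp"


def quality_url(owner, repo):
    """Build the endpoint URL for one repository."""
    return f"{BASE_URL}/{quote(owner, safe='')}/{quote(repo, safe='')}"


def fetch_quality(owner, repo, timeout=10.0):
    """Fetch the record for a repository (assumes the response body is JSON)."""
    with urllib.request.urlopen(quality_url(owner, repo), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Calling `fetch_quality("yuzhimanhua", "HIMECat")` issues the same GET request as the curl command. With the keyless limit of 100 requests/day, it may be worth caching results locally when checking many repositories.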
Higher-rated alternatives
kk7nc/HDLTex
HDLTex: Hierarchical Deep Learning for Text Classification
richliao/textClassifier
Text classifier for Hierarchical Attention Networks for Document Classification
RandolphVI/Hierarchical-Multi-Label-Text-Classification
The code of CIKM'19 paper《Hierarchical Multi-label Text Classification: An Attention-based...
yumeng5/LOTClass
[EMNLP 2020] Text Classification Using Label Names Only: A Language Model Self-Training Approach
sgrvinod/a-PyTorch-Tutorial-to-Text-Classification
Hierarchical Attention Networks | a PyTorch Tutorial to Text Classification