LaurentVeyssier/Topic-Modeling-and-Document-Categorization-using-Latent-Dirichlet-Allocation
Categorize documents by topics inferred with the Latent Dirichlet Allocation (LDA) algorithm
This project helps you sort through large collections of text, like news headlines or social media posts, to automatically discover the main themes or topics present. You input a large group of unstructured text documents, and it outputs a breakdown of the hidden topics within them, along with which documents relate to which topics. Anyone who needs to understand the overarching subjects in a vast amount of text data, such as market researchers analyzing customer feedback or journalists sifting through archives, would find this useful.
No commits in the last 6 months.
Use this if you have a large collection of text documents and want to automatically uncover the main, underlying themes without pre-defining categories.
Not ideal if you need to classify documents into very specific, pre-defined categories or require highly precise, fine-grained distinctions between document types.
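To make the description above concrete, here is a minimal sketch of LDA fitted by collapsed Gibbs sampling in plain Python. It is not taken from this repository's notebook: the function name `fit_lda`, the hyperparameter defaults (`alpha`, `beta`, `iterations`), and the toy headlines are all illustrative assumptions. A real project would more likely rely on an established library.

```python
import random


def fit_lda(docs, num_topics, iterations=200, alpha=0.1, beta=0.01, seed=0):
    """Fit LDA on tokenized docs with collapsed Gibbs sampling (illustrative sketch)."""
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    word_id = {w: i for i, w in enumerate(vocab)}
    vocab_size = len(vocab)

    doc_topic = [[0] * num_topics for _ in docs]                 # tokens in doc d assigned to topic k
    topic_word = [[0] * vocab_size for _ in range(num_topics)]   # times word w is assigned to topic k
    topic_total = [0] * num_topics                               # total tokens assigned to topic k
    assignments = []                                             # current topic of every token

    # Start from a random topic assignment for every token.
    for d, doc in enumerate(docs):
        z_doc = []
        for w in doc:
            k = rng.randrange(num_topics)
            z_doc.append(k)
            doc_topic[d][k] += 1
            topic_word[k][word_id[w]] += 1
            topic_total[k] += 1
        assignments.append(z_doc)

    for _ in range(iterations):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                wid = word_id[w]
                k = assignments[d][i]
                # Remove the token's current assignment from the counts.
                doc_topic[d][k] -= 1
                topic_word[k][wid] -= 1
                topic_total[k] -= 1
                # Unnormalized conditional probability of each topic for this token.
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][wid] + beta)
                    / (topic_total[t] + vocab_size * beta)
                    for t in range(num_topics)
                ]
                k = rng.choices(range(num_topics), weights=weights)[0]
                assignments[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][wid] += 1
                topic_total[k] += 1

    # Smoothed per-document topic proportions (each row sums to 1).
    theta = [
        [(count + alpha) / (len(doc) + num_topics * alpha) for count in counts]
        for counts, doc in zip(doc_topic, docs)
    ]
    # The five most frequently assigned words per topic.
    top_words = [
        [vocab[i] for i in sorted(range(vocab_size), key=lambda i: -topic_word[k][i])[:5]]
        for k in range(num_topics)
    ]
    return theta, top_words


# Toy corpus (made up for illustration); tokenization is a plain lowercase split.
headlines = [
    "stocks rally as markets rebound",
    "central bank raises interest rates",
    "team wins championship final",
    "striker scores twice in cup final",
]
docs = [h.lower().split() for h in headlines]
theta, top_words = fit_lda(docs, num_topics=2)

for headline, dist in zip(headlines, theta):
    dominant = max(range(len(dist)), key=dist.__getitem__)
    print(f"topic {dominant}: {headline}")
```

Each document is categorized by its highest-probability topic in `theta`, while `top_words` gives a rough label for each topic. On a corpus this small the topics are unlikely to be meaningful; the sketch only shows the mechanics.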
Stars: 8
Forks: 2
Language: Jupyter Notebook
License: not listed
Category:
Last pushed: Jan 23, 2021
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/nlp/LaurentVeyssier/Topic-Modeling-and-Document-Categorization-using-Latent-Dirichlet-Allocation"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
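The same request can be sketched in Python with only the standard library. The helper names `quality_url` and `fetch_quality` are hypothetical, and treating `nlp` as a category path segment is an assumption read off the URL above. The response format is not documented here, so the sketch returns the raw body as text (assumed to be UTF-8) rather than parsing it.

```python
import urllib.request
from urllib.parse import quote

# Base path copied from the curl command above.
API_BASE = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(owner, repo, category="nlp"):
    """Build the endpoint URL; the 'category' segment is an assumed interpretation."""
    return f"{API_BASE}/{quote(category)}/{quote(owner)}/{quote(repo)}"


def fetch_quality(owner, repo, category="nlp", timeout=10):
    """Return the raw response body as text (UTF-8 assumed, format not documented here)."""
    with urllib.request.urlopen(quality_url(owner, repo, category), timeout=timeout) as resp:
        return resp.read().decode("utf-8")


url = quality_url(
    "LaurentVeyssier",
    "Topic-Modeling-and-Document-Categorization-using-Latent-Dirichlet-Allocation",
)
print(url)
```

Calling `fetch_quality(...)` performs the network request, so keep the daily request limit in mind when scripting against it.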
Higher-rated alternatives
MIND-Lab/OCTIS
OCTIS: Comparing Topic Models is Simple! A python package to optimize and evaluate topic models...
i-dot-ai/themefinder
A topic modelling Python package for analysing one-to-many question-answer data.
andifunke/topic-labeling
The project proposes a framework to apply topic models on a text-corpus and eventually topic...
bab2min/tomotopy
Python package of Tomoto, the Topic Modeling Tool
bobxwu/TopMost
A Topic Modeling System Toolkit (ACL 2024 Demo)