lejon/PartiallyCollapsedLDA
Implementations of various fast parallelized samplers for LDA, including Partially Collapsed LDA, Light LDA, Partially Collapsed Light LDA and a very efficient Polya-Urn LDA
This tool helps researchers and data analysts understand large collections of text documents by identifying underlying themes or topics. You provide a dataset of texts, and it outputs statistical models of topics, showing which words are associated with each topic. This lets you discover hidden structure and meaning in unstructured text. It is often used by social scientists, market researchers, and anyone working with extensive document archives.
No commits in the last 6 months.
Use this if you need to extract and analyze recurring themes or subjects from a large corpus of text documents efficiently and with advanced statistical controls.
Not ideal if you're looking for a simple, out-of-the-box solution without needing to configure advanced parameters or if you have only a small number of documents.
Stars: 28
Forks: 22
Language: Java
License: —
Category
Last pushed: Feb 12, 2023
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/ml-frameworks/lejon/PartiallyCollapsedLDA"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
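The same request can be made from a script. The following is a minimal sketch using only the Python standard library. It assumes the endpoint returns a JSON body and that the path always follows the `category/owner/repo` shape seen in the curl example; neither is confirmed here, and the helper names `quality_url` and `fetch_quality` are hypothetical.

```python
import json
import urllib.request
from urllib.parse import quote

# Base path copied from the curl example above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(category: str, owner: str, repo: str) -> str:
    """Build an endpoint URL shaped like the one in the curl example.

    Assumption: other repositories use the same category/owner/repo layout.
    """
    parts = [quote(part, safe="") for part in (category, owner, repo)]
    return "/".join([BASE_URL] + parts)


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0):
    """Send a GET request and parse the body as JSON.

    Assumption: the response is JSON; its fields are not documented here,
    so the parsed object is returned as-is without further interpretation.
    """
    url = quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `fetch_quality("ml-frameworks", "lejon", "PartiallyCollapsedLDA")` issues the same request as the curl command. Keep the keyless limit of 100 requests/day in mind when calling it in a loop.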
Higher-rated alternatives
jma127/pyltr
Python learning to rank (LTR) toolkit
tensorflow/ranking
Learning to Rank in TensorFlow
evllabs/JGAAP
The Java Graphical Authorship Attribution Program
Bibliome/alvisnlp
ALvisNLP corpus processing engine
rosette-api/rosette-elasticsearch-plugin
Document Enrichment plugin for Elasticsearch