ddangelov/Top2Vec
Top2Vec learns jointly embedded topic, document and word vectors.
This tool helps researchers, marketers, and anyone else analyzing large text collections understand the main themes and sub-topics in their documents. You input a collection of text documents, and it outputs the detected topics and the words most relevant to each topic; it can also identify which parts of a document relate to specific topics. It is suited to practitioners who need to automatically discover, categorize, and explore subjects across many documents.
3,109 stars. No commits in the last 6 months. Available on PyPI.
Use this if you need to automatically discover the key themes and sub-themes within a large collection of text documents, even when documents cover multiple distinct subjects.
Not ideal if you need a simple count of predefined keywords or if your documents are extremely short and lack contextual richness.
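The workflow described above (documents in, topics and their most relevant words out) can be sketched in Python roughly as follows. This is a minimal sketch, not the project's reference usage: it assumes the `top2vec` package is installed from PyPI and that the corpus is large enough for topics to be found (a handful of short texts is usually not). `summarize_topics` and `discover_topics` are hypothetical helper names introduced here for illustration.

```python
from typing import Iterable, List, Sequence, Tuple


def summarize_topics(
    topic_words: Sequence[Sequence[str]],
    word_scores: Sequence[Sequence[float]],
    topic_nums: Iterable[int],
    top_n: int = 5,
) -> List[Tuple[int, List[Tuple[str, float]]]]:
    """Pair each topic number with its first top_n (word, score) entries.

    Hypothetical helper: it only reshapes the three parallel sequences
    that Top2Vec's get_topics() returns.
    """
    summary = []
    for num, words, scores in zip(topic_nums, topic_words, word_scores):
        pairs = [(str(w), float(s)) for w, s in zip(words, scores)]
        summary.append((int(num), pairs[:top_n]))
    return summary


def discover_topics(documents: List[str], top_n: int = 5):
    """Train a Top2Vec model on documents and summarize its topics.

    Assumption: default model settings are acceptable. Training can be
    slow and needs a reasonably large corpus.
    """
    # Imported lazily so summarize_topics works without top2vec installed.
    from top2vec import Top2Vec

    model = Top2Vec(documents)
    num_topics = model.get_num_topics()
    topic_words, word_scores, topic_nums = model.get_topics(num_topics)
    return summarize_topics(topic_words, word_scores, topic_nums, top_n)
```

Calling `discover_topics(my_documents)` on a list of strings would return one entry per detected topic. The reshaping helper is kept separate so it can be checked on small hand-made inputs without training a model.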
Stars: 3,109
Forks: 377
Language: Python
License: BSD-3-Clause
Category:
Last pushed: Nov 14, 2024
Commits (30d): 0
Dependencies: 9
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/embeddings/ddangelov/Top2Vec"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
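The same request can be made from Python with only the standard library. This is a sketch under two assumptions not stated on this page: that the path segments after `quality/` are a category, an owner, and a repository name, and that the endpoint returns JSON. `quality_url` and `fetch_quality` are hypothetical helper names.

```python
import json
import urllib.request
from urllib.parse import quote

API_BASE = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(category: str, owner: str, repo: str) -> str:
    """Build the endpoint URL from its (assumed) three path segments."""
    return f"{API_BASE}/{quote(category)}/{quote(owner)}/{quote(repo)}"


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0):
    """Fetch the quality data; assumes the response body is JSON."""
    url = quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


if __name__ == "__main__":
    # Performs a real network request; counts against the daily limit.
    print(fetch_quality("embeddings", "ddangelov", "Top2Vec"))
```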
Related tools
shibing624/text2vec
text2vec, text to vector....
predict-idlab/pyRDF2Vec
Python Implementation and Extension of RDF2Vec
IntuitionEngineeringTeam/chars2vec
Character-based word embeddings model based on RNN for handling real world texts
IITH-Compilers/IR2Vec
Implementation of IR2Vec, LLVM IR Based Scalable Program Embeddings
natasha/navec
Compact high quality word embeddings for Russian language