sebischair/Lbl2Vec
Lbl2Vec learns jointly embedded label, document and word vectors to retrieve documents with predefined topics from an unlabeled document corpus.
This tool helps you quickly organize large collections of unlabeled documents by automatically assigning them to predefined categories or topics. You provide your documents and a set of keywords for each topic you're interested in, and the system identifies and retrieves documents that match those topics. This is ideal for researchers, analysts, or anyone managing extensive text archives who needs to find relevant information without manually sifting through everything.
187 stars. No commits in the last 6 months. Available on PyPI.
Use this if you have a lot of text documents and want to classify them into topics using just a few descriptive keywords per topic, without manually labeling any documents.
Not ideal if you need to classify documents based on very subtle or complex distinctions that can't be adequately captured by a few keywords per topic.
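To make the keywords-per-topic workflow concrete, here is a minimal sketch that uses only the Python standard library. It is not Lbl2Vec's method: Lbl2Vec learns joint label, document and word embeddings, whereas this stand-in scores each document against each topic's keyword list with plain bag-of-words cosine similarity. The function names, topics and sample documents are hypothetical and chosen only for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors (Counters)."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def assign_topics(documents, topic_keywords):
    """Return the best-scoring topic per document, or None if no keyword matches."""
    topic_vectors = {
        name: Counter(k.lower() for k in keywords)
        for name, keywords in topic_keywords.items()
    }
    results = []
    for doc in documents:
        doc_vec = Counter(tokenize(doc))
        scores = {name: cosine(doc_vec, vec) for name, vec in topic_vectors.items()}
        best = max(scores, key=scores.get)
        results.append(best if scores[best] > 0 else None)
    return results


# Hypothetical topics and documents, for illustration only.
topic_keywords = {
    "sports": ["football", "match", "team"],
    "finance": ["stock", "market", "bank"],
}
documents = [
    "The team won the football match on Sunday.",
    "The central bank move rattled the stock market.",
    "A recipe for tomato soup.",
]
print(assign_topics(documents, topic_keywords))  # ['sports', 'finance', None]
```

The third document is left unassigned because it shares no word with any keyword list. This also illustrates the limitation noted above: distinctions that a handful of keywords cannot express will not be picked up by a keyword-driven approach.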
Stars: 187
Forks: 28
Language: Python
License: BSD-3-Clause
Category:
Last pushed: Jan 31, 2024
Commits (30d): 0
Dependencies: 10
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/embeddings/sebischair/Lbl2Vec"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
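The same request can be sketched in Python with only the standard library. The response format is not documented here, so decoding the body as JSON is an assumption, as is reading the two trailing path segments as a category and an owner/name repository slug. The helper names `build_url` and `fetch_quality` are hypothetical.

```python
import json
import urllib.request

# Base path taken from the curl example above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def build_url(category, repo):
    """Build the endpoint URL from a category and an owner/name repo slug."""
    return f"{BASE_URL}/{category}/{repo}"


def fetch_quality(category, repo, timeout=10):
    """Send a GET request and decode the body, assuming it is JSON."""
    with urllib.request.urlopen(build_url(category, repo), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Calling `fetch_quality("embeddings", "sebischair/Lbl2Vec")` would issue the same request as the curl command. No API key is sent, so the keyless daily limit applies.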
Related tools
shibing624/similarities
Similarities: a toolkit for similarity calculation and semantic search....
explosion/sense2vec
🦆 Contextually-keyed word vectors
chakki-works/chakin
Simple downloader for pre-trained word vectors
pdrm83/sent2vec
How to encode sentences in a high-dimensional vector space, a.k.a., sentence embedding.
maxoodf/word2vec
word2vec++ is a Distributed Representations of Words (word2vec) library and tools...