sebischair/Lbl2Vec

Lbl2Vec learns jointly embedded label, document and word vectors to retrieve documents with predefined topics from an unlabeled document corpus.

Established

This tool helps you quickly organize large collections of unlabeled documents by automatically assigning them to predefined categories or topics. You provide your documents and a set of keywords for each topic you're interested in, and the system identifies and retrieves documents that match those topics. This is ideal for researchers, analysts, or anyone managing extensive text archives who needs to find relevant information without manually sifting through everything.
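The keyword-driven workflow described above can be illustrated with a small, self-contained sketch. This is not Lbl2Vec's actual method or API: Lbl2Vec learns jointly embedded label, document and word vectors, whereas the sketch below substitutes plain term-count vectors and cosine similarity to show the input and output shape of the task. The function names (`tokenize`, `cosine`, `assign_topics`) and the sample topics and documents are hypothetical and chosen only for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors (Counters)."""
    dot = sum(count * b[term] for term, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def assign_topics(documents, topic_keywords):
    """Return the best-matching topic per document, or None if no keyword matches.

    Assumption: keywords are given in lowercase, matching the tokenizer output.
    """
    topic_vectors = {name: Counter(words) for name, words in topic_keywords.items()}
    assignments = []
    for doc in documents:
        doc_vector = Counter(tokenize(doc))
        scores = {name: cosine(doc_vector, vec) for name, vec in topic_vectors.items()}
        best = max(scores, key=scores.get)
        # A score of zero means no topic keyword occurred in the document.
        assignments.append(best if scores[best] > 0 else None)
    return assignments


# Hypothetical topics, each described by a few keywords.
topic_keywords = {
    "sports": ["football", "match", "team"],
    "finance": ["stock", "market", "bank"],
}

# Hypothetical unlabeled documents.
documents = [
    "The team won the football match last night.",
    "The stock market fell after the bank report.",
    "A recipe for tomato soup.",
]

labels = assign_topics(documents, topic_keywords)
```

Exact keyword overlap of this kind misses documents that discuss a topic without using the listed keywords. Learned embeddings are intended to reduce that limitation by placing related words and documents close together in the same vector space.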

187 stars. No commits in the last 6 months. Available on PyPI.

Use this if you have a lot of text documents and want to classify them into topics using just a few descriptive keywords per topic, without manually labeling any documents.

Not ideal if you need to classify documents based on very subtle or complex distinctions that can't be adequately captured by a few keywords per topic.

document-classification topic-modeling information-retrieval text-analysis knowledge-management

Maintenance 0 / 25

Adoption 10 / 25

Maturity 25 / 25

Community 17 / 25

Stars: 187

Language: Python

License: BSD-3-Clause

Related tools

shibing624/similarities

Similarities: a toolkit for similarity calculation and semantic search....

explosion/sense2vec

🦆 Contextually-keyed word vectors

chakki-works/chakin

Simple downloader for pre-trained word vectors

pdrm83/sent2vec

How to encode sentences in a high-dimensional vector space, a.k.a., sentence embedding.

maxoodf/word2vec

word2vec++ is a Distributed Representations of Words (word2vec) library and tools...
