inejc/paragraph-vectors
A PyTorch implementation of Paragraph Vectors (doc2vec).
This tool converts collections of text documents, such as articles or papers, into numerical representations for analysis. The input is a CSV file in which each row is a document, and the output is a set of document vectors: numerical embeddings that capture the meaning of each document. This is useful for researchers, data scientists, or anyone who needs to find patterns, similarities, or relationships within large text datasets.
415 stars. No commits in the last 6 months.
Use this if you need to transform a large collection of documents into numerical data for tasks like clustering, classification, or similarity search.
Not ideal if you are looking for a ready-to-use application with a graphical interface, as this requires command-line interaction and some technical setup.
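As an illustration of the similarity-search use case mentioned above, here is a minimal sketch of ranking documents by cosine similarity once document vectors are available. It does not use this repository's code: the toy vectors, the document ids, and the helper names `cosine_similarity` and `most_similar` are all hypothetical, and real vectors would come from whatever the tool exports.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        # A zero vector has no direction; treat it as dissimilar to everything.
        return 0.0
    return dot / (norm_a * norm_b)


def most_similar(query_id, vectors, top_k=3):
    """Rank every other document by cosine similarity to the query document."""
    query = vectors[query_id]
    scored = [
        (doc_id, cosine_similarity(query, vec))
        for doc_id, vec in vectors.items()
        if doc_id != query_id
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Hypothetical toy vectors standing in for exported document vectors.
doc_vectors = {
    "doc_a": [1.0, 0.0, 0.0],
    "doc_b": [0.9, 0.1, 0.0],
    "doc_c": [0.0, 1.0, 0.0],
}

if __name__ == "__main__":
    for doc_id, score in most_similar("doc_a", doc_vectors, top_k=2):
        print(doc_id, round(score, 3))
```

For large collections a vectorized library or an approximate nearest-neighbor index would likely be a better fit; the pure-Python version is only meant to show the idea.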
Stars: 415
Forks: 75
Language: Python
License: MIT
Category
Last pushed: Dec 08, 2022
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/ml-frameworks/inejc/paragraph-vectors"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
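The same request can be made from Python. The sketch below uses only the standard library and rests on two assumptions that the page does not confirm: that the endpoint returns JSON, and that the URL path follows a category/owner/repo pattern, which is inferred from the single curl example. The helper names `build_quality_url` and `fetch_quality` are hypothetical.

```python
import json
import urllib.request

# Base path taken from the curl example; the layout beyond it is an assumption.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def build_quality_url(category, repo):
    """Build the endpoint URL for a repo given as 'owner/name'."""
    return f"{BASE_URL}/{category}/{repo}"


def fetch_quality(category, repo, timeout=10.0):
    """Fetch the quality record and decode it, assuming a JSON response body."""
    url = build_quality_url(category, repo)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


if __name__ == "__main__":
    # Performs a real network request and counts against the daily limit.
    print(fetch_quality("ml-frameworks", "inejc/paragraph-vectors"))
```

Because the keyless tier is limited to 100 requests/day, caching the response locally is a reasonable choice when running this repeatedly.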
Higher-rated alternatives
kermitt2/delft
a Deep Learning Framework for Text https://delft.readthedocs.io/
yoeo/guesslang
Detect the programming language of a source code
matthewdeanmartin/whats_that_code
detect programming language of source in pure python from an ensemble of classifiers
airalcorn2/Deep-Semantic-Similarity-Model
My Keras implementation of the Deep Semantic Similarity Model (DSSM)/Convolutional Latent...
christiansafka/img2vec
Use pre-trained models in PyTorch to extract vector embeddings for any image