NIHOPA/word2vec_pipeline
NLP pipeline using word2vec (preprocessing/embedding/prediction/clustering)
This tool helps researchers analyze large collections of text, such as biomedical grants or publication abstracts. You feed in raw text documents, and it identifies key phrases, cleans up the language, and converts words and documents into numerical vectors (embeddings). These vectors support tasks such as grouping similar documents or making predictions from text content, helping research analysts find patterns in their data.
116 stars. No commits in the last 6 months.
Use this if you need to extract insights, cluster similar documents, or build predictive models from large volumes of unstructured text data in a research context.
Not ideal if you're looking for a simple keyword search tool or don't need advanced linguistic processing and machine learning capabilities for your text analysis.
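The embedding and grouping steps described above can be pictured with a minimal sketch. It is not the repository's code: the word vectors are hand-written toy values (a real pipeline would learn them with word2vec), and the names `TOY_VECTORS`, `embed_document`, and `cosine` are hypothetical. Averaging word vectors is only one common way to get a document vector and is assumed here for illustration.

```python
import math
import re

# Hypothetical 3-dimensional word vectors, hand-written for illustration only.
# A real pipeline would learn these from the corpus with word2vec.
TOY_VECTORS = {
    "tumor": [0.9, 0.1, 0.0],
    "cancer": [0.8, 0.2, 0.1],
    "therapy": [0.6, 0.3, 0.2],
    "neuron": [0.1, 0.9, 0.1],
    "brain": [0.2, 0.8, 0.0],
    "imaging": [0.3, 0.6, 0.3],
}


def tokenize(text):
    """Lowercase the text and keep only runs of ASCII letters."""
    return re.findall(r"[a-z]+", text.lower())


def embed_document(text, vectors=TOY_VECTORS):
    """Average the vectors of known words; return None if no word is known."""
    known = [vectors[token] for token in tokenize(text) if token in vectors]
    if not known:
        return None
    dim = len(known[0])
    return [sum(vec[i] for vec in known) / len(known) for i in range(dim)]


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


if __name__ == "__main__":
    docs = [
        "Tumor therapy for cancer patients",
        "Brain imaging of neuron activity",
        "Cancer tumor growth",
    ]
    doc_vectors = [embed_document(doc) for doc in docs]
    # Compare the first document with each of the others.
    for other in (1, 2):
        print(docs[other], cosine(doc_vectors[0], doc_vectors[other]))
```

Documents whose vectors have high cosine similarity are candidates for the same group; a clustering algorithm applied to the document vectors would automate that grouping.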
Stars: 116
Forks: 17
Language: Python
License: not specified
Category:
Last pushed: May 03, 2024
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/nlp/NIHOPA/word2vec_pipeline"
Open to everyone: 100 requests/day, no key needed. A free key raises the limit to 1,000 requests/day.
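The same request can be made from Python using only the standard library. This is a sketch under two assumptions that the page does not confirm: that the URL path follows the pattern `quality/<category>/<owner>/<name>` (inferred from the single curl example above), and that the response body is JSON. The helper names `build_quality_url` and `fetch_quality` are hypothetical.

```python
import json
import urllib.parse
import urllib.request

# Base path taken from the curl example; the path layout below is an assumption.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def build_quality_url(category, repo):
    """Build the endpoint URL for a category and an 'owner/name' repository."""
    owner, name = repo.split("/", 1)
    parts = [category, owner, name]
    return BASE_URL + "/" + "/".join(urllib.parse.quote(p, safe="") for p in parts)


def fetch_quality(category, repo, timeout=10):
    """Request the endpoint and parse the body as JSON (assumed format)."""
    request = urllib.request.Request(
        build_quality_url(category, repo),
        headers={"Accept": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `fetch_quality("nlp", "NIHOPA/word2vec_pipeline")` issues the same GET request as the curl command. Without a key, requests count against the 100 requests/day limit, so caching the result locally is worth considering.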
Higher-rated alternatives
Planeshifter/node-word2vec
Node.js interface to the Google word2vec tool.
nathanrooy/word2vec-from-scratch-with-python
A very simple, bare-bones, inefficient, implementation of skip-gram word2vec from scratch with Python
thunlp/paragraph2vec
Paragraph Vector Implementation
akoksal/Turkish-Word2Vec
Pre-trained Word2Vec Model for Turkish
RichDavis1/PHPW2V
A PHP implementation of Word2Vec, a popular word embedding algorithm created by Tomas Mikolov...