mkearney/wactor
Word Factor Vectors
This R package helps researchers and analysts prepare text data for analysis. It takes a collection of texts, such as sentences or documents, and converts it into a structured numeric format that can be used for machine learning or statistical modeling.
No commits in the last 6 months.
Use this if you need to transform raw text into numerical representations, like term frequency matrices or TF-IDF scores, and efficiently split your dataset for model training and testing.
Not ideal if you primarily work with pre-vectorized numerical data or if your text analysis doesn't require converting strings to numeric vectors.
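To make the workflow above concrete, here is a minimal sketch of the general technique: split the documents into training and test sets, learn a vocabulary and IDF weights from the training set only, then turn any documents into TF-IDF rows. It is written in plain Python and is not wactor's R API; the function names (`tokenize`, `split_train_test`, `fit`, `transform`) and the sample sentences are hypothetical and chosen only for illustration.

```python
import math
import random
from collections import Counter


def tokenize(text):
    # Assumption: lowercase whitespace tokenization; real tools also handle punctuation.
    return text.lower().split()


def split_train_test(docs, train_fraction=0.8, seed=42):
    # Shuffle a copy with a fixed seed so the split is reproducible.
    rng = random.Random(seed)
    shuffled = list(docs)
    rng.shuffle(shuffled)
    cut = int(round(len(shuffled) * train_fraction))
    return shuffled[:cut], shuffled[cut:]


def fit(train_docs):
    # Learn the vocabulary and IDF weights from the training documents only.
    token_sets = [set(tokenize(d)) for d in train_docs]
    vocab = sorted(set().union(*token_sets)) if token_sets else []
    n_docs = len(train_docs)
    idf = {}
    for term in vocab:
        doc_freq = sum(1 for s in token_sets if term in s)
        idf[term] = math.log(n_docs / doc_freq)
    return vocab, idf


def transform(docs, vocab, idf):
    # One row per document, one column per vocabulary term.
    # Terms not seen during fit are ignored.
    rows = []
    for doc in docs:
        counts = Counter(tokenize(doc))
        total = sum(counts.values()) or 1
        rows.append([(counts[t] / total) * idf[t] for t in vocab])
    return rows


docs = [
    "the cat sat",
    "the dog barked",
    "a cat and a dog",
    "the bird sang",
    "a fish swam",
]
train, test = split_train_test(docs, train_fraction=0.8, seed=1)
vocab, idf = fit(train)
train_matrix = transform(train, vocab, idf)
test_matrix = transform(test, vocab, idf)
```

Fitting on the training split alone keeps the test documents from influencing the vocabulary or the weights, which is why `fit` and `transform` are separate steps in this sketch.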
Stars: 32
Forks: 2
Language: R
License: none listed
Category: none listed
Last pushed: Dec 13, 2019
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/nlp/mkearney/wactor"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000/day.
Higher-rated alternatives
dselivanov/text2vec
Fast vectorization, topic modeling, distances and GloVe word embeddings in R.
vzhong/embeddings
Fast, DB Backed pretrained word embeddings for natural language processing.
dccuchile/spanish-word-embeddings
Spanish word embeddings computed with different methods and from different corpora
ncbi-nlp/BioSentVec
BioWordVec & BioSentVec: pre-trained embeddings for biomedical words and sentences
avidale/compress-fasttext
Tools for shrinking fastText models (in gensim format)