dselivanov/text2vec

Fast vectorization, topic modeling, distances and GloVe word embeddings in R.

/ 100

Established

This R package helps data scientists and researchers analyze large collections of text efficiently. It takes raw text documents and returns numerical representations (vectors) of words or documents, along with tools for identifying key themes through topic modeling. It is aimed at practitioners who need to process substantial text corpora quickly without running out of memory.
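As a rough illustration of the raw-text-to-vectors workflow described above, the sketch below builds a document-term matrix with text2vec. The three sample documents and the variable names are made up for this example, and the package is assumed to be installed; it is a minimal sketch, not a recommended pipeline.

```r
library(text2vec)

# Hypothetical toy corpus; replace with your own character vector of documents.
docs <- c(
  "Text mining in R can be fast",
  "Topic models summarize large text collections",
  "Word embeddings map words to vectors"
)

# Create an iterator over tokens: lowercase each document, then split into words.
it <- itoken(docs,
             preprocessor = tolower,
             tokenizer = word_tokenizer,
             progressbar = FALSE)

# Collect the vocabulary (one row per unique term, with counts).
vocab <- create_vocabulary(it)

# Map terms to matrix columns and build a sparse document-term matrix.
vectorizer <- vocab_vectorizer(vocab)
dtm <- create_dtm(it, vectorizer)

# One row per document, one column per vocabulary term.
print(dim(dtm))
```

The resulting sparse matrix is the kind of input that the package's weighting, distance, and topic-modeling tools work on; keeping it sparse is what keeps memory use low on larger corpora.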

870 stars.

Use this if you are an R user needing to perform fast, memory-efficient text analysis and topic modeling on large datasets.

Not ideal if you prefer a graphical user interface or are not comfortable with programming in R.

text-analysis natural-language-processing topic-modeling data-science computational-linguistics

No Package, No Dependents

Maintenance 6 / 25

Adoption 10 / 25

Maturity 16 / 25

Community 23 / 25


Stars 870

Forks 134

Language

License

—

Related tools

vzhong/embeddings

Fast, DB Backed pretrained word embeddings for natural language processing.

dccuchile/spanish-word-embeddings

Spanish word embeddings computed with different methods and from different corpora

ncbi-nlp/BioSentVec

BioWordVec & BioSentVec: pre-trained embeddings for biomedical words and sentences

ibrahimsharaf/doc2vec

Long(er) text representation and classification using Doc2Vec embeddings

avidale/compress-fasttext

Tools for shrinking fastText models (in gensim format)
