ncbi-nlp/BioSentVec

BioWordVec & BioSentVec: pre-trained embeddings for biomedical words and sentences

/ 100

Emerging

This project helps researchers and healthcare professionals analyze large volumes of biomedical text, such as scientific articles and clinical notes. It takes words or sentences from these texts and outputs numerical vectors (embeddings) that capture their meaning, so that they can be compared and processed computationally. It is aimed at medical researchers, clinicians, and data scientists working with health-related text.

611 stars. No commits in the last 6 months.

Use this if you need to understand the similarity between medical terms, concepts, or entire sentences from biomedical literature and clinical records.
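
As a rough illustration of what comparing embeddings means in practice, the sketch below ranks phrases by cosine similarity. The three-dimensional vectors and the helper names (`cosine_similarity`, `rank_by_similarity`, `toy_embeddings`) are hypothetical and invented for this example; real vectors would come from the pre-trained models and have many more dimensions. Loading the models is not shown here.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        # A zero vector has no direction; treat it as dissimilar to everything.
        return 0.0
    return dot / (norm_a * norm_b)


def rank_by_similarity(query_vec, candidates):
    """Sort (label, vector) pairs from most to least similar to query_vec."""
    scored = [(label, cosine_similarity(query_vec, vec)) for label, vec in candidates]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical toy vectors, NOT real BioWordVec/BioSentVec output.
toy_embeddings = {
    "heart attack": [0.9, 0.1, 0.0],
    "myocardial infarction": [0.8, 0.2, 0.1],
    "bone fracture": [0.1, 0.9, 0.2],
}

query = toy_embeddings["heart attack"]
others = [(k, v) for k, v in toy_embeddings.items() if k != "heart attack"]
ranking = rank_by_similarity(query, others)

for label, score in ranking:
    print(f"{label}: {score:.3f}")
```

Cosine similarity is a common choice for this kind of comparison because it depends only on the direction of the vectors, not their magnitude; whether it is the best measure for a given task is a judgment call.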

Not ideal if your text data is outside the biomedical or clinical domain, as the models are trained specifically on PubMed articles and MIMIC-III clinical notes.

biomedical-research clinical-text-analysis medical-informatics scientific-literature-analysis healthcare-data-mining

Stale 6m, No Package, No Dependents

Maintenance 0 / 25

Adoption 10 / 25

Maturity 16 / 25

Community 22 / 25


Stars

611

Forks

Language

Jupyter Notebook

License

—

Higher-rated alternatives

dselivanov/text2vec

Fast vectorization, topic modeling, distances and GloVe word embeddings in R.

vzhong/embeddings

Fast, DB Backed pretrained word embeddings for natural language processing.

dccuchile/spanish-word-embeddings

Spanish word embeddings computed with different methods and from different corpora

avidale/compress-fasttext

Tools for shrinking fastText models (in gensim format)

ibrahimsharaf/doc2vec

Long(er) text representation and classification using Doc2Vec embeddings
