salgadev/medical-nlp

Dataset for Natural Language Processing using a corpus of medical transcriptions and custom-generated clinical stop words and vocabulary.

Emerging

This project provides pre-processed medical text data and specialized vocabulary for anyone building tools to analyze clinical documentation. It takes raw medical transcriptions and curated clinical terms, outputting cleaned datasets ready for training machine learning models that can categorize or understand medical text. It's ideal for data scientists or researchers focusing on healthcare applications.

No commits in the last 6 months.

Use this if you need a ready-made dataset of medical transcriptions, clinical stop words, and an SNMI-based vocabulary to jumpstart your natural language processing project in healthcare.

Not ideal if your project requires highly specialized medical text from a different domain or if you need to build your own custom vocabulary from scratch without external resources.
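
As a rough illustration of how a clinical stop-word list is typically applied to transcription text, here is a minimal sketch. The stop words, the sample sentence, and the `remove_stop_words` helper are hypothetical and are not taken from the repository; the actual file names and formats in the dataset may differ.

```python
import re

# Hypothetical stop words for illustration only; the repository's own
# clinical stop-word list should be loaded in place of this set.
CLINICAL_STOP_WORDS = {"patient", "history", "the", "was", "of", "with", "a"}


def remove_stop_words(text, stop_words):
    """Lowercase the text, split it into alphanumeric tokens,
    and drop every token found in stop_words."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return [token for token in tokens if token not in stop_words]


# Made-up sample sentence, not drawn from the dataset.
sample = "The patient was admitted with a history of chest pain."
print(remove_stop_words(sample, CLINICAL_STOP_WORDS))
```

Filtering domain-specific filler words such as "patient" alongside general stop words is one common way to leave the more discriminative clinical terms for a downstream classifier.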

medical-transcriptions clinical-documentation-analysis healthcare-nlp biomedical-text-mining

Stale 6m, No Package, No Dependents

Maintenance 0 / 25

Adoption 9 / 25

Maturity 16 / 25

Community 20 / 25

License: GPL-3.0

Higher-rated alternatives

medspacy/medspacy

Library for clinical NLP with spaCy.

jamesmullenbach/caml-mimic

multilabel classification of EHR notes

ncbi-nlp/NegBio

High-performance tool for negation and uncertainty detection in radiology reports

bionlplab/radtext

Python Radiology Text Analysis System

ClarityNLP/ClarityNLP

An NLP framework for clinical phenotyping. Docker | Python | Solr | OMOP....
