medspacy/sectionizer
A rule-based Python module for splitting documents into sections.
This tool helps healthcare professionals and researchers automatically identify and label different sections within clinical documents like patient notes or discharge summaries. It takes unstructured medical text as input and outputs the same text with clearly marked sections, such as 'Chief Complaint', 'History of Present Illness', or 'Medications'. This is useful for anyone working with large volumes of clinical text who needs to quickly extract or organize information by section.
No commits in the last 6 months.
Use this if you need to programmatically identify and label standard sections within unstructured clinical text documents.
Not ideal if you are working with non-clinical documents or if you need to extract specific entities rather than document sections.
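The rule-based idea described above can be illustrated with a minimal sketch. It uses only the Python standard library and is not the sectionizer's actual API. The rule table `SECTION_RULES`, the `Section` type, and the `split_sections` function are hypothetical names chosen for illustration, and the sketch assumes section headers start a line and end with a colon.

```python
import re
from typing import List, NamedTuple, Optional


class Section(NamedTuple):
    title: Optional[str]  # normalized label, or None for text before the first header
    body: str             # section text with surrounding whitespace stripped


# Hypothetical rules: header pattern -> normalized section label.
SECTION_RULES = {
    r"chief complaint": "chief_complaint",
    r"history of present illness|hpi": "history_of_present_illness",
    r"medications?": "medications",
}

# Matches any known header at the start of a line, followed by a colon.
HEADER_RE = re.compile(
    r"^[ \t]*(?P<header>"
    + "|".join(f"(?:{pattern})" for pattern in SECTION_RULES)
    + r")[ \t]*:",
    re.IGNORECASE | re.MULTILINE,
)


def normalize(header: str) -> str:
    """Map a matched header string to its normalized label."""
    for pattern, label in SECTION_RULES.items():
        if re.fullmatch(pattern, header, re.IGNORECASE):
            return label
    raise ValueError(f"no rule matches header: {header!r}")


def split_sections(text: str) -> List[Section]:
    """Split text into sections at each recognized header."""
    matches = list(HEADER_RE.finditer(text))
    sections: List[Section] = []

    # Text before the first header is kept as an untitled section.
    first_start = matches[0].start() if matches else len(text)
    preamble = text[:first_start].strip()
    if preamble:
        sections.append(Section(None, preamble))

    # Each section body runs from the end of its header to the next header.
    for i, match in enumerate(matches):
        end = matches[i + 1].start() if i + 1 < len(matches) else len(text)
        body = text[match.end():end].strip()
        sections.append(Section(normalize(match.group("header")), body))
    return sections


note = """Patient seen in clinic.
Chief Complaint: chest pain
HPI: 3 days of intermittent chest pain.
Medications: aspirin 81 mg daily
"""

for section in split_sections(note):
    print(section.title, "->", section.body)
```

Real clinical notes vary far more than this (headers without colons, inline headers, abbreviations), which is why a dedicated rule set is useful; the sketch only shows the general shape of the approach.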
Stars: 12
Forks: 5
Language: Jupyter Notebook
License: MIT
Category:
Last pushed: Nov 14, 2020
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/nlp/medspacy/sectionizer"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
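The same request can be made from Python with the standard library. This is a sketch under stated assumptions: the path layout (`nlp`, then owner, then repository) is inferred from the single URL shown above, the response is assumed to be JSON, and `quality_url` and `fetch_quality` are hypothetical helper names.

```python
import json
import urllib.request
from urllib.parse import quote

BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(owner: str, repo: str, domain: str = "nlp") -> str:
    """Build the endpoint URL; the domain/owner/repo layout is an assumption."""
    parts = [quote(part, safe="") for part in (domain, owner, repo)]
    return "/".join([BASE_URL, *parts])


def fetch_quality(owner: str, repo: str, domain: str = "nlp", timeout: float = 10.0):
    """Fetch the record for one repository, assuming the body is JSON."""
    request = urllib.request.Request(
        quality_url(owner, repo, domain),
        headers={"Accept": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `fetch_quality("medspacy", "sectionizer")` would issue the same GET request as the curl command. Because unauthenticated use is limited to 100 requests/day, caching the result locally is a reasonable precaution when polling.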
Higher-rated alternatives
EmilStenstrom/conllu
A CoNLL-U parser that takes a CoNLL-U formatted string and turns it into a nested python dictionary.
OpenPecha/Botok
🏷 བོད་ཏོག [pʰøtɔk̚] Tibetan word tokenizer in Python
taishi-i/nagisa
A Japanese tokenizer based on recurrent neural networks
zaemyung/sentsplit
A flexible sentence segmentation library using a CRF model and regex rules
natasha/razdel
Rule-based token and sentence segmentation for the Russian language