allenai/scispacy

A full spaCy pipeline and models for scientific/biomedical documents.

Score: 64 / 100 (Established)

This project helps scientists and researchers automatically process specialized scientific and biomedical text. It takes raw text such as journal articles or research papers and extracts key information: it identifies important scientific terms and abbreviations, and links them to established medical or biological databases that supply their definitions. It is aimed at biomedical researchers, clinical scientists, and anyone working with large volumes of scientific literature.

1,934 stars. Used by 2 other packages. Available on PyPI.

Use this if you need to quickly and accurately identify and categorize scientific entities, detect abbreviations, or link terms to medical knowledge bases across large sets of scientific documents.

Not ideal if your documents are not scientific or biomedical in nature, or if you need to analyze text in languages other than English.
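As a rough illustration of the entity and abbreviation extraction described above, the sketch below loads a scispaCy model through spaCy and adds the abbreviation detector. It assumes `spacy`, `scispacy`, and the `en_core_sci_sm` model are installed separately (the model is not pulled in by the scispacy package itself). The function name `extract_entities_and_abbreviations` is hypothetical and chosen only for this example.

```python
def extract_entities_and_abbreviations(text, model_name="en_core_sci_sm"):
    """Return (entities, abbreviations) found in `text`.

    Assumption: the named scispaCy model is already installed.
    """
    # Imported inside the function so this file can be loaded
    # even on machines where spaCy/scispaCy are not installed.
    import spacy
    # Importing the class registers the "abbreviation_detector" component.
    from scispacy.abbreviation import AbbreviationDetector  # noqa: F401

    nlp = spacy.load(model_name)
    nlp.add_pipe("abbreviation_detector")
    doc = nlp(text)

    # Entity spans with whatever labels the chosen model assigns.
    entities = [(ent.text, ent.label_) for ent in doc.ents]
    # Each detected short form paired with the long form it abbreviates.
    abbreviations = [
        (abrv.text, abrv._.long_form.text) for abrv in doc._.abbreviations
    ]
    return entities, abbreviations
```

Calling the function on a sentence that defines an abbreviation in parentheses should return that short form alongside its long form. Linking terms to a knowledge base is a separate, heavier pipeline component and is left out of this sketch.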

biomedical-research clinical-text-analysis medical-literature genomics drug-discovery
Maintenance 6 / 25
Adoption 12 / 25
Maturity 25 / 25
Community 21 / 25

Stars: 1,934
Forks: 249
Language: Python
License: Apache-2.0
Last pushed: Dec 04, 2025
Commits (30d): 0
Dependencies: 10
Reverse dependents: 2

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/nlp/allenai/scispacy"

Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000/day.
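The same request can be made from Python with only the standard library. This is a minimal sketch, not a documented client. It assumes the path layout is `category/owner/repo`, inferred from the single example URL above, and that the endpoint returns JSON. The response fields are not documented here, so the code does not assume any. `build_quality_url` and `fetch_quality` are hypothetical helper names.

```python
import json
import urllib.request

BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def build_quality_url(category: str, owner: str, repo: str) -> str:
    """Assemble the endpoint URL (path layout inferred from one example)."""
    return f"{BASE_URL}/{category}/{owner}/{repo}"


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0):
    """GET the endpoint and decode the body, assuming it is JSON."""
    url = build_quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.load(response)


# Example call (performs a network request and counts against the daily limit):
#   data = fetch_quality("nlp", "allenai", "scispacy")
#   print(json.dumps(data, indent=2))
```

Because unauthenticated access is capped at 100 requests/day, it may be worth caching the decoded result locally rather than refetching on every run.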