nerel-ds/NEREL-BIO

NEREL-BIO: A Dataset of Biomedical Abstracts Annotated with Nested Named Entities

31 / 100 (Emerging)

This project provides a corpus of biomedical research abstracts from PubMed, in both Russian and English, annotated with nested named entities. The annotations mark entities such as medical procedures, diseases, chemicals, and anatomical references, so researchers, clinical data analysts, and others working with scientific literature can locate and extract this information from complex texts. The corpus is meant as training and evaluation data for tools that turn unstructured text into structured data.

Use this if you need high-quality, pre-annotated biomedical text data to train or evaluate systems that automatically identify entities within scientific articles, especially for nested entities (entities within other entities).

Not ideal if you are looking for a tool to directly perform text analysis on your own documents without needing to develop or train a model.
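To illustrate what "nested entities" means in practice, here is a minimal sketch that finds entities contained inside other entities. It assumes the annotations are stored as BRAT-style standoff lines (ID, label with character offsets, covered text, separated by tabs); that format, the sample lines, and the labels DISO and ANATOMY are illustrative assumptions, not taken from this page.

```python
from typing import NamedTuple


class Entity(NamedTuple):
    ident: str
    label: str
    start: int
    end: int
    text: str


def parse_entities(ann_text: str) -> list[Entity]:
    """Parse text-bound entity lines ("T" lines) from BRAT-style standoff text."""
    entities = []
    for line in ann_text.splitlines():
        if not line.startswith("T"):
            continue  # skip relations, notes, and blank lines
        ident, span, text = line.split("\t", 2)
        label, offsets = span.split(" ", 1)
        if ";" in offsets:
            continue  # discontinuous spans are out of scope for this sketch
        start, end = (int(x) for x in offsets.split())
        entities.append(Entity(ident, label, start, end, text))
    return entities


def nested_pairs(entities: list[Entity]) -> list[tuple[Entity, Entity]]:
    """Return (outer, inner) pairs where inner lies strictly inside outer."""
    pairs = []
    for outer in entities:
        for inner in entities:
            if outer is inner:
                continue
            contained = outer.start <= inner.start and inner.end <= outer.end
            same_span = (outer.start, outer.end) == (inner.start, inner.end)
            if contained and not same_span:
                pairs.append((outer, inner))
    return pairs


# Hypothetical sample: an anatomy mention nested inside a disease mention.
SAMPLE_TEXT = "chronic kidney disease"
SAMPLE_ANN = (
    "T1\tDISO 0 22\tchronic kidney disease\n"
    "T2\tANATOMY 8 14\tkidney\n"
)

if __name__ == "__main__":
    for outer, inner in nested_pairs(parse_entities(SAMPLE_ANN)):
        print(f"{inner.label} '{inner.text}' is nested in {outer.label} '{outer.text}'")
```

A flat tagger would have to choose between the two spans; a nested annotation keeps both, which is why a corpus like this is useful for evaluating nested NER systems.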

biomedical-research clinical-data-analysis medical-literature-review scientific-text-mining natural-language-processing-for-medicine
No License No Package No Dependents
Maintenance 10 / 25
Adoption 7 / 25
Maturity 8 / 25
Community 6 / 25


Stars: 30
Forks: 2
Language: Python
License: none listed
Last pushed: Feb 09, 2026
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/nlp/nerel-ds/NEREL-BIO"

Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
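
The same request can be made from Python. This is a minimal sketch using only the standard library. The path layout (category, owner, repository) is inferred from the single example URL above, and the JSON response body is an assumption; the helper names quality_url and fetch_quality are hypothetical.

```python
import json
import urllib.request

# Base path taken from the curl example; the segments after it are
# assumed to be <category>/<owner>/<repo>.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(category: str, owner: str, repo: str) -> str:
    """Build the endpoint URL for one repository."""
    return f"{BASE_URL}/{category}/{owner}/{repo}"


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0):
    """Fetch the quality record; assumes the service returns a JSON body."""
    url = quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


if __name__ == "__main__":
    # Performs a real network request and counts against the daily limit.
    print(fetch_quality("nlp", "nerel-ds", "NEREL-BIO"))
```

Because unauthenticated access is capped at 100 requests per day, caching the result locally is a sensible default for scripts that run repeatedly.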