nerel-ds/NEREL-BIO

NEREL-BIO: A Dataset of Biomedical Abstracts Annotated with Nested Named Entities

Emerging

NEREL-BIO is a corpus of biomedical research abstracts from PubMed, in both Russian and English, annotated with nested named entities. It helps researchers, clinical data analysts, and others who work with scientific literature to identify and extract mentions of medical procedures, diseases, chemicals, and anatomical references from complex texts. The corpus serves as training and evaluation input for tools that turn unstructured text into structured data.

Use this if you need pre-annotated biomedical text to train or evaluate systems that automatically identify entities in scientific articles, especially nested entities (entities contained within other entities).

Not ideal if you want a ready-made tool that analyzes your own documents without developing or training a model.
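To make "nested entities" concrete, here is a minimal Python sketch of one entity span sitting inside another. The `Span` type, the `nested_pairs` helper, the label names, and the example phrase are all hypothetical illustrations; they are not taken from the corpus and do not reflect its actual file format or label set.

```python
from typing import NamedTuple


class Span(NamedTuple):
    start: int  # character offset, inclusive
    end: int    # character offset, exclusive
    label: str  # hypothetical label name, not the corpus's real tag set


def nested_pairs(spans):
    """Return (outer, inner) pairs where inner lies within outer
    and the two spans do not cover exactly the same characters."""
    pairs = []
    for outer in spans:
        for inner in spans:
            contains = outer.start <= inner.start and inner.end <= outer.end
            same_extent = (outer.start, outer.end) == (inner.start, inner.end)
            if contains and not same_extent:
                pairs.append((outer, inner))
    return pairs


# Illustrative text and annotations (assumed, not from NEREL-BIO).
text = "chronic kidney disease"
spans = [
    Span(0, 22, "DISEASE"),  # "chronic kidney disease"
    Span(8, 14, "ANATOMY"),  # "kidney"
]

for outer, inner in nested_pairs(spans):
    print(
        f"{text[inner.start:inner.end]!r} ({inner.label}) is nested in "
        f"{text[outer.start:outer.end]!r} ({outer.label})"
    )
```

A flat tagging scheme would have to choose one of the two spans; a nested scheme keeps both, which is why evaluation and training code for such data usually works on sets of spans rather than one tag per token.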

biomedical-research clinical-data-analysis medical-literature-review scientific-text-mining natural-language-processing-for-medicine

No License No Package No Dependents

Maintenance 10 / 25

Adoption 7 / 25

Maturity 8 / 25

Community 6 / 25

Language Python

License none

Higher-rated alternatives

MantisAI/nervaluate

Full named-entity (i.e., not tag/token) evaluation metrics based on SemEval’13

dice-group/gerbil

GERBIL - General Entity annotatoR Benchmark

bltlab/seqscore

SeqScore: Scoring for named entity recognition and other sequence labeling tasks

syuoni/eznlp

Easy Natural Language Processing

LHNCBC/metamaplite

A near real-time named-entity recognizer
