practikpharma/PGxCorpus
PGxCorpus is a manually annotated corpus designed for the extraction of pharmacogenomic relations from text.
Pharmacogenomics researchers and clinical pharmacologists can use this manually annotated collection of scientific sentences to study relationships between genes, drugs, and diseases. The sentences are drawn from PubMed abstracts and annotated with key pharmacogenomic entities and the relations between them. The resource suits anyone studying how genetic factors drive variability in drug response.
No commits in the last 6 months.
Use this if you need a meticulously categorized dataset of pharmacogenomic information to train or validate systems that automatically extract drug-gene interactions from scientific literature.
Not ideal if you are looking for a tool to perform live text analysis or to directly query a database of pharmacogenomic facts rather than a corpus for machine learning.
Stars: 8
Forks: 4
Language: Lua
License: none listed
Category: (not listed)
Last pushed: Oct 28, 2019
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/nlp/practikpharma/PGxCorpus"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
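For scripted access, the curl command above can be mirrored in Python. The following is a minimal sketch using only the standard library. The helper names `build_quality_url` and `fetch_quality` are hypothetical, and the assumption that the endpoint returns a JSON body is not confirmed by this page.

```python
import json
import urllib.request
from urllib.parse import quote

# Base path taken from the curl example; everything else here is illustrative.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def build_quality_url(category: str, owner: str, repo: str) -> str:
    """Build a URL shaped like the curl example: BASE_URL/<category>/<owner>/<repo>."""
    # Percent-encode each path segment so unusual characters cannot break the URL.
    path = "/".join(quote(part, safe="") for part in (category, owner, repo))
    return BASE_URL + "/" + path


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0):
    """Send a GET request and decode the body as JSON (assumed response format)."""
    url = build_quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `fetch_quality("nlp", "practikpharma", "PGxCorpus")` would request the same URL as the curl command. Each call counts against the daily request limit, so caching the result locally is a sensible precaution.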
Higher-rated alternatives
Helsinki-NLP/OpusFilter
OpusFilter - Parallel corpus processing toolkit
natasha/corus
Links to Russian corpora + Python functions for loading and parsing
darija-open-dataset/dataset
Darija <-> English dataset
omicsNLP/Auto-CORPus
Auto-CORPus pipeline developed by a University of Nottingham and Imperial College London...
SergeyShk/ruTS
Library for extracting statistics from Russian-language texts.