sergiog95/csabstracts
Dataset of scientific abstracts for the purpose of sentence classification
This dataset helps researchers, NLP engineers, and data scientists working with scientific literature to automatically categorize sentences within computer science abstracts. Each abstract sentence comes pre-labeled with a category such as 'Background,' 'Objective,' 'Methods,' 'Results,' or 'Conclusions.' This makes it suitable for training and evaluating machine learning models designed to understand the structure of scientific papers.
No commits in the last 6 months.
Use this if you need a pre-labeled collection of computer science abstract sentences to train or test models for automatic text summarization, information extraction, or scientific document understanding.
Not ideal if you are looking for abstracts outside of computer science or if you need full-text articles rather than just abstract sentences.
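To make the labeling scheme described above concrete, here is a minimal sketch of how sentence/label pairs might be checked and summarized before training. The rows are hypothetical placeholders, not taken from the dataset, and the repository's actual file format and full label set are not stated here, so `LABELS`, `toy_rows`, and `label_distribution` are assumptions for illustration only.

```python
from collections import Counter

# Label names as listed in the description above (the real set may differ).
LABELS = ("Background", "Objective", "Methods", "Results", "Conclusions")

# Hypothetical placeholder rows, NOT taken from the dataset.
toy_rows = [
    ("Sentence classification helps structure abstracts.", "Background"),
    ("We aim to label each sentence by its role.", "Objective"),
    ("We fine-tune a classifier on labeled sentences.", "Methods"),
    ("The classifier improves over the baseline.", "Results"),
]


def label_distribution(rows):
    """Count sentences per label, rejecting labels outside LABELS."""
    counts = Counter()
    for _sentence, label in rows:
        if label not in LABELS:
            raise ValueError(f"unknown label: {label!r}")
        counts[label] += 1
    # Report every label, including ones with zero sentences.
    return {label: counts[label] for label in LABELS}


distribution = label_distribution(toy_rows)
```

Checking the label distribution first is a cheap way to spot class imbalance or unexpected label names before fitting a classifier.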
Stars: 10
Forks: 2
Language: not listed
License: not listed
Category:
Last pushed: Sep 17, 2019
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/nlp/sergiog95/csabstracts"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000/day.
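The same request can be made from Python. This is a minimal sketch using only the standard library: the URL shape is copied from the curl command above, while the helper names `build_quality_url` and `fetch_quality` are hypothetical, and the assumption that the endpoint returns JSON is not confirmed by this page.

```python
import json
import urllib.request

BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def build_quality_url(owner, repo, category="nlp"):
    """Build the endpoint URL in the same shape as the curl example."""
    return f"{BASE_URL}/{category}/{owner}/{repo}"


def fetch_quality(owner, repo, category="nlp", timeout=10):
    """Fetch the record; assumes a JSON response body (not confirmed)."""
    url = build_quality_url(owner, repo, category)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


url = build_quality_url("sergiog95", "csabstracts")
```

Calling `fetch_quality("sergiog95", "csabstracts")` performs the actual network request, so it counts against the daily request limit.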
Higher-rated alternatives
titipata/pubmed_parser
A Python Parser for PubMed Open-Access XML Subset and MEDLINE XML Dataset
nfflow/pubmedflow
Data Collection API for pubmed
greenelab/snorkeling
Extracting biomedical relationships from literature with Snorkel 🏊
purplepotion/sadrat
Smart Adverse Drug Reaction Assessment Tools.
KarelDO/BioDEX
BioDEX: Large-Scale Biomedical Adverse Drug Event Extraction for Real-World Pharmacovigilance.