titipata/pubmed_parser

A Python Parser for PubMed Open-Access XML Subset and MEDLINE XML Dataset

Score: 62 / 100 (Established)

This tool helps researchers, scientists, and medical professionals automatically extract specific information from PubMed Open-Access XML and MEDLINE XML files. Given scientific article data in XML format, it returns structured fields such as titles, abstracts, authors, references, image captions, and full paragraphs. It suits anyone who needs to pull specific details out of large collections of biomedical literature for analysis.
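A minimal sketch of that workflow in Python, assuming the `pubmed_parser` package is installed from PyPI and that its `parse_pubmed_xml` function returns a dictionary with keys such as `full_title`, `abstract`, `journal`, and `author_list` (check the project's documentation for the exact fields). The helpers `parse_article` and `summarize_article` are hypothetical names introduced only for illustration.

```python
from typing import Any, Dict


def parse_article(path: str) -> Dict[str, Any]:
    """Parse one PubMed Open-Access XML file into a dictionary."""
    # Imported lazily so summarize_article can be used without the package installed.
    import pubmed_parser as pp

    return pp.parse_pubmed_xml(path)


def summarize_article(article: Dict[str, Any], abstract_chars: int = 200) -> Dict[str, Any]:
    """Keep a few fields and truncate the abstract (hypothetical helper).

    The key names read here are assumptions about the parser's output.
    """
    abstract = (article.get("abstract") or "").strip()
    return {
        "title": (article.get("full_title") or "").strip(),
        "journal": article.get("journal"),
        "n_authors": len(article.get("author_list") or []),
        "abstract_preview": abstract[:abstract_chars],
    }
```

With a local file, `summarize_article(parse_article("path/to/article.nxml"))` would give a reduced record per article; the path is a placeholder.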

727 stars. No commits in the last 6 months. Available on PyPI.

Use this if you need to systematically extract structured data from vast collections of PubMed Open-Access or MEDLINE XML articles for research, text mining, or natural language processing.

Not ideal if you only need to look up a few articles manually or prefer to work directly with web interfaces instead of programmatic data extraction.

biomedical-research scientific-literature-analysis medical-data-extraction academic-text-mining bibliographic-analysis
Stale (6 months)
Maintenance 2 / 25
Adoption 10 / 25
Maturity 25 / 25
Community 25 / 25
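
The four category scores add up to the overall 62 / 100 shown on this page. A small check, assuming the overall score is the plain sum of the categories (the page does not state the formula):

```python
# Category scores as listed on this page, each out of 25.
category_scores = {
    "Maintenance": 2,
    "Adoption": 10,
    "Maturity": 25,
    "Community": 25,
}

# Assumption: the overall score is the unweighted sum of the categories.
total = sum(category_scores.values())
max_total = 25 * len(category_scores)

print(f"{total} / {max_total}")  # prints "62 / 100"
```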


Stars: 727
Forks: 178
Language: Python
License: MIT
Last pushed: Jul 31, 2025
Commits (30d): 0
Dependencies: 5

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/nlp/titipata/pubmed_parser"

Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000/day.
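
The same request can be made from Python with the standard library. This is only a sketch: the helper names `quality_url` and `fetch_quality` are hypothetical, the `nlp` path segment is copied from the curl URL without knowing which other values are valid, and the response is assumed to be JSON, which the page does not confirm.

```python
import json
import urllib.request

BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(category: str, repo: str) -> str:
    """Build the endpoint URL from a category and an 'owner/name' repository."""
    return f"{BASE_URL}/{category}/{repo}"


def fetch_quality(category: str, repo: str, timeout: float = 10.0):
    """Fetch the quality record; assumes the body is JSON (not confirmed by the page)."""
    with urllib.request.urlopen(quality_url(category, repo), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Calling `fetch_quality("nlp", "titipata/pubmed_parser")` performs the request; without a key, each call counts toward the 100 requests/day limit.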