adbar/htmldate

Fast and robust date extraction from web pages, with Python or on the command line

Score: 64 / 100 (Established)

htmldate helps researchers, content analysts, and anyone working with web archives find the publication date of web pages quickly and accurately. Give it a URL or the page's HTML, and it returns either the original publication date or the date of the most recent update.
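As a rough illustration of that workflow, here is a minimal Python sketch built on htmldate's `find_date` function. The wrapper name `publication_date` and the `SAMPLE_HTML` snippet are hypothetical and exist only for this example. The package must be installed separately (for instance with `pip install htmldate`).

```python
from typing import Optional

# Hypothetical sample page; a real page would usually come from a URL or an archive.
SAMPLE_HTML = (
    "<html><head>"
    '<meta property="article:published_time" content="2017-09-01"/>'
    "</head><body><p>Hello</p></body></html>"
)


def publication_date(page: str, original: bool = False) -> Optional[str]:
    """Return the date found for `page`, or None if no date is detected.

    `page` may be a URL or a string of HTML. With original=True the
    original publication date is preferred over the last update.
    """
    # Imported lazily so this sketch can be loaded even where htmldate
    # is not installed; in normal code the import belongs at the top.
    from htmldate import find_date

    return find_date(page, original_date=original)
```

Calling `publication_date(SAMPLE_HTML)` passes the HTML string straight through; passing a URL instead makes htmldate download the page first, so that variant needs network access.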

146 stars. Used by 3 other packages. Available on PyPI.

Use this if you need to reliably extract publication dates from a large number of web pages for research, content analysis, or data archiving.

Not ideal if you only need dates from a handful of pages: checking them by hand with browser developer tools may be enough.

web-archiving content-analysis data-journalism academic-research digital-librarianship
Maintenance 6 / 25
Adoption 13 / 25
Maturity 25 / 25
Community 20 / 25


Stars: 146
Forks: 29
Language: Python
License: Apache-2.0
Last pushed: Nov 04, 2025
Commits (30d): 0
Dependencies: 5
Reverse dependents: 3

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/nlp/adbar/htmldate"

Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
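The same request can be sketched in Python using only the standard library. Two assumptions are not confirmed by this page: that the endpoint returns JSON, and that the path follows a category/owner/repository pattern (inferred from the single URL above). The function names `quality_url` and `fetch_quality` are hypothetical.

```python
import json
import urllib.request
from urllib.parse import quote

# Base of the endpoint shown in the curl command above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(category: str, owner: str, repo: str) -> str:
    """Build the request URL, assuming a category/owner/repo path layout."""
    segments = (category, owner, repo)
    return BASE_URL + "/" + "/".join(quote(s, safe="") for s in segments)


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0):
    """Fetch and decode the response, assuming the body is JSON."""
    url = quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.load(response)
```

For this project the call would be `fetch_quality("nlp", "adbar", "htmldate")`. With no key, requests count against the 100 per day limit, so caching results locally is sensible for batch jobs.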