adbar/htmldate
Fast and robust date extraction from web pages, with Python or on the command-line
This tool helps researchers, content analysts, and anyone working with web archives find the publication date of a web page quickly and accurately. Given a page's URL or its HTML content, it returns either the original publication date or the date of the most recent update, which makes it useful for verifying when online content was published.
146 stars. Used by 3 other packages. Available on PyPI.
Use this if you need to reliably extract publication dates from a large number of web pages for research, content analysis, or data archiving.
Not ideal if you only need to check the dates of a handful of pages by hand; browser developer tools may be enough for that.
Stars: 146
Forks: 29
Language: Python
License: Apache-2.0
Category:
Last pushed: Nov 04, 2025
Commits (30d): 0
Dependencies: 5
Reverse dependents: 3
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/nlp/adbar/htmldate"
Open to everyone: 100 requests/day with no key needed, or 1,000 requests/day with a free key.
Related tools
alvinwan/timefhuman
Extract datetimes and durations from natural language text as Python objects. Supports ranges,...
mike182uk/timestring
Parse a human readable time string into a time based value
akoumjian/datefinder
Find dates inside text using Python and get back datetime objects
DanielJDufour/date-extractor
Extract dates from text
xkzhangsan/xk-time
xk-time...