markusmobius/go-htmldate
CLI and Go package for extracting the publication date of web pages.
This tool helps researchers, journalists, and data analysts accurately determine when a web page was originally published or last updated. You provide a web page URL, and it gives you the publication date and optionally the time. It is designed for anyone needing to timestamp online content reliably, especially when source information is ambiguous.
No commits in the last 6 months.
Use this if you need to programmatically extract the publication date and time from a large number of web pages with high accuracy and speed.
Not ideal if your primary need is to extract the modified date, as its accuracy for this specific task has not been as thoroughly validated.
Stars: 11
Forks: 2
Language: HTML
License: Apache-2.0
Category
Last pushed: May 02, 2025
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/nlp/markusmobius/go-htmldate"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
Compare
Higher-rated alternatives
adbar/htmldate
Fast and robust date extraction from web pages, with Python or on the command-line
alvinwan/timefhuman
Extract datetimes and durations from natural language text as Python objects. Supports ranges,...
mike182uk/timestring
Parse a human readable time string into a time based value
akoumjian/datefinder
Find dates inside text using Python and get back datetime objects
DanielJDufour/date-extractor
Extract dates from text