reidbarber/webmarker

Mark web pages for use with vision-language models

Score: 42 / 100 (Emerging)

This tool helps web automation engineers and AI developers create more precise interactions with web pages. It takes any live web page and visually marks interactive elements like buttons and links with unique labels and bounding boxes. The output is a screenshot of the marked page that can be fed to a vision-language model, along with a mapping of labels to elements for programmatic interaction, enabling more accurate AI-driven web navigation.
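The labeling step described above can be sketched without a browser. This is a minimal illustration, not the webmarker API: the `Candidate` and `Mark` types, the `labelElements` function, and the list of tags treated as interactive are all assumptions made for this example. It shows the general idea of giving each visible interactive element a unique label and keeping a label-to-element mapping that a script can use after the model picks a label.

```typescript
// Hypothetical types for illustration; not taken from the webmarker source.
interface Box { x: number; y: number; width: number; height: number }
interface Candidate { selector: string; tag: string; box: Box }
interface Mark { label: string; selector: string; box: Box }

// Tags treated as interactive in this sketch (an assumption).
const INTERACTIVE_TAGS: string[] = ["a", "button", "input", "select", "textarea"];

// Assign sequential numeric labels to visible interactive elements and
// return a mapping from label to the element's selector and bounding box.
function labelElements(candidates: Candidate[]): Record<string, Mark> {
  const marks: Record<string, Mark> = {};
  let next = 0;
  for (const c of candidates) {
    const interactive = INTERACTIVE_TAGS.indexOf(c.tag.toLowerCase()) !== -1;
    const visible = c.box.width > 0 && c.box.height > 0;
    if (!interactive || !visible) continue; // skip static or hidden elements
    const label = String(next);
    next += 1;
    marks[label] = { label, selector: c.selector, box: c.box };
  }
  return marks;
}

// Example input: one link, one paragraph, one button, one zero-size input.
const exampleMarks = labelElements([
  { selector: "#home", tag: "A", box: { x: 0, y: 0, width: 80, height: 20 } },
  { selector: "#intro", tag: "p", box: { x: 0, y: 30, width: 400, height: 60 } },
  { selector: "#go", tag: "button", box: { x: 0, y: 100, width: 60, height: 24 } },
  { selector: "#hidden", tag: "input", box: { x: 0, y: 0, width: 0, height: 0 } },
]);
```

In a real pipeline the bounding boxes would be drawn onto the page before taking the screenshot, and the model's chosen label would be looked up in the mapping to find the element to click or type into.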

Use this if you need to reliably identify and interact with specific elements on a web page using a vision-language model or an automated script, especially for building AI agents that browse the web.

Not ideal if you want a simple web scraping tool for data extraction, or a general-purpose browser automation library with no vision-language model integration.

Tags: web-automation, AI-agents, vision-language-models, robot-process-automation, web-testing
No Package, No Dependents
Maintenance 10 / 25
Adoption 8 / 25
Maturity 16 / 25
Community 8 / 25

Stars: 58
Forks: 4
Language: TypeScript
License: MIT
Last pushed: Mar 08, 2026
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/prompt-engineering/reidbarber/webmarker"

Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000/day.
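The same request can be made from TypeScript. The sketch below only assumes what the curl command shows: the base URL and the `category/owner/repo` path layout. The helper names `qualityUrl` and `fetchQuality` are hypothetical, the response shape is not documented here so it is returned as `unknown`, and a runtime with a global `fetch` (a browser or Node 18+) is assumed.

```typescript
// Base URL copied from the curl command above; the helpers are illustrative.
const API_BASE = "https://pt-edge.onrender.com/api/v1/quality";

// Build the endpoint URL for a given category and repository.
function qualityUrl(category: string, owner: string, repo: string): string {
  const path = [category, owner, repo].map((part) => encodeURIComponent(part));
  return [API_BASE].concat(path).join("/");
}

// Fetch the data for one repository and return the parsed JSON body.
// The response shape is not specified in this listing, hence `unknown`.
async function fetchQuality(category: string, owner: string, repo: string): Promise<unknown> {
  const res = await fetch(qualityUrl(category, owner, repo));
  if (!res.ok) {
    throw new Error("Request failed with status " + res.status);
  }
  return res.json();
}
```

Calling `fetchQuality("prompt-engineering", "reidbarber", "webmarker")` issues the same GET request as the curl command. Unauthenticated calls count against the 100 requests/day limit.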