reidbarber/webmarker
Mark web pages for use with vision-language models
This tool helps web automation engineers and AI developers create more precise interactions with web pages. It takes any live web page and visually marks interactive elements like buttons and links with unique labels and bounding boxes. The output is a screenshot of the marked page that can be fed to a vision-language model, along with a mapping of labels to elements for programmatic interaction, enabling more accurate AI-driven web navigation.
Use this if you need to reliably identify and interact with specific elements on a web page using a vision-language model or an automated script, especially for building AI agents that browse the web.
Not ideal if you want a simple web scraping tool for data extraction, or a general-purpose browser automation library with no vision-language model integration.
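The labeling step described above can be illustrated with a small, self-contained sketch. It is not webmarker's actual API: the names `InteractiveElement`, `MarkedElement`, and `markElements` are hypothetical, and plain objects stand in for DOM elements so the idea can be run anywhere. It assumes labels are numeric and assigned in reading order, which the description does not specify.

```typescript
// Hypothetical sketch: none of these names come from webmarker itself.

/** Bounding box of an interactive element, in page pixels. */
interface BoundingBox {
  x: number;
  y: number;
  width: number;
  height: number;
}

/** Minimal stand-in for a DOM element found on the page. */
interface InteractiveElement {
  selector: string; // a CSS selector that locates the element
  box: BoundingBox;
}

/** One marked element: the label drawn on the screenshot plus its target. */
interface MarkedElement extends InteractiveElement {
  label: string;
}

/**
 * Assign a unique numeric label to each visible element, in reading order
 * (top to bottom, then left to right), and return a label -> element map.
 * Assumption: zero-sized elements are treated as invisible and skipped.
 */
function markElements(
  elements: InteractiveElement[]
): Map<string, MarkedElement> {
  const visible = elements.filter(
    (el) => el.box.width > 0 && el.box.height > 0
  );
  // Copy before sorting so the caller's array is left untouched.
  const ordered = [...visible].sort(
    (a, b) => a.box.y - b.box.y || a.box.x - b.box.x
  );
  const marked = new Map<string, MarkedElement>();
  ordered.forEach((el, index) => {
    const label = String(index);
    marked.set(label, { ...el, label });
  });
  return marked;
}

// Usage: if the model answers "click 1", the label resolves to an element.
const marked = markElements([
  { selector: "a#docs", box: { x: 200, y: 10, width: 80, height: 20 } },
  { selector: "button#login", box: { x: 10, y: 10, width: 60, height: 20 } },
  { selector: "input#hidden", box: { x: 0, y: 0, width: 0, height: 0 } },
]);
const target = marked.get("1");
console.log(target?.selector);
```

Keeping the mapping separate from the screenshot means the vision-language model only has to name a label, while the script holds everything needed to act on the matching element.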
Stars: 58
Forks: 4
Language: TypeScript
License: MIT
Category:
Last pushed: Mar 08, 2026
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/prompt-engineering/reidbarber/webmarker"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
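The same request could be made from TypeScript along these lines. This is a sketch under assumptions: the path layout `/quality/{category}/{owner}/{repo}` is inferred from the single URL in the curl command, the response is assumed to be JSON of unknown shape, and `buildQualityUrl` and `fetchQuality` are hypothetical helper names. It relies on the global `fetch` available in browsers and Node 18+.

```typescript
// Base URL taken from the curl example; everything after it is an inferred pattern.
const API_BASE = "https://pt-edge.onrender.com/api/v1/quality";

/** Build the endpoint URL, assuming the path is {category}/{owner}/{repo}. */
function buildQualityUrl(category: string, owner: string, repo: string): string {
  const path = [category, owner, repo]
    .map((part) => encodeURIComponent(part))
    .join("/");
  return `${API_BASE}/${path}`;
}

/**
 * Fetch the quality data for one repository.
 * Assumption: the body is JSON; its shape is not documented here,
 * so it is returned as `unknown` for the caller to validate.
 */
async function fetchQuality(
  category: string,
  owner: string,
  repo: string
): Promise<unknown> {
  const response = await fetch(buildQualityUrl(category, owner, repo));
  if (!response.ok) {
    throw new Error(`Request failed with status ${response.status}`);
  }
  return response.json();
}

// Usage: reproduces the URL from the curl command.
const url = buildQualityUrl("prompt-engineering", "reidbarber", "webmarker");
console.log(url);
```

With the keyless limit of 100 requests/day, callers that poll many repositories would likely want to cache results rather than call `fetchQuality` on every run.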
Higher-rated alternatives
microsoft/promptflow
Build high-quality LLM apps - from prototyping, testing to production deployment and monitoring.
pezzolabs/pezzo
🕹️ Open-source, developer-first LLMOps platform designed to streamline prompt design, version...
cremich/promptz
Resource Library for AI-assisted software development with kiro
scafoldr/scafoldr
Building an open-source alternative to v0 and Lovable.
promptdesk/promptdesk
Promptdesk is a tool designed for effectively creating, organizing, and evaluating prompts and...