addy999/omniparser-api

Self-hosted version of Microsoft's OmniParser image-to-text model

39 / 100 (Emerging)

This tool helps software developers integrate the OmniParser image-to-text model directly into their applications. It takes a screenshot or UI image as input and outputs structured data about the UI elements, including their text, descriptions, and clickable regions. This is ideal for developers building AI agents that interact with user interfaces or automate web workflows.

No commits in the last 6 months.

Use this if you are a developer building an application that needs to programmatically understand and interact with UI elements from screenshots, and you require fast, self-hosted processing without rate limits.

Not ideal if you are looking for a simple web-based tool for one-off image-to-text conversions or if you do not have the technical expertise to deploy and manage a Dockerized application.

AI agent development, UI automation, web scraping, computer vision, application integration
No License, Stale 6m, No Package, No Dependents
Maintenance 2 / 25
Adoption 9 / 25
Maturity 8 / 25
Community 20 / 25

Stars: 83
Forks: 23
Language: Python
License: None
Last pushed: May 29, 2025
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/llm-tools/addy999/omniparser-api"

Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
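
The same request can be made from Python. The following is a minimal sketch using only the standard library. The URL layout is taken from the curl example above; the helper names `quality_url` and `fetch_quality` are hypothetical, and the response is assumed (not confirmed here) to be JSON, so it is returned without interpreting any fields.

```python
import json
import urllib.request

# Base path copied from the curl example; everything after it is owner/repo.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality/llm-tools"


def quality_url(owner: str, repo: str) -> str:
    """Build the quality-data URL for a repository (hypothetical helper)."""
    return f"{BASE_URL}/{owner}/{repo}"


def fetch_quality(owner: str, repo: str, timeout: float = 10.0):
    """Fetch the quality data for a repository without an API key.

    Assumption: the endpoint returns a JSON body. The schema is not
    documented in this listing, so the decoded object is returned as-is.
    """
    with urllib.request.urlopen(quality_url(owner, repo), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Calling `fetch_quality("addy999", "omniparser-api")` issues the same GET request as the curl command. Keyless access is limited to 100 requests/day, so it may be worth caching the result rather than fetching on every run.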