IDEA-Research/DINO-X-MCP

Official DINO-X Model Context Protocol (MCP) server that empowers LLMs with real-world visual perception through image object detection, localization, and captioning APIs.

Score: 42 / 100 (Emerging)

This project exposes DINO-X's image analysis capabilities to large language models (LLMs) through an MCP server, allowing them to "see" and understand images. It takes an image and optional text prompts as input, and outputs detailed information about the objects in the image, including their categories, counts, locations, and descriptive captions. It is designed for developers building multimodal AI applications that need rich visual context from images.
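To make the input/output description above more concrete, here is a minimal TypeScript sketch of how an application might count detected objects per category. The `DetectedObject` shape, its field names, the bounding-box layout, and the sample values are all assumptions made for illustration; they are not taken from the DINO-X API or the MCP server's actual response format.

```typescript
// Hypothetical shape for one detected object. Field names are illustrative
// only and are NOT the documented DINO-X response schema.
interface DetectedObject {
  category: string;
  // Assumed layout: [xMin, yMin, xMax, yMax] in pixels.
  bbox: [number, number, number, number];
  caption?: string;
}

// Count how many detected objects fall into each category.
function countByCategory(objects: DetectedObject[]): Record<string, number> {
  const counts: Record<string, number> = {};
  for (const obj of objects) {
    counts[obj.category] = (counts[obj.category] ?? 0) + 1;
  }
  return counts;
}

// Made-up sample data standing in for a detection result.
const sample: DetectedObject[] = [
  { category: "dog", bbox: [10, 20, 110, 220], caption: "a brown dog sitting" },
  { category: "dog", bbox: [150, 30, 260, 210] },
  { category: "ball", bbox: [300, 180, 340, 220] },
];

console.log(countByCategory(sample));
```

In a real integration the object list would come from the server's tool response rather than a hard-coded array, and the field names would need to match whatever that response actually contains.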


Use this if you are a developer building an AI application or agent that needs to understand and describe the contents of images, perform object counting, or locate specific features within an image.

Not ideal if you are looking for a standalone end-user application for image editing or simple photo organization without integrating it into a larger AI system.

Tags: AI application development, multimodal AI, computer vision integration, object detection, image captioning
No Package, No Dependents
Maintenance 6 / 25
Adoption 9 / 25
Maturity 15 / 25
Community 12 / 25


Stars: 113
Forks: 11
Language: TypeScript
License: Apache-2.0
Last pushed: Oct 28, 2025
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/mcp/IDEA-Research/DINO-X-MCP"

Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
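The same request can be made from TypeScript. The sketch below builds the endpoint URL shown in the curl command and fetches it. The helper names `qualityUrl` and `fetchQuality` and the `FetchLike` type are hypothetical, and because the response schema is not described here, the body is returned as untyped JSON.

```typescript
// Base path taken from the curl example; everything else here is a sketch.
const BASE_URL = "https://pt-edge.onrender.com/api/v1/quality/mcp";

// Minimal structural type for a fetch-style function, so the caller can pass
// the global `fetch` (Node 18+ or browsers) or a stub.
type FetchLike = (url: string) => Promise<{
  ok: boolean;
  status: number;
  json(): Promise<unknown>;
}>;

// Build the endpoint URL for a repository given as "owner/name".
function qualityUrl(repo: string): string {
  const parts = repo.split("/");
  if (parts.length !== 2 || !parts[0] || !parts[1]) {
    throw new Error(`expected "owner/name", got "${repo}"`);
  }
  return `${BASE_URL}/${encodeURIComponent(parts[0])}/${encodeURIComponent(parts[1])}`;
}

// Fetch the data for one repository. The response format is not documented
// in this page, so the parsed JSON is returned as `unknown`.
async function fetchQuality(repo: string, fetchImpl: FetchLike): Promise<unknown> {
  const res = await fetchImpl(qualityUrl(repo));
  if (!res.ok) {
    throw new Error(`request failed: HTTP ${res.status}`);
  }
  return res.json();
}

// Usage (assumes a runtime with a global fetch):
//   fetchQuality("IDEA-Research/DINO-X-MCP", fetch).then(console.log);
```

Passing the fetch function in as a parameter keeps the sketch free of runtime assumptions and makes it easy to stay within the daily request limit during testing by substituting a stub.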