Blaizzy/mlx-vlm

MLX-VLM is a package for inference and fine-tuning of Vision Language Models (VLMs) on your Mac using MLX.

81 / 100 (Verified)

This project helps you understand images, audio, and video content by describing or answering questions about them. You provide a visual, audio, or multi-modal input and a question or prompt, and the tool generates a textual response. It's designed for anyone working with multimedia content on a Mac who needs to extract information or generate descriptions.

2,287 stars. Used by 7 other packages. Actively maintained with 44 commits in the last 30 days. Available on PyPI.

Use this if you need to analyze images, audio, or video files to get textual descriptions or answers, especially for tasks like OCR or general content understanding, directly on your Mac.

Not ideal if you need a cloud-based solution or require support for operating systems other than macOS.

multimedia-analysis content-understanding image-description audio-analysis document-processing
Maintenance 20 / 25
Adoption 15 / 25
Maturity 25 / 25
Community 21 / 25

Stars: 2,287
Forks: 293
Language: Python
License: MIT
Last pushed: Mar 11, 2026
Commits (30d): 44
Dependencies: 12
Reverse dependents: 7

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/transformers/Blaizzy/mlx-vlm"

Open to everyone: 100 requests/day, no key needed. Get a free key for 1,000/day.
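
For scripted use, the curl request above can be sketched in Python using only the standard library. This is a minimal sketch under stated assumptions: the listing does not document the response format, so the code assumes the endpoint returns JSON, and it treats the three path segments as a category, an owner, and a repository name purely for illustration. The helper names `quality_url` and `fetch_quality` are hypothetical and not part of any published client.

```python
import json
import urllib.request
from urllib.parse import quote

# Base path taken from the curl example in the listing.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(category: str, owner: str, repo: str) -> str:
    """Build the endpoint URL (hypothetical helper).

    Assumption: the path is /<category>/<owner>/<repo>, inferred only from
    the single example URL shown in the listing.
    """
    parts = [quote(part, safe="") for part in (category, owner, repo)]
    return BASE_URL + "/" + "/".join(parts)


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0):
    """Fetch the quality data for one repository (hypothetical helper).

    Assumption: the response body is JSON; its fields are not documented
    in the listing, so the parsed object is returned as-is.
    """
    url = quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.load(response)


# Example call (performs a network request, so it is left commented out):
# data = fetch_quality("transformers", "Blaizzy", "mlx-vlm")
# print(json.dumps(data, indent=2))
```

Unauthenticated requests count toward the 100 requests/day limit, so a script that polls many repositories may want to cache responses locally.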