Blaizzy/mlx-vlm
MLX-VLM is a package for inference and fine-tuning of Vision Language Models (VLMs) on your Mac using MLX.
This project describes or answers questions about images, audio, and video content. You provide a visual, audio, or multi-modal input along with a question or prompt, and the tool generates a textual response. It's designed for anyone working with multimedia content on a Mac who needs to extract information or generate descriptions.
2,287 stars. Used by 7 other packages. Actively maintained with 44 commits in the last 30 days. Available on PyPI.
Use this if you need to analyze images, audio, or video files to get textual descriptions or answers, especially for tasks like OCR or general content understanding, directly on your Mac.
Not ideal if you need a cloud-based solution or require support for operating systems other than macOS.
Stars: 2,287
Forks: 293
Language: Python
License: MIT
Last pushed: Mar 11, 2026
Commits (30d): 44
Dependencies: 12
Reverse dependents: 7
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/transformers/Blaizzy/mlx-vlm"
Open to everyone: 100 requests/day with no key needed. Get a free key for 1,000 requests/day.
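A minimal sketch of calling this endpoint from Python, using only the standard library. The base URL and path segments (ecosystem `transformers`, repo `Blaizzy/mlx-vlm`) come from the curl example above; the function names and the assumption that the API returns JSON are illustrative, not part of the documented API.

```python
import json
import urllib.request

BASE = "https://pt-edge.onrender.com/api/v1/quality"

def quality_url(ecosystem: str, repo: str) -> str:
    """Build the quality-endpoint URL shown in the curl example above."""
    return f"{BASE}/{ecosystem}/{repo}"

def fetch_quality(ecosystem: str, repo: str) -> dict:
    """Fetch the endpoint and decode the payload (assumes a JSON response)."""
    with urllib.request.urlopen(quality_url(ecosystem, repo)) as resp:
        return json.load(resp)

if __name__ == "__main__":
    # Prints the same URL the curl command requests.
    print(quality_url("transformers", "Blaizzy/mlx-vlm"))
```

Swap in an `Authorization` header via `urllib.request.Request` if you register a key for the higher rate limit.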
Related models
b4rtaz/distributed-llama
Distributed LLM inference. Connect home devices into a powerful cluster to accelerate LLM...
armbues/SiLLM
SiLLM simplifies the process of training and running Large Language Models (LLMs) on Apple...
microsoft/batch-inference
Dynamic batching library for Deep Learning inference. Tutorials for LLM, GPT scenarios.
armbues/SiLLM-examples
Examples for using the SiLLM framework for training and running Large Language Models (LLMs) on...
kolinko/effort
An implementation of bucketMul LLM inference