Blaizzy/mlx-vlm

MLX-VLM is a package for inference and fine-tuning of Vision Language Models (VLMs) on your Mac using MLX.

Verified

This project helps you understand image, audio, and video content by describing it or answering questions about it. You provide an image, audio, video, or multimodal input along with a question or prompt, and the tool generates a text response. It is designed for anyone working with multimedia content on a Mac who needs to extract information or generate descriptions.
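
The input-plus-prompt workflow described above might look roughly like the following in Python. This is a minimal sketch, not the project's documented API: the `load`, `generate`, `apply_chat_template`, and `load_config` calls follow the usage pattern from the project's README as best recalled and may differ between versions, the model name is only an assumed example, and `running_on_macos` and `describe_image` are hypothetical helper names introduced here for illustration.

```python
import platform


def running_on_macos() -> bool:
    """Return True when the interpreter is running on macOS."""
    return platform.system() == "Darwin"


def describe_image(image, prompt,
                   model_path="mlx-community/Qwen2-VL-2B-Instruct-4bit"):
    """Ask a vision language model a question about one image.

    `image` is a local path or URL. `model_path` is an assumed example
    model, not a recommendation from the listing.
    """
    if not running_on_macos():
        # The listing describes the package as a tool for the Mac.
        raise RuntimeError("this sketch expects to run on macOS")

    # Imported lazily so the module still loads where mlx-vlm is absent.
    # These names are assumptions based on the project's README usage.
    from mlx_vlm import load, generate
    from mlx_vlm.prompt_utils import apply_chat_template
    from mlx_vlm.utils import load_config

    model, processor = load(model_path)
    config = load_config(model_path)

    # Wrap the raw question in the chat format the model expects.
    formatted = apply_chat_template(processor, config, prompt, num_images=1)
    result = generate(model, processor, formatted, [image], verbose=False)

    # Some versions return a plain string, others an object with `.text`.
    return getattr(result, "text", result)
```

Calling `describe_image("photo.jpg", "Describe this image.")` on a Mac with the package installed from PyPI would download the assumed model on first use and return the generated text.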

2,287 stars. Used by 7 other packages. Actively maintained with 44 commits in the last 30 days. Available on PyPI.

Use this if you need to analyze images, audio, or video files to get textual descriptions or answers, especially for tasks like OCR or general content understanding, directly on your Mac.

Not ideal if you need a cloud-based solution or require support for operating systems other than macOS.

multimedia-analysis content-understanding image-description audio-analysis document-processing

Maintenance 20 / 25

Adoption 15 / 25

Maturity 25 / 25

Community 21 / 25

Stars: 2,287

Forks: 293

Language: Python

License: MIT

Related models

b4rtaz/distributed-llama

Distributed LLM inference. Connect home devices into a powerful cluster to accelerate LLM...

armbues/SiLLM

SiLLM simplifies the process of training and running Large Language Models (LLMs) on Apple...

microsoft/batch-inference

Dynamic batching library for Deep Learning inference. Tutorials for LLM, GPT scenarios.

armbues/SiLLM-examples

Examples for using the SiLLM framework for training and running Large Language Models (LLMs) on...

kolinko/effort

An implementation of bucketMul LLM inference
