jundot/omlx

LLM inference server with continuous batching & SSD caching for Apple Silicon — managed from the macOS menu bar

Quality score: 62 / 100 · Established

oMLX helps individual developers and power users run and manage large language models (LLMs) and vision-language models (VLMs) efficiently on Apple Silicon Macs. Point it at a model file and it exposes a local API endpoint and a web dashboard, so you can use the models for tasks like code generation, content creation, or image analysis. It is aimed at developers and technical users who want to run capable AI models locally without relying on cloud services.
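As a rough illustration of the "local API endpoint" workflow, the sketch below sends a chat request from Python. It assumes an OpenAI-compatible chat completions route on localhost; the port, path, model name, and response shape shown here are illustrative assumptions, not values documented on this page.

import requests

# Illustrative sketch only: host, port, route, and model name are assumptions,
# not values documented on this page.
resp = requests.post(
    "http://localhost:8080/v1/chat/completions",
    json={
        "model": "my-local-model",
        "messages": [{"role": "user", "content": "Write a haiku about Apple Silicon."}],
    },
    timeout=120,
)
resp.raise_for_status()
# Assumes an OpenAI-style response shape (choices -> message -> content).
print(resp.json()["choices"][0]["message"]["content"])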

4,057 stars. Actively maintained with 448 commits in the last 30 days.

Use this if you are a developer or AI enthusiast using an Apple Silicon Mac and want to run multiple large language models or vision models locally with optimal performance and easy management.

Not ideal if you need to deploy AI models on non-Apple hardware, prefer cloud-based inference, or do not have a technical background.

Tags: local-AI-inference · Apple-Silicon-ML · LLM-deployment · VLM-applications · developer-tools
No published package · No dependents
Maintenance: 22 / 25
Adoption: 10 / 25
Maturity: 11 / 25
Community: 19 / 25


Stars: 4,057
Forks: 306
Language: Python
License: Apache-2.0
Last pushed: Mar 13, 2026
Commits (30d): 448

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/llm-tools/jundot/omlx"

Open to everyone: 100 requests/day with no API key required. A free key raises the limit to 1,000 requests/day.
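The same request in Python, as a minimal sketch: it assumes only what the curl example above shows (the endpoint returns JSON). The response schema is not documented on this page, so the sketch simply pretty-prints whatever comes back.

import json
import requests

# Same endpoint as the curl example above; no key is needed under the free tier.
url = "https://pt-edge.onrender.com/api/v1/quality/llm-tools/jundot/omlx"
resp = requests.get(url, timeout=30)
resp.raise_for_status()

# The response schema is not documented here, so just pretty-print the JSON.
print(json.dumps(resp.json(), indent=2))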