madroidmaq/mlx-omni-server

MLX Omni Server is a local inference server powered by Apple's MLX framework, specifically designed for Apple Silicon (M-series) chips. It implements OpenAI-compatible API endpoints, enabling seamless integration with existing OpenAI SDK clients while leveraging the power of local ML inference.
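As a sketch of what that compatibility looks like in practice, an existing OpenAI SDK client only needs its base_url repointed at the local server. This is a minimal example, assuming the server is running on its default port 10240 and that the model named below has already been downloaded locally; both the port and the model name are illustrative assumptions.

from openai import OpenAI

# Point the standard OpenAI Python client at the local server.
# Port 10240 and the model name are illustrative assumptions.
client = OpenAI(
    base_url="http://localhost:10240/v1",
    api_key="not-needed",  # a local server requires no real API key
)

response = client.chat.completions.create(
    model="mlx-community/Llama-3.2-3B-Instruct-4bit",
    messages=[{"role": "user", "content": "Say hello from Apple Silicon."}],
)
print(response.choices[0].message.content)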

Score: 64 / 100 (Established)

This project helps developers integrate powerful AI capabilities directly into their Mac applications without sending data to external cloud services. It wraps local AI models behind standard OpenAI-compatible API endpoints, so existing code written against the OpenAI SDK can run these models on your Apple Silicon chip. This is ideal for developers building Mac-based applications that require secure, high-performance local AI processing.

678 stars. Available on PyPI.

Use this if you are a developer building an application on an Apple Silicon Mac and want to use AI models locally for chat, audio processing, image generation, or embeddings, while maintaining privacy and control.

Not ideal if you need to run AI models on Windows or Linux, or if your application requires the scale and features of cloud-based AI services.
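For the use cases above, the same client pattern extends beyond chat. A minimal embeddings sketch, again assuming the default local port and using a placeholder model name:

from openai import OpenAI

client = OpenAI(base_url="http://localhost:10240/v1", api_key="not-needed")

# The model name is a placeholder; substitute whichever embedding
# model the server has available locally.
result = client.embeddings.create(
    model="mlx-community/all-MiniLM-L6-v2-4bit",
    input="Local embeddings on Apple Silicon",
)
print(len(result.data[0].embedding))  # vector dimensionality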

Tags: macOS-development, local-AI, application-development, privacy-preserving-AI, offline-AI
Maintenance: 10 / 25
Adoption: 10 / 25
Maturity: 25 / 25
Community: 19 / 25

Stars: 678
Forks: 84
Language: Python
License: MIT
Last pushed: Mar 10, 2026
Commits (30d): 0
Dependencies: 15

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/generative-ai/madroidmaq/mlx-omni-server"

Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
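For programmatic use, here is a small sketch in Python with the requests library; it assumes only that the endpoint returns a JSON object, not any particular schema.

import requests

url = (
    "https://pt-edge.onrender.com/api/v1/quality/"
    "generative-ai/madroidmaq/mlx-omni-server"
)
resp = requests.get(url, timeout=10)
resp.raise_for_status()

# Print whatever fields the API returns; the schema is not assumed here.
for key, value in resp.json().items():
    print(f"{key}: {value}")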