N1k1tung/infer-ring
Infer Ring is an iOS and macOS app that facilitates cross-device LLM inference using MLX.
Infer Ring helps you run large language models (LLMs) directly on your Apple devices, even when no single device has enough memory. It splits the model you want to use across multiple iPhones, iPads, and Macs, letting you interact with larger models entirely locally; a conceptual sketch of this layer-splitting idea follows the notes below. It is aimed at researchers, developers, and hobbyists who want to experiment with large AI models without powerful cloud servers.
Use this if you want to run powerful large language models locally on your Apple devices by combining their memory, rather than relying on expensive cloud services.
Not ideal if you need very fast token generation for real-time applications: passing activations between devices over the network adds latency, so performance may lag a single, sufficiently powerful machine.
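To make the sharding idea concrete, here is a minimal Swift sketch of one plausible scheme: give each device a contiguous block of transformer layers in proportion to the memory it contributes. This is an illustration under stated assumptions only; the DeviceBudget type, the partitionLayers function, and the proportional split are hypothetical, not Infer Ring's actual implementation (the app does its inference with MLX).

import Foundation

// Hypothetical sketch: split a model's transformer layers across a
// "ring" of devices in proportion to the memory each one contributes.
// Not Infer Ring's actual code.
struct DeviceBudget {
    let name: String
    let memoryGB: Double   // memory this device can dedicate to weights
}

// Assigns each device a contiguous range of layer indices; the last
// device absorbs any rounding remainder so every layer is covered.
func partitionLayers(totalLayers: Int, devices: [DeviceBudget]) -> [(device: String, layers: Range<Int>)] {
    let totalMemory = devices.reduce(0.0) { $0 + $1.memoryGB }
    var assignments: [(device: String, layers: Range<Int>)] = []
    var nextLayer = 0
    for (index, device) in devices.enumerated() {
        let share = device.memoryGB / totalMemory * Double(totalLayers)
        let count = index == devices.count - 1
            ? totalLayers - nextLayer
            : min(Int(share.rounded()), totalLayers - nextLayer)
        assignments.append((device: device.name, layers: nextLayer ..< nextLayer + count))
        nextLayer += count
    }
    return assignments
}

// Example: a 32-layer model spread over three devices with made-up budgets.
let ring = [
    DeviceBudget(name: "iPhone", memoryGB: 4),
    DeviceBudget(name: "iPad", memoryGB: 8),
    DeviceBudget(name: "Mac", memoryGB: 20),
]
for shard in partitionLayers(totalLayers: 32, devices: ring) {
    print("\(shard.device): layers \(shard.layers)")   // e.g. "Mac: layers 12..<32"
}

A contiguous split means only one activation hand-off per device boundary, which keeps cross-device traffic low on a local network.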
Stars: 9
Forks: 1
Language: Swift
License: MIT
Category: llm-tools
Last pushed: Feb 21, 2026
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/llm-tools/N1k1tung/infer-ring"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
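If you would rather call the endpoint from Swift than from the shell, a minimal sketch using URLSession follows. It assumes nothing beyond what the curl example shows: an unauthenticated GET that returns a JSON body, which the sketch prints as-is, since the response schema isn't documented here.

import Foundation

// Minimal sketch: fetch the quality data for this repo and print the
// raw response body. Assumes the free, keyless tier described above.
let url = URL(string: "https://pt-edge.onrender.com/api/v1/quality/llm-tools/N1k1tung/infer-ring")!

let done = DispatchSemaphore(value: 0)
URLSession.shared.dataTask(with: url) { data, _, error in
    defer { done.signal() }
    if let data = data, let body = String(data: data, encoding: .utf8) {
        print(body)             // raw JSON; schema not documented here
    } else if let error = error {
        print("Request failed: \(error)")
    }
}.resume()
done.wait()                     // block the script until the request completes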
Higher-rated alternatives
jundot/omlx
LLM inference server with continuous batching & SSD caching for Apple Silicon — managed from the...
josStorer/RWKV-Runner
A RWKV management and startup tool, fully automated, only 8MB, providing an interface...
jordanhubbard/nanolang
A tiny experimental language designed to be targeted by coding LLMs
waybarrios/vllm-mlx
OpenAI and Anthropic compatible server for Apple Silicon. Run LLMs and vision-language models...
akivasolutions/tightwad
Pool your CUDA + ROCm GPUs into one OpenAI-compatible API. Speculative decoding proxy gives you...