Apple Silicon LLM Inference Transformer Models

18 Apple Silicon LLM inference models are tracked. One scores above 70 (Verified tier). The highest-rated is Blaizzy/mlx-vlm at 81/100 with 2,287 stars. One of the top 10 is actively maintained.

Get all 18 projects as JSON

curl "https://pt-edge.onrender.com/api/v1/datasets/quality?domain=transformers&subcategory=apple-silicon-llm-inference&limit=20"

Open to everyone: 100 requests/day, no key needed. Get a free key for 1,000/day.
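The same request can be made from Python. The sketch below is a minimal, standard-library-only version of the curl call. The helper names `build_url` and `fetch_projects` are hypothetical, and because the response schema is not documented on this page, the code returns the parsed JSON as-is rather than assuming any field names.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Endpoint taken from the curl example; no API key is sent (keyless tier).
BASE_URL = "https://pt-edge.onrender.com/api/v1/datasets/quality"


def build_url(domain="transformers",
              subcategory="apple-silicon-llm-inference",
              limit=20):
    """Build the query URL with the same parameters as the curl example."""
    query = urlencode({"domain": domain,
                       "subcategory": subcategory,
                       "limit": limit})
    return f"{BASE_URL}?{query}"


def fetch_projects(url, timeout=10):
    """Fetch the URL and return the decoded JSON payload unchanged.

    The payload's structure is an unknown here, so nothing is unpacked.
    """
    with urlopen(url, timeout=timeout) as response:
        return json.load(response)
```

Calling `fetch_projects(build_url())` performs one request, which counts against the daily quota. It is worth inspecting the returned object before relying on any particular keys.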

| # | Model | Description | Score | Tier |
|---|-------|-------------|-------|------|
| 1 | Blaizzy/mlx-vlm | MLX-VLM is a package for inference and fine-tuning of Vision Language Models... | 81 | Verified |
| 2 | b4rtaz/distributed-llama | Distributed LLM inference. Connect home devices into a powerful cluster to... | 55 | Established |
| 3 | armbues/SiLLM | SiLLM simplifies the process of training and running Large Language Models... | 43 | Emerging |
| 4 | microsoft/batch-inference | Dynamic batching library for Deep Learning inference. Tutorials for LLM, GPT... | 41 | Emerging |
| 5 | armbues/SiLLM-examples | Examples for using the SiLLM framework for training and running Large... | 39 | Emerging |
| 6 | kolinko/effort | An implementation of bucketMul LLM inference | 38 | Emerging |
| 7 | mrdbourke/mac-ml-speed-test | A few quick scripts focused on testing TensorFlow/PyTorch/Llama 2 on macOS. | 36 | Emerging |
| 8 | unisa-hpc/llm.sycl | The sycl version of llm.c (for the final project of HPC course 2024, UNISA) | 30 | Emerging |
| 9 | matt-k-wong/mlx-flash | Lightning-fast MLX utilities and optimizations for Apple Silicon | 28 | Experimental |
| 10 | Mattbusel/llm-cpp | The C++ LLM toolkit. 26 single-header libraries for streaming, caching, cost... | 25 | Experimental |
| 11 | jranaraki/vllm-fit | A CLI tool designed to simply recommend (conservative), and/or profile (to... | 25 | Experimental |
| 12 | mspronesti/llm.sycl | llm.c, but in SYCL/Intel oneAPI! | 20 | Experimental |
| 13 | tonoy30/Llama | Llama-2 on apple mac using gpu | 18 | Experimental |
| 14 | 1amageek/swift-lm | Hugging Face native LLM inference on Apple Silicon via direct Metal | 18 | Experimental |
| 15 | TheseusInstitute/nix-exllama | Nix derivation for EXLlama | 17 | Experimental |
| 16 | adityonugrohoid/vllm-explorer | Probes and catalogs the full vLLM server API: endpoint reference, model... | 14 | Experimental |
| 17 | javi22020/batch-router | Batch LLM inference Python library | 13 | Experimental |
| 18 | echenim/hf-batch-downloader | Automate bulk downloads of Hugging Face LLMs with retry logic, manifest... | 13 | Experimental |
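The page does not state the score cutoffs for each tier beyond "above 70" for Verified. As a rough check, the sketch below takes the scores and tiers listed above and computes the observed score range per tier. The `ROWS` literal is copied by hand from this listing, and `score_range_by_tier` is a hypothetical helper. The resulting ranges describe only these 18 entries and should not be read as the official thresholds.

```python
# (model, score, tier) copied from the listing on this page.
ROWS = [
    ("Blaizzy/mlx-vlm", 81, "Verified"),
    ("b4rtaz/distributed-llama", 55, "Established"),
    ("armbues/SiLLM", 43, "Emerging"),
    ("microsoft/batch-inference", 41, "Emerging"),
    ("armbues/SiLLM-examples", 39, "Emerging"),
    ("kolinko/effort", 38, "Emerging"),
    ("mrdbourke/mac-ml-speed-test", 36, "Emerging"),
    ("unisa-hpc/llm.sycl", 30, "Emerging"),
    ("matt-k-wong/mlx-flash", 28, "Experimental"),
    ("Mattbusel/llm-cpp", 25, "Experimental"),
    ("jranaraki/vllm-fit", 25, "Experimental"),
    ("mspronesti/llm.sycl", 20, "Experimental"),
    ("tonoy30/Llama", 18, "Experimental"),
    ("1amageek/swift-lm", 18, "Experimental"),
    ("TheseusInstitute/nix-exllama", 17, "Experimental"),
    ("adityonugrohoid/vllm-explorer", 14, "Experimental"),
    ("javi22020/batch-router", 13, "Experimental"),
    ("echenim/hf-batch-downloader", 13, "Experimental"),
]


def score_range_by_tier(rows):
    """Return {tier: (lowest score, highest score)} for the given rows."""
    ranges = {}
    for _model, score, tier in rows:
        low, high = ranges.get(tier, (score, score))
        ranges[tier] = (min(low, score), max(high, score))
    return ranges


for tier, (low, high) in score_range_by_tier(ROWS).items():
    print(f"{tier}: {low}-{high}")
```

On this data the observed ranges are 81 for Verified, 55 for Established, 30 to 43 for Emerging, and 13 to 28 for Experimental, which is consistent with the summary's statement that exactly one project scores above 70.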