mechramc/Orion
Local AI runtime for training & running small LLMs directly on Apple Neural Engine (ANE). No CoreML. No Metal. Offline, on-device fine-tuning & inference on M-series silicon.
This tool helps AI practitioners train and run small language models (LLMs) directly on their Apple M-series devices, using the dedicated Neural Engine. You can input text prompts for immediate responses or provide a dataset to fine-tune a model. It's designed for machine learning engineers and researchers who want to develop and test LLMs locally without relying on cloud services or external GPUs.
Use this if you need to fine-tune or run small LLMs efficiently and privately on your Apple Silicon device, leveraging its dedicated AI hardware.
Not ideal if you need to work with very large, general-purpose LLMs that require massive computational power, or if you prefer a cloud-based solution.
Stars: 31
Forks: 2
Language: Objective-C
License: MIT
Category: (none listed)
Last pushed: Mar 06, 2026
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/transformers/mechramc/Orion"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
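The curl call above can be wrapped in a small Python helper. Only the URL layout and rate limits are documented here, so the authentication header scheme and the JSON response shape below are assumptions, not confirmed API behavior.

```python
import json
import urllib.request

API_BASE = "https://pt-edge.onrender.com/api/v1/quality"

def quality_url(category: str, owner: str, repo: str) -> str:
    """Build the per-repo quality endpoint URL (path layout taken from the curl example above)."""
    return f"{API_BASE}/{category}/{owner}/{repo}"

def fetch_quality(category: str, owner: str, repo: str, api_key: str = "") -> dict:
    """Fetch the quality record for one repo and parse it as JSON.

    The 'Authorization: Bearer' header is an assumed key scheme; check the
    service's docs before relying on it. Without a key you get 100 requests/day.
    """
    req = urllib.request.Request(quality_url(category, owner, repo))
    if api_key:
        req.add_header("Authorization", f"Bearer {api_key}")  # assumed header name
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

if __name__ == "__main__":
    # Reproduces the documented curl target for this repo.
    print(quality_url("transformers", "mechramc", "Orion"))
```

Calling `quality_url("transformers", "mechramc", "Orion")` yields the exact URL shown in the curl example, so the helper can be sanity-checked without hitting the network.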
Higher-rated alternatives
- OpenNMT/CTranslate2: Fast inference engine for Transformer models
- Pomilon/LEMA: LEMA (Layer-wise Efficient Memory Abstraction), a hardware-aware framework for fine-tuning LLMs...
- dilbersha/llm-inference-benchmarking-3080: A production-grade, telemetry-aware suite for benchmarking LLM inference performance on NVIDIA RTX 3080
- Yuan-ManX/infera: Infera, a high-performance inference engine for large language models
- gxcsoccer/alloy: Hybrid SSM-Attention language model on Apple Silicon with MLX, interleaving Mamba-2 and...