ybubnov/metalchat
Pure C++23 Llama inference for Apple Silicon chips
This is a C++ library and command-line tool that lets developers integrate Meta's Llama models directly into their applications or run them from the terminal. It is built specifically for Apple Silicon chips, enabling local inference of large language models. It targets software developers building applications for macOS or iOS, and power users who want to interact with Llama models on their Apple machines without cloud services.
Use this if you are a developer building an application for Apple Silicon and need to embed Llama language models directly on the user's device for local, private, or offline AI capabilities.
Not ideal if you are a non-programmer looking for a ready-to-use desktop application for Llama models, or if you need to deploy models on non-Apple hardware.
Stars
19
Forks
—
Language
C++
License
GPL-3.0
Category
—
Last pushed
Mar 04, 2026
Commits (30d)
0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/transformers/ybubnov/metalchat"
Open to everyone: 100 requests/day with no key required. A free key raises the limit to 1,000/day.
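The endpoint above follows a predictable path pattern, so it can be scripted for any repository. A minimal sketch, assuming the path layout shown in the example (the shape of the response body is not documented here):

```shell
# Build the quality-API URL for an arbitrary repo, then fetch it with curl
# as in the example above. Base path is taken from this listing; only the
# owner/repo segments vary.
base="https://pt-edge.onrender.com/api/v1/quality/transformers"
owner="ybubnov"
repo="metalchat"
echo "${base}/${owner}/${repo}"
# -> https://pt-edge.onrender.com/api/v1/quality/transformers/ybubnov/metalchat
```

Pass the resulting URL to `curl` exactly as in the example above; within the free tier no authentication header is needed.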
Higher-rated alternatives
vllm-project/vllm
A high-throughput and memory-efficient inference and serving engine for LLMs
sgl-project/sglang
SGLang is a high-performance serving framework for large language models and multimodal models.
alibaba/MNN
MNN: A blazing-fast, lightweight inference engine battle-tested by Alibaba, powering...
xorbitsai/inference
Swap GPT for any LLM by changing a single line of code. Xinference lets you run open-source,...
tensorzero/tensorzero
TensorZero is an open-source stack for industrial-grade LLM applications. It unifies an LLM...