JackZeng0208/llama.cpp-android-tutorial
llama.cpp tutorial on Android phone
This project guides developers through setting up and running large language models (LLMs) such as Llama directly on Android phones equipped with Qualcomm Snapdragon processors. It shows how to compile the `llama.cpp` library to leverage the phone's Adreno GPU for faster inference. The result is a working `llama.cpp` build, optionally driven from Python, capable of local LLM inference on the device. It is aimed at Android app developers and researchers interested in deploying and evaluating LLMs on mobile hardware.
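The compile step described above can be sketched with the Android NDK's CMake toolchain. This is a minimal sketch, not the tutorial's exact commands: the `$ANDROID_NDK` path, the ABI/platform values, and the `GGML_OPENCL` flag (the OpenCL backend that targets Adreno GPUs) are assumptions to check against the repo's README.

```shell
# Sketch: cross-compile llama.cpp for an arm64 Android device.
# Assumes $ANDROID_NDK points at an installed Android NDK; the exact
# GPU-backend flag used by this tutorial may differ (check its README).
cmake -B build-android \
  -DCMAKE_TOOLCHAIN_FILE="$ANDROID_NDK/build/cmake/android.toolchain.cmake" \
  -DANDROID_ABI=arm64-v8a \
  -DANDROID_PLATFORM=android-28 \
  -DGGML_OPENCL=ON    # assumed flag for the OpenCL (Adreno) backend
cmake --build build-android --config Release -j
```

The resulting binaries would then be pushed to the phone (e.g. via `adb push`) and run from a shell on the device.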
155 stars. No commits in the last 6 months.
Use this if you are a developer looking to deploy and run large language models directly on an Android device with a Qualcomm Snapdragon SoC, utilizing its Adreno GPU for accelerated performance.
Not ideal if you are a general user wanting a ready-to-use LLM app, or if your Android device does not have a Qualcomm Snapdragon processor with an Adreno GPU.
| Stat | Value |
|---|---|
| Stars | 155 |
| Forks | 12 |
| Language | — |
| License | MIT |
| Category | — |
| Last pushed | May 02, 2025 |
| Commits (30d) | 0 |
Get this data via API
```shell
curl "https://pt-edge.onrender.com/api/v1/quality/transformers/JackZeng0208/llama.cpp-android-tutorial"
```
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
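The same keyless endpoint can be called from Python with the standard library. This is a minimal sketch: the response schema is an assumption, so the code just pretty-prints whatever JSON the API returns.

```python
# Sketch: fetch this repo's quality data from the public API endpoint
# shown above (no key needed, 100 requests/day).
import json
import urllib.error
import urllib.request

url = ("https://pt-edge.onrender.com/api/v1/quality/"
       "transformers/JackZeng0208/llama.cpp-android-tutorial")

try:
    with urllib.request.urlopen(url, timeout=10) as resp:
        data = json.load(resp)
    # Schema is undocumented here; just pretty-print the payload.
    print(json.dumps(data, indent=2))
except (urllib.error.URLError, TimeoutError) as exc:
    # Network access may be unavailable; report instead of crashing.
    print(f"request failed: {exc}")
```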
Higher-rated alternatives
- beehive-lab/GPULlama3.java: GPU-accelerated Llama3.java inference in pure Java using TornadoVM.
- gitkaz/mlx_gguf_server: This is a FastAPI based LLM server. Load multiple LLM models (MLX or llama.cpp) simultaneously...
- srgtuszy/llama-cpp-swift: Swift bindings for llama-cpp library.
- awinml/llama-cpp-python-bindings: Run fast LLM Inference using Llama.cpp in Python.
- RhinoDevel/mt_llm: Pure C wrapper library to use llama.cpp with Linux and Windows as simply as possible.