nareshis21/Truelarge-RT

Android inference engine running 20B+ parameter LLMs on 4GB-8GB RAM devices. Features proprietary Layer-by-Layer (LBL) streaming, zero-copy mmap loading, and native C++/Kotlin architecture.
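The zero-copy mmap loading mentioned above can be sketched with POSIX `mmap`. The file layout, offsets, and function name below are illustrative assumptions for this sketch, not Truelarge-RT's documented format: the idea is that each layer's weights are mapped read-only on demand, so the kernel pages them in lazily and can evict them, and only the layer currently being computed needs physical RAM.

```cpp
// Illustrative sketch of zero-copy, layer-by-layer (LBL) weight loading via mmap.
// Layer offsets and the on-disk layout here are assumptions; Truelarge-RT's
// actual file format is not documented in this listing.
#include <fcntl.h>
#include <sys/mman.h>
#include <unistd.h>
#include <cstddef>
#include <cstdint>

// Map one layer's weight region read-only. Pages are faulted in on first
// access and may be evicted by the kernel under memory pressure, so only
// the layer currently being computed occupies physical RAM.
const std::uint8_t* map_layer(const char* path, off_t layer_offset, size_t layer_bytes) {
    int fd = open(path, O_RDONLY);
    if (fd < 0) return nullptr;
    // mmap requires a page-aligned offset; round down and adjust the pointer.
    long page = sysconf(_SC_PAGESIZE);
    off_t aligned = layer_offset - (layer_offset % page);
    size_t adjust = static_cast<size_t>(layer_offset - aligned);
    void* base = mmap(nullptr, layer_bytes + adjust, PROT_READ, MAP_PRIVATE, fd, aligned);
    close(fd);  // the mapping stays valid after the descriptor is closed
    if (base == MAP_FAILED) return nullptr;
    return static_cast<const std::uint8_t*>(base) + adjust;
}
```

A real engine would also `munmap` the previous layer's region before mapping the next one, keeping resident memory bounded to roughly one layer at a time.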

34 / 100 (Emerging)

This project lets Android developers run very large language models (LLMs) directly on consumer smartphones and tablets, including older devices with limited RAM. It takes a pre-trained LLM (such as Llama 3) as input and performs real-time text generation on the device, providing interactive AI features without requiring a constant internet connection. It is aimed at mobile developers who want to integrate powerful AI capabilities into their Android applications.

Use this if you are developing an Android application and need to run large language models locally on user devices, especially those with 4GB-8GB of RAM, without requiring the entire model to fit into memory.

Not ideal if your application runs on server-side infrastructure or if your target devices consistently have 12GB+ RAM where smaller models can run fully in memory for maximum speed.

mobile-app-development on-device-AI large-language-models Android-development edge-computing
No Package · No Dependents
Maintenance 10 / 25
Adoption 5 / 25
Maturity 11 / 25
Community 8 / 25


Stars: 9
Forks: 1
Language: Kotlin
License: MIT
Last pushed: Feb 21, 2026
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/transformers/nareshis21/Truelarge-RT"

Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.