ddh0/easy-llama

Python package wrapping llama.cpp for on-device LLM inference

45 / 100 (Emerging)

This Python toolkit is designed for developers who want to integrate large language model (LLM) inference directly into their applications or services. It loads quantized model files (such as GGUF) and runs them locally on your own hardware, turning text input into text output, so your software gains on-device AI capabilities without a cloud dependency.

101 stars. No commits in the last 6 months. Available on PyPI.

Use this if you are a developer looking to embed local LLM inference capabilities directly into your Python-based software, without relying on external cloud services.

Not ideal if you are an end-user without programming experience, or if you need a high-level API for model management and deployment rather than direct library integration.

on-device-AI LLM-integration local-inference python-development application-building
Stale: 6 months without commits
Maintenance: 2 / 25
Adoption: 9 / 25
Maturity: 25 / 25
Community: 9 / 25

How are scores calculated?

Stars: 101
Forks: 6
Language: Python
License: MIT
Last pushed: Oct 12, 2025
Commits (30d): 0
Dependencies: 5

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/transformers/ddh0/easy-llama"

Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
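The curl command above can also be issued from Python with nothing but the standard library. A minimal sketch, assuming the endpoint returns JSON (the exact response schema is not documented here, so treat the parsed result as an opaque dict):

```python
import json
import urllib.request

# Base endpoint taken from the curl example above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality/transformers"

def quality_url(owner: str, repo: str) -> str:
    """Build the per-repository quality endpoint URL."""
    return f"{BASE_URL}/{owner}/{repo}"

def fetch_quality(owner: str, repo: str) -> dict:
    """Fetch the quality report for a repository and parse it as JSON."""
    with urllib.request.urlopen(quality_url(owner, repo)) as resp:
        return json.load(resp)

# Usage (network call; subject to the 100 requests/day anonymous limit):
# report = fetch_quality("ddh0", "easy-llama")
```

Anonymous requests count against the shared daily limit, so cache the response rather than re-fetching it on every run.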