daskol/llama.py

Python bindings to llama.cpp

Quality score: 33 / 100 (Emerging)

This project helps developers integrate large language models (LLMs) like LLaMA directly into Python applications. It takes pre-trained LLaMA model weights and outputs a highly optimized, quantized version that can run on standard CPUs, including Apple Silicon. This is intended for Python developers who want to run powerful LLMs locally without relying on cloud services or high-end GPUs.

No commits in the last 6 months.

Use this if you are a Python developer looking to run LLaMA models efficiently on CPU hardware, even on laptops, with optimized performance and reduced memory footprint.

Not ideal if you need a plug-and-play LLM solution without any coding, or if your primary goal is to train or fine-tune LLMs, as this project focuses on inference.
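The quantization step the description refers to can be sketched in plain Python. This is a generic symmetric 8-bit scheme for illustration only, not the project's actual implementation (llama.cpp uses its own block-wise quantization formats):

```python
def quantize_q8(weights: list[float]) -> tuple[list[int], float]:
    # Symmetric 8-bit quantization: map each float to an int in [-127, 127],
    # storing a single scale factor to recover approximate values later.
    scale = max(abs(w) for w in weights) / 127 or 1.0
    return [round(w / scale) for w in weights], scale

def dequantize(q: list[int], scale: float) -> list[float]:
    # Recover approximate float weights from the quantized ints.
    return [v * scale for v in q]

w = [0.5, -1.27, 0.0, 0.9]
q, s = quantize_q8(w)
restored = dequantize(q, s)
```

Storing one byte per weight instead of four (float32) is where the reduced memory footprint on CPU comes from; the trade-off is a small, bounded rounding error per weight.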

Tags: Python development, local AI inference, large language models, CPU optimization, edge AI
Badges: Stale (6 months), No Package, No Dependents
Maintenance 0 / 25
Adoption 7 / 25
Maturity 16 / 25
Community 10 / 25


Stars: 27
Forks: 3
Language: C
License: MIT
Last pushed: Mar 22, 2023
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/transformers/daskol/llama.py"

Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
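The same endpoint can be queried from Python using only the standard library. A minimal sketch, assuming the endpoint returns JSON (the response schema is not documented here, so it is printed as-is):

```python
import json
from urllib.request import urlopen

API_BASE = "https://pt-edge.onrender.com/api/v1/quality/transformers"

def quality_url(owner: str, repo: str) -> str:
    # Build the per-repository quality endpoint URL.
    return f"{API_BASE}/{owner}/{repo}"

def fetch_quality(owner: str, repo: str) -> dict:
    # Fetch and decode the quality report; requires network access.
    with urlopen(quality_url(owner, repo)) as resp:
        return json.load(resp)

if __name__ == "__main__":
    # Equivalent to the curl example above.
    print(json.dumps(fetch_quality("daskol", "llama.py"), indent=2))
```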