daskol/llama.py
Python bindings to llama.cpp
This project helps developers integrate large language models (LLMs) like LLaMA directly into Python applications. It takes pre-trained LLaMA model weights and outputs a highly optimized, quantized version that can run on standard CPUs, including Apple Silicon. This is intended for Python developers who want to run powerful LLMs locally without relying on cloud services or high-end GPUs.
No commits in the last 6 months.
Use this if you are a Python developer looking to run LLaMA models efficiently on CPU hardware, even on laptops, with optimized performance and reduced memory footprint.
Not ideal if you need a plug-and-play LLM solution without any coding or if your primary goal is to train or fine-tune LLMs, as this focuses on inference.
Stars: 27
Forks: 3
Language: C
License: MIT
Category:
Last pushed: Mar 22, 2023
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/transformers/daskol/llama.py"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
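The curl command above can also be issued from Python. A minimal sketch using only the standard library, assuming the endpoint path shown on this page; the shape of the JSON response is not documented here, so the fetch simply returns the parsed payload as-is:

```python
import json
import urllib.request

API_BASE = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(index: str, repo: str) -> str:
    """Build the API URL for a repository, e.g. quality_url('transformers', 'daskol/llama.py')."""
    return f"{API_BASE}/{index}/{repo}"


def fetch_quality(index: str, repo: str) -> dict:
    # No API key needed for up to 100 requests/day; a free key raises the limit to 1,000/day.
    with urllib.request.urlopen(quality_url(index, repo)) as resp:
        return json.load(resp)


if __name__ == "__main__":
    data = fetch_quality("transformers", "daskol/llama.py")
    print(json.dumps(data, indent=2))
```

The `quality_url` helper is illustrative; only the full URL string matches the documented endpoint.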
Higher-rated alternatives
ludwig-ai/ludwig
Low-code framework for building custom LLMs, neural networks, and other AI models
withcatai/node-llama-cpp
Run AI models locally on your machine with node.js bindings for llama.cpp. Enforce a JSON schema...
mudler/LocalAI
:robot: The free, Open Source alternative to OpenAI, Claude and others. Self-hosted and...
zhudotexe/kani
kani (カニ) is a highly hackable microframework for tool-calling language models. (NLP-OSS @ EMNLP 2023)
SciSharp/LLamaSharp
A C#/.NET library to run LLM (🦙LLaMA/LLaVA) on your local device efficiently.