lrusso/llama3pure
Three inference engines for Llama 3: pure C for desktop systems, pure JavaScript for Node.js, and pure JavaScript for web browsers.
This project offers self-contained inference engines for Llama 3 and Gemma models, letting you run language models directly on your desktop or in a web browser. It takes a GGUF model file and a text prompt or chat history as input and produces generated text as output. It's aimed at developers who want to integrate LLM capabilities into their applications without relying on cloud APIs.
Use this if you need to run Llama 3 or Gemma models locally on a desktop or directly within a web application, keeping the entire inference pipeline self-contained.
Not ideal if you prefer cloud-based LLM services or need to fine-tune models; this project focuses on inference only.
Stars
21
Forks
1
Language
HTML
License
MIT
Category
Last pushed
Feb 26, 2026
Commits (30d)
0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/transformers/lrusso/llama3pure"
Open to everyone: 100 requests/day, no key needed. A free key raises the limit to 1,000/day.
Higher-rated alternatives
ludwig-ai/ludwig
Low-code framework for building custom LLMs, neural networks, and other AI models
withcatai/node-llama-cpp
Run AI models locally on your machine with node.js bindings for llama.cpp. Enforce a JSON schema...
mudler/LocalAI
🤖 The free, Open Source alternative to OpenAI, Claude and others. Self-hosted and...
zhudotexe/kani
kani (カニ) is a highly hackable microframework for tool-calling language models. (NLP-OSS @ EMNLP 2023)
SciSharp/LLamaSharp
A C#/.NET library to run LLM (🦙LLaMA/LLaVA) on your local device efficiently.