ngxson/wllama

WebAssembly binding for llama.cpp - Enabling on-browser LLM inference

Quality score: 49/100 (Emerging)

This project lets web developers integrate large language models (LLMs) directly into the browser. It loads GGUF model files and produces text completions or embeddings entirely client-side, so web applications can offer AI features without a backend server or a specialized GPU on the user's machine.


Use this if you are a web developer building interactive browser-based applications that need to perform natural language processing tasks using LLMs without relying on a server.

Not ideal if you need to load a single model file larger than 2 GB, require WebGPU acceleration, or prefer server-side inference for large-scale, high-performance workloads.

web-development on-device-ai natural-language-processing browser-applications client-side-inference
No Package · No Dependents
Maintenance 6 / 25
Adoption 10 / 25
Maturity 16 / 25
Community 17 / 25


Stars: 1,013
Forks: 73
Language: TypeScript
License: MIT
Last pushed: Dec 17, 2025
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/transformers/ngxson/wllama"

Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
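For clients that prefer a typed wrapper over raw curl, the endpoint above can be wrapped in a small TypeScript helper. This is a sketch, not an official client: the `qualityUrl` and `fetchQuality` names are ours, and the response is assumed to be JSON whose schema is not documented here.

```typescript
// Endpoint taken from the curl example above.
const BASE_URL = "https://pt-edge.onrender.com/api/v1/quality/transformers";

// Build the per-repository endpoint URL (helper name is ours, not the API's).
function qualityUrl(owner: string, repo: string): string {
  return `${BASE_URL}/${owner}/${repo}`;
}

// Fetch the quality record; throws on HTTP errors (e.g. after exceeding the
// 100 requests/day keyless limit). Response shape is undocumented here, so
// the result is typed as `unknown` and should be inspected before use.
async function fetchQuality(owner: string, repo: string): Promise<unknown> {
  const res = await fetch(qualityUrl(owner, repo));
  if (!res.ok) throw new Error(`quality API returned HTTP ${res.status}`);
  return res.json();
}

// Usage (browser or Node 18+, both of which provide a global fetch):
//   const data = await fetchQuality("ngxson", "wllama");
```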