foldl/chatllm.cpp
Pure C++ implementation of several models for real-time chatting on your computer (CPU & GPU)
This project helps you run powerful conversational AI models directly on your own computer, even without a high-end setup. You can feed in text, images, or audio and get back real-time chat responses, summaries, or even generate new content. It's designed for individuals who want to use large language models for personal tasks, research, or creative writing with full control over their data.
831 stars. Actively maintained with 11 commits in the last 30 days.
Use this if you want to run various AI chat models privately on your desktop or laptop, customize their behavior, and experiment with different models for tasks like writing assistance, coding help, or general knowledge retrieval.
Not ideal if you're looking for a simple, cloud-based AI chat service that doesn't require any local setup or technical configuration.
Stars: 831
Forks: 62
Language: C++
License: MIT
Category:
Last pushed: Mar 11, 2026
Commits (30d): 11
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/llm-tools/foldl/chatllm.cpp"
Open to everyone: 100 requests/day with no key required. Get a free key for 1,000 requests/day.
Related tools
containers/ramalama
RamaLama is an open-source developer tool that simplifies the local serving of AI models from...
av/harbor
One command brings a complete pre-wired LLM stack with hundreds of services to explore.
RunanywhereAI/runanywhere-sdks
Production-ready toolkit for running AI locally.
runpod-workers/worker-vllm
The RunPod worker template for serving our large language model endpoints. Powered by vLLM.
FarisZahrani/llama-cpp-py-sync
Auto-synced CFFI ABI python bindings for llama.cpp with prebuilt wheels (CPU/CUDA/Vulkan/Metal).