dusty-nv/NanoLLM
Optimized local inference for LLMs with HuggingFace-like APIs for quantization, vision/language models, multimodal agents, speech, vector DB, and RAG.
NanoLLM helps developers integrate advanced AI capabilities directly into applications running on NVIDIA Jetson devices, without relying on cloud services. It accepts text, image, or speech input and runs it through optimized AI models to generate text, analyze content, or control agents. It is aimed at engineers building edge AI applications such as robotics, smart cameras, and other embedded systems.
359 stars. No commits in the last 6 months.
Use this if you are a developer building AI applications for NVIDIA Jetson devices and need to run large language models, vision models, or multimodal agents efficiently on the device itself.
Not ideal if you are looking for a pre-built end-user application or if you are not developing for NVIDIA Jetson edge devices.
Stars: 359
Forks: 63
Language: Python
License: MIT
Last pushed: Oct 18, 2024
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/transformers/dusty-nv/NanoLLM"
Open to everyone: 100 requests/day with no key needed; a free key raises the limit to 1,000/day.
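As a sketch of how the API response might be consumed, here is a minimal stdlib-only Python example. Note the JSON field names (`repo`, `stars`, `commits_30d`, etc.) are assumptions mirroring the card on this page, not a documented schema; check the actual response before relying on them.

```python
import json

# Hypothetical response body from the quality API above; the field names
# are assumptions based on the stats shown on this page, not a documented schema.
sample = """
{
  "repo": "dusty-nv/NanoLLM",
  "stars": 359,
  "forks": 63,
  "language": "Python",
  "license": "MIT",
  "last_pushed": "2024-10-18",
  "commits_30d": 0
}
"""

data = json.loads(sample)

# Flag repos that look dormant: no commits in the last 30 days.
dormant = data["commits_30d"] == 0
print(f"{data['repo']}: {data['stars']} stars, dormant={dormant}")
```

In a real client you would fetch the URL shown above (e.g. with `urllib.request` or `requests`) instead of using an inline sample.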
Higher-rated alternatives
peremartra/Large-Language-Model-Notebooks-Course
Practical course about Large Language Models.
ISNE11/CheatSheet-LLM
Run local Large Language Models (LLMs) offline using Ollama – interact with textbooks and custom...
SuperDev699/CheatSheet-LLM
🛠️ Run local Large Language Models offline with ease using Ollama for streamlined access and interaction.
Bladerex24/simple-llm
🚀 Explore a minimal, extensible LLM inference engine for efficient AI model execution, designed...