dusty-nv/NanoLLM
Optimized local inference for LLMs with HuggingFace-like APIs for quantization, vision/language models, multimodal agents, speech, vector DB, and RAG.
NanoLLM helps developers integrate advanced AI capabilities directly into applications running on NVIDIA Jetson devices, without relying on cloud services. It accepts text, image, or speech input and runs it through optimized AI models to generate text, analyze content, or control agents. It is aimed at engineers building edge AI applications such as robotics, smart cameras, and other embedded systems.
359 stars. No commits in the last 6 months.
Use this if you are a developer building AI applications for NVIDIA Jetson devices and need to run large language models, vision models, or multimodal agents efficiently on the device itself.
Not ideal if you are looking for a pre-built end-user application or if you are not developing for NVIDIA Jetson edge devices.
Stars: 359
Forks: 63
Language: Python
License: MIT
Last pushed: Oct 18, 2024
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/transformers/dusty-nv/NanoLLM"
Open to everyone: 100 requests/day with no key needed; a free key raises the limit to 1,000/day.
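As a sketch of how the API response might be consumed, here is a minimal stdlib-only Python example. Note the JSON field names (`repo`, `stars`, `commits_30d`, etc.) are assumptions mirroring the card on this page, not a documented schema; check the actual response before relying on them.

```python
import json

# Hypothetical response body from the quality API above; the field names
# are assumptions based on the stats shown on this page, not a documented schema.
sample = """
{
  "repo": "dusty-nv/NanoLLM",
  "stars": 359,
  "forks": 63,
  "language": "Python",
  "license": "MIT",
  "last_pushed": "2024-10-18",
  "commits_30d": 0
}
"""

data = json.loads(sample)

# Flag repos that look dormant: no commits in the last 30 days.
dormant = data["commits_30d"] == 0
print(f"{data['repo']}: {data['stars']} stars, dormant={dormant}")
```

In a real client you would fetch the URL shown above (e.g. with `urllib.request` or `requests`) instead of using an inline sample.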
Higher-rated alternatives
peremartra/Large-Language-Model-Notebooks-Course
Practical course about Large Language Models.
ISNE11/CheatSheet-LLM
Run local Large Language Models (LLMs) offline using Ollama – interact with textbooks and custom...
SuperDev699/CheatSheet-LLM
🛠️ Run local Large Language Models offline with ease using Ollama for streamlined access and interaction.
Bladerex24/simple-llm
🚀 Explore a minimal, extensible LLM inference engine for efficient AI model execution, designed...