justADeni/intel-npu-llm
A simple Python script for running LLMs on Intel's Neural Processing Units (NPUs)
This project helps developers run large language models (LLMs) locally on NPU-equipped Intel processors. It takes a pre-trained LLM and prepares it to run efficiently on the NPU, providing a ready-to-use local model for AI-powered applications. It's intended for developers building applications that need local LLM inference, especially on devices with Intel Core Ultra processors.
Use this if you are a developer looking to deploy large language models on Intel NPU-equipped devices for faster and more power-efficient local inference in your applications.
Not ideal if you don't have an Intel processor with an NPU or if you are not a developer and simply want to use an off-the-shelf AI chat application.
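The repository's own entry point is not shown on this page, so the following is only a minimal sketch of the general approach such a script takes, assuming the `openvino-genai` package (Intel's OpenVINO GenAI API) and a model directory already exported to OpenVINO IR format; the model path and function name are illustrative, not taken from the repo.

```python
# Hypothetical sketch: run a pre-converted LLM on an Intel NPU via OpenVINO GenAI.
# Requires `pip install openvino-genai` and an OpenVINO-format model directory
# (e.g. produced with `optimum-cli export openvino ...`). The import is guarded
# so the sketch loads even where the package or NPU hardware is absent.
try:
    import openvino_genai as ov_genai
except ImportError:
    ov_genai = None


def generate_on_npu(model_dir: str, prompt: str, max_new_tokens: int = 100) -> str:
    """Load an OpenVINO-format LLM on the NPU device and generate a completion."""
    if ov_genai is None:
        raise RuntimeError("openvino-genai is not installed")
    # "NPU" selects the Neural Processing Unit; "CPU" or "GPU" also work
    # as fallback devices on machines without an NPU.
    pipe = ov_genai.LLMPipeline(model_dir, "NPU")
    return pipe.generate(prompt, max_new_tokens=max_new_tokens)
```

A caller would then invoke something like `generate_on_npu("./TinyLlama-ov", "Hello!")`; on machines without an NPU, swapping the device string for `"CPU"` is a common fallback.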
Stars
35
Forks
3
Language
Python
License
MIT
Category
Last pushed
Oct 17, 2025
Commits (30d)
0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/transformers/justADeni/intel-npu-llm"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
Higher-rated alternatives
PaddlePaddle/FastDeploy
High-performance Inference and Deployment Toolkit for LLMs and VLMs based on PaddlePaddle
mlc-ai/mlc-llm
Universal LLM Deployment Engine with ML Compilation
skyzh/tiny-llm
A course of learning LLM inference serving on Apple Silicon for systems engineers: build a tiny...
ServerlessLLM/ServerlessLLM
Serverless LLM Serving for Everyone.
AXERA-TECH/ax-llm
Explore LLM model deployment based on AXera's AI chips