AXERA-TECH/ax-llm
Explore LLM model deployment based on AXera's AI chips
This project helps AI developers and engineers deploy large language models (LLMs) and vision-language models (VLMs) efficiently on AXera's AI chips. It takes pre-trained LLM/VLM models and optimizes them to run directly on the AX650A/N and AX630C, offering a fast way to evaluate model performance on the hardware and to build custom edge AI applications such as specialized on-device assistants.
Use this if you are an AI developer or embedded systems engineer working with AXera AI chips and need to deploy large language models or multimodal models for high-performance edge computing.
Not ideal if you are not working with AXera AI chips or if you are looking for a general-purpose LLM inference solution for standard CPU/GPU platforms.
Stars: 142
Forks: 22
Language: C++
License: BSD-3-Clause
Category: transformers
Last pushed: Mar 10, 2026
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/transformers/AXERA-TECH/ax-llm"
Open to everyone: 100 requests/day with no key needed. Get a free key for 1,000 requests/day.
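As an illustration, the record can be fetched and inspected from the command line. A minimal sketch, assuming the endpoint returns a JSON body (the response schema is not documented here) and using the keyless free tier:

# Fetch this repo's quality record and pretty-print it.
# Assumes a JSON response; -f makes curl fail on HTTP errors instead of piping an error page.
curl -sf "https://pt-edge.onrender.com/api/v1/quality/transformers/AXERA-TECH/ax-llm" \
  | python3 -m json.tool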
Related models
PaddlePaddle/FastDeploy
High-performance Inference and Deployment Toolkit for LLMs and VLMs based on PaddlePaddle
mlc-ai/mlc-llm
Universal LLM Deployment Engine with ML Compilation
skyzh/tiny-llm
A course of learning LLM inference serving on Apple Silicon for systems engineers: build a tiny...
ServerlessLLM/ServerlessLLM
Serverless LLM Serving for Everyone.
AmpereComputingAI/ampere_model_library
AML's goal is to make benchmarking of various AI architectures on Ampere CPUs a pleasurable experience :)