Tencent/AngelSlim
Model compression toolkit engineered for enhanced usability, comprehensiveness, and efficiency.
When deploying large language or vision models, this tool helps you shrink their size and speed up inference with minimal accuracy loss. It takes a pre-trained model and applies compression techniques such as quantization to produce a smaller, faster version suited to real-world serving. AI engineers and MLOps specialists who need to run large models on resource-constrained hardware or in production environments will find it particularly useful.
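AngelSlim's own API is not documented on this page, so the sketch below illustrates the general idea of post-training quantization using PyTorch's built-in dynamic quantization instead; the toy model and size-comparison helper are illustrative, not AngelSlim code.

# Generic post-training quantization sketch -- NOT AngelSlim's API.
import io
import torch
import torch.nn as nn

# Stand-in model; in practice this would be a pre-trained LLM/VLM.
model = nn.Sequential(
    nn.Linear(4096, 4096),
    nn.ReLU(),
    nn.Linear(4096, 4096),
)

# Convert nn.Linear weights to int8; activations are quantized
# dynamically at runtime, so no calibration dataset is required.
quantized = torch.ao.quantization.quantize_dynamic(
    model, {nn.Linear}, dtype=torch.qint8
)

def size_mb(m: nn.Module) -> float:
    # Serialize the state dict to measure on-disk model size.
    buf = io.BytesIO()
    torch.save(m.state_dict(), buf)
    return buf.getbuffer().nbytes / 1e6

print(f"fp32: {size_mb(model):.1f} MB, int8: {size_mb(quantized):.1f} MB")

Dedicated toolkits like AngelSlim go further than this (e.g., calibration-based and hardware-aware schemes), but the memory and latency payoff comes from the same principle: storing and computing with lower-precision weights.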
536 stars. Actively maintained with 18 commits in the last 30 days. Available on PyPI.
Use this if you need to optimize large AI models (LLMs, VLMs) for faster inference, reduced memory footprint, or deployment on edge devices.
Not ideal if you are still in the early stages of model development and not yet focused on deployment efficiency.
Stars: 536
Forks: 68
Language: Python
License: —
Category: —
Last pushed: Mar 12, 2026
Commits (30d): 18
Dependencies: 13
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/llm-tools/Tencent/AngelSlim"
Open to everyone: 100 requests/day with no key needed, or 1,000/day with a free key.
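For programmatic access, a minimal Python sketch using requests is shown below. Only the URL comes from this page; the "X-API-Key" header name is an assumption, so check the API docs for how the key is actually passed.

# Fetch quality data for Tencent/AngelSlim from the API above.
import requests

URL = "https://pt-edge.onrender.com/api/v1/quality/llm-tools/Tencent/AngelSlim"

# Anonymous access (100 requests/day).
resp = requests.get(URL, timeout=10)
resp.raise_for_status()
print(resp.json())

# With a free key (1,000/day). Header name is hypothetical:
# resp = requests.get(URL, headers={"X-API-Key": "YOUR_KEY"}, timeout=10)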
Related tools
nebuly-ai/optimate
A collection of libraries to optimise AI model performances
antgroup/glake
GLake: optimizing GPU memory management and IO transmission.
kyo-takano/chinchilla
A toolkit for scaling law research ⚖
liyucheng09/Selective_Context
Compress your input to ChatGPT or other LLMs, to let them process 2x more content and save 40%...
TsingmaoAI/MI-optimize
mi-optimize is a versatile tool designed for the quantization and evaluation of large language...