Tencent/AngelSlim
Model compression toolkit engineered for enhanced usability, comprehensiveness, and efficiency.
When deploying large language or vision models, this tool helps you shrink their size and speed up inference with minimal accuracy loss. It takes a pre-trained model and applies compression techniques such as quantization to produce a smaller, faster version suited to real-world serving. AI engineers and MLOps specialists who need to run large models on resource-constrained hardware or in production environments will find it particularly useful.
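AngelSlim's own API is not documented on this page, so the sketch below illustrates the general idea of post-training quantization using PyTorch's built-in dynamic quantization instead; the toy model and size-comparison helper are illustrative, not AngelSlim code.

# Generic post-training quantization sketch -- NOT AngelSlim's API.
import io
import torch
import torch.nn as nn

# Stand-in model; in practice this would be a pre-trained LLM/VLM.
model = nn.Sequential(
    nn.Linear(4096, 4096),
    nn.ReLU(),
    nn.Linear(4096, 4096),
)

# Convert nn.Linear weights to int8; activations are quantized
# dynamically at runtime, so no calibration dataset is required.
quantized = torch.ao.quantization.quantize_dynamic(
    model, {nn.Linear}, dtype=torch.qint8
)

def size_mb(m: nn.Module) -> float:
    # Serialize the state dict to measure on-disk model size.
    buf = io.BytesIO()
    torch.save(m.state_dict(), buf)
    return buf.getbuffer().nbytes / 1e6

print(f"fp32: {size_mb(model):.1f} MB, int8: {size_mb(quantized):.1f} MB")

Dedicated toolkits like AngelSlim go further than this (e.g., calibration-based and hardware-aware schemes), but the memory and latency payoff comes from the same principle: storing and computing with lower-precision weights.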
536 stars. Actively maintained with 18 commits in the last 30 days. Available on PyPI.
Use this if you need to optimize large AI models (LLMs, VLMs) for faster inference, reduced memory footprint, or deployment on edge devices.
Not ideal if you are still in the early stages of model development and not yet focused on deployment efficiency.
Stars: 536
Forks: 68
Language: Python
License: —
Category: —
Last pushed: Mar 12, 2026
Commits (30d): 18
Dependencies: 13
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/llm-tools/Tencent/AngelSlim"
Open to everyone: 100 requests/day with no key needed, or 1,000/day with a free key.
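For programmatic access, a minimal Python sketch using requests is shown below. Only the URL comes from this page; the "X-API-Key" header name is an assumption, so check the API docs for how the key is actually passed.

# Fetch quality data for Tencent/AngelSlim from the API above.
import requests

URL = "https://pt-edge.onrender.com/api/v1/quality/llm-tools/Tencent/AngelSlim"

# Anonymous access (100 requests/day).
resp = requests.get(URL, timeout=10)
resp.raise_for_status()
print(resp.json())

# With a free key (1,000/day). Header name is hypothetical:
# resp = requests.get(URL, headers={"X-API-Key": "YOUR_KEY"}, timeout=10)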
Related tools
nebuly-ai/optimate
A collection of libraries to optimise AI model performances
antgroup/glake
GLake: optimizing GPU memory management and IO transmission.
kyo-takano/chinchilla
A toolkit for scaling law research ⚖
liyucheng09/Selective_Context
Compress your input to ChatGPT or other LLMs, to let them process 2x more content and save 40%...
TsingmaoAI/MI-optimize
mi-optimize is a versatile tool designed for the quantization and evaluation of large language...