TsingmaoAI/MI-optimize
mi-optimize is a tool for quantizing and evaluating large language models (LLMs). It integrates multiple quantization methods and evaluation techniques in one library, so users can tailor their approach to specific requirements and constraints.
It helps machine learning engineers and researchers prepare LLMs for real-time applications and resource-constrained devices: you compress a model with the quantization technique of your choice and get a smaller, more efficient model that retains most of its performance and can be deployed in a wider range of scenarios.
No commits in the last 6 months.
Use this if you need to reduce the computational and memory demands of large language models while preserving their performance for deployment.
Not ideal if you are working with small models or do not require specialized compression techniques for deployment.
Stars
25
Forks
5
Language
Python
License
—
Category
—
Last pushed
Nov 28, 2024
Commits (30d)
0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/llm-tools/TsingmaoAI/MI-optimize"
Open to everyone: 100 requests/day with no key; a free key raises the limit to 1,000/day.
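The same endpoint can be called from Python with only the standard library. This is a minimal sketch: the URL pattern comes from the curl example above, but the response schema is not documented on this page, so the helper simply returns the parsed JSON as-is.

```python
import json
import urllib.request

# Base endpoint taken from the curl example above.
API_BASE = "https://pt-edge.onrender.com/api/v1/quality/llm-tools"


def build_url(owner: str, repo: str) -> str:
    """Build the quality-data URL for a GitHub owner/repo pair."""
    return f"{API_BASE}/{owner}/{repo}"


def fetch_quality(owner: str, repo: str) -> dict:
    """GET the repo's quality data. The response fields are not
    documented here, so the raw parsed JSON is returned."""
    with urllib.request.urlopen(build_url(owner, repo)) as resp:
        return json.load(resp)


# Usage (makes a live request, subject to the 100 requests/day limit):
# data = fetch_quality("TsingmaoAI", "MI-optimize")
```

Keyless calls count against the shared 100/day quota, so cache responses rather than re-fetching on every run.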
Higher-rated alternatives
Tencent/AngelSlim
Model compression toolkit engineered for enhanced usability, comprehensiveness, and efficiency.
nebuly-ai/optimate
A collection of libraries to optimise AI model performances
antgroup/glake
GLake: optimizing GPU memory management and IO transmission.
kyo-takano/chinchilla
A toolkit for scaling law research ⚖
liyucheng09/Selective_Context
Compress your input to ChatGPT or other LLMs, to let them process 2x more content and save 40%...