amazon-science/llm-rank-pruning

LLM-Rank: a graph-theoretical approach to structured pruning of large language models, based on weighted PageRank centrality, as introduced in the related paper.

Emerging

This tool helps AI researchers and machine learning engineers reduce the size and computational cost of large language models (LLMs) without significantly losing performance. It takes a pre-trained LLM and calibration data, then identifies and removes less important parts of the model. The output is a smaller, more efficient pruned LLM.
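The idea of ranking parts of a network by weighted PageRank and removing the lowest-ranked ones can be illustrated on a toy two-layer MLP. The following is a minimal sketch under stated assumptions, not the repository's implementation: the function names `weighted_pagerank` and `prune_hidden_neurons`, the use of absolute weight values as edge weights, and the `keep_ratio` parameter are all hypothetical, and the calibration data that the tool takes as input is left out entirely.

```python
import numpy as np


def weighted_pagerank(adj, damping=0.85, tol=1e-10, max_iter=1000):
    """Power-iteration PageRank; adj[i, j] is the weight of edge i -> j."""
    n = adj.shape[0]
    out_sum = adj.sum(axis=1, keepdims=True)
    # Row-normalise edge weights; nodes with no outgoing weight
    # (dangling nodes) spread their rank uniformly over all nodes.
    safe_sum = np.where(out_sum > 0, out_sum, 1.0)
    trans = np.where(out_sum > 0, adj / safe_sum, 1.0 / n)
    rank = np.full(n, 1.0 / n)
    for _ in range(max_iter):
        new_rank = (1.0 - damping) / n + damping * (rank @ trans)
        if np.abs(new_rank - rank).sum() < tol:
            return new_rank
        rank = new_rank
    return rank


def prune_hidden_neurons(w_in, w_out, keep_ratio=0.5):
    """Drop the lowest-ranked hidden neurons of a toy two-layer MLP.

    w_in has shape (hidden, inputs), w_out has shape (outputs, hidden).
    Assumption for illustration: edge weight = absolute weight value.
    """
    n_hidden, n_inputs = w_in.shape
    n_outputs = w_out.shape[0]
    n = n_inputs + n_hidden + n_outputs
    h0, o0 = n_inputs, n_inputs + n_hidden

    # Build the directed graph: input -> hidden -> output.
    adj = np.zeros((n, n))
    adj[:h0, h0:o0] = np.abs(w_in).T
    adj[h0:o0, o0:] = np.abs(w_out).T

    scores = weighted_pagerank(adj)[h0:o0]
    n_keep = max(1, int(round(keep_ratio * n_hidden)))
    # Indices of the highest-scoring hidden neurons, in original order.
    keep = np.sort(np.argsort(scores)[-n_keep:])
    # Structured pruning: remove whole rows/columns, not single weights.
    return w_in[keep, :], w_out[:, keep], keep
```

Because whole neurons are removed, the pruned matrices are genuinely smaller (fewer rows in `w_in`, fewer columns in `w_out`), which is what distinguishes structured pruning from masking individual weights.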

No commits in the last 6 months.

Use this if you need to deploy large language models on resource-constrained devices or reduce their operational cost while maintaining strong performance.

Not ideal if you are looking for a tool to train LLMs from scratch or fine-tune them on new data, as this focuses solely on post-training model compression.

large-language-models model-optimization AI-research machine-learning-engineering model-compression

Stale 6m No Package No Dependents

Maintenance 0 / 25

Adoption 4 / 25

Maturity 16 / 25

Community 14 / 25

Language: Python

License: Apache-2.0

Higher-rated alternatives

Tencent/AngelSlim

Model compression toolkit engineered for enhanced usability, comprehensiveness, and efficiency.

nebuly-ai/optimate

A collection of libraries to optimise AI model performances

antgroup/glake

GLake: optimizing GPU memory management and IO transmission.

kyo-takano/chinchilla

A toolkit for scaling law research ⚖

liyucheng09/Selective_Context

Compress your input to ChatGPT or other LLMs, to let them process 2x more content and save 40%...
