LLM-Pruner and LLaMA-Pruning
LLM-Pruner is a generalized structural pruning framework that evolved from and supersedes the earlier LLaMA-Pruning project, extending the same pruning methodology to model architectures beyond LLaMA.
About LLM-Pruner
horseee/LLM-Pruner
[NeurIPS 2023] LLM-Pruner: On the Structural Pruning of Large Language Models. Supports Llama-3/3.1, Llama-2, LLaMA, BLOOM, Vicuna, Baichuan, TinyLlama, and more.
This project helps machine learning engineers and researchers reduce the size of large language models (LLMs) such as Llama, BLOOM, and Vicuna. It takes an existing LLM as input and prunes unnecessary components while aiming to preserve its multi-task abilities. The output is a smaller, more efficient LLM that requires fewer computational resources, enabling easier deployment and faster inference.
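To make the idea of structural pruning concrete, here is a minimal, hypothetical sketch of the core operation: removing whole hidden units between two dense layers and shrinking every tensor that touches that dimension so the layers stay compatible. This is illustrative only and not LLM-Pruner's actual API; for simplicity it scores units by weight magnitude, whereas LLM-Pruner uses gradient-based importance estimates.

```python
import numpy as np

def prune_hidden_units(w1, b1, w2, keep_ratio=0.5):
    """Structurally prune the hidden dimension shared by two dense layers.

    w1: (hidden, in)  weights of the first layer
    b1: (hidden,)     bias of the first layer
    w2: (out, hidden) weights of the second layer

    Each hidden unit is scored by the L2 norm of its outgoing weights in
    w2 (a simple magnitude proxy; a stand-in for the gradient-based
    importance the real method estimates from data).
    """
    hidden = w1.shape[0]
    keep = max(1, int(hidden * keep_ratio))
    importance = np.linalg.norm(w2, axis=0)          # one score per hidden unit
    kept = np.sort(np.argsort(importance)[-keep:])   # indices of surviving units
    # Remove the same units from every tensor along the hidden dimension,
    # so the pruned layers remain shape-compatible end to end.
    return w1[kept], b1[kept], w2[:, kept]

rng = np.random.default_rng(0)
w1, b1, w2 = rng.normal(size=(8, 4)), rng.normal(size=8), rng.normal(size=(3, 8))
pw1, pb1, pw2 = prune_hidden_units(w1, b1, w2, keep_ratio=0.5)
print(pw1.shape, pb1.shape, pw2.shape)  # (4, 4) (4,) (3, 4)
```

Note the key property of structural (as opposed to unstructured) pruning: entire rows and columns disappear, so the resulting model is genuinely smaller and faster without any sparse-kernel support.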
About LLaMA-Pruning
horseee/LLaMA-Pruning
Structural Pruning for LLaMA