LLM-Pruner and LLM-Shearing

These are **competitors** — both implement structural pruning to reduce LLM size and latency, but LLM-Pruner offers a general pruning framework applicable to multiple architectures, while LLM-Shearing proposes a specific pre-training-aware pruning approach optimized for LLaMA models.

| Metric | LLM-Pruner | LLM-Shearing |
| --- | --- | --- |
| Overall score | 47 (Emerging) | 42 (Emerging) |
| Maintenance | 0/25 | 0/25 |
| Adoption | 10/25 | 10/25 |
| Maturity | 16/25 | 16/25 |
| Community | 21/25 | 16/25 |
| Stars | 1,109 | 642 |
| Forks | 130 | 57 |
| Downloads | — | — |
| Commits (30d) | 0 | 0 |
| Language | Python | Python |
| License | Apache-2.0 | MIT |
| Flags | Stale 6m · No Package · No Dependents | Stale 6m · No Package · No Dependents |

About LLM-Pruner

horseee/LLM-Pruner

[NeurIPS 2023] LLM-Pruner: On the Structural Pruning of Large Language Models. Support Llama-3/3.1, Llama-2, LLaMA, BLOOM, Vicuna, Baichuan, TinyLlama, etc.

This project helps machine learning engineers and researchers reduce the size of large language models (LLMs) such as Llama, BLOOM, and Vicuna. Taking an existing LLM as input, it prunes unnecessary structural components while aiming to preserve the model's multi-task abilities. The output is a smaller, more efficient LLM that requires fewer computational resources, enabling easier deployment and faster inference.
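Structural pruning, as opposed to unstructured weight sparsification, removes coupled groups of weights (e.g. a whole hidden channel, together with its incoming and outgoing connections) so the pruned model is genuinely smaller and faster. A minimal sketch of the idea on a toy two-layer MLP; the magnitude-based importance heuristic and all names here are illustrative, not LLM-Pruner's actual API:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy two-layer MLP: x -> W1 -> hidden -> W2 -> out
d_in, d_hidden, d_out = 8, 16, 8
W1 = rng.normal(size=(d_hidden, d_in))
W2 = rng.normal(size=(d_out, d_hidden))

def prune_hidden_channels(W1, W2, keep_ratio=0.5):
    """Structural pruning: drop whole hidden channels, removing each
    channel's W1 row and its matching W2 column together (a coupled group)."""
    # Illustrative importance score per hidden channel: combined L2 norm
    # of its incoming row in W1 and its outgoing column in W2.
    importance = np.linalg.norm(W1, axis=1) + np.linalg.norm(W2, axis=0)
    k = int(W1.shape[0] * keep_ratio)
    keep = np.sort(np.argsort(importance)[-k:])  # channels to keep
    return W1[keep, :], W2[:, keep]

W1_p, W2_p = prune_hidden_channels(W1, W2, keep_ratio=0.5)
print(W1_p.shape, W2_p.shape)  # hidden width halved: (8, 8) (8, 8)
```

Because the row and column are removed as a pair, the pruned matrices still compose into a valid (narrower) network, which is what lets pruned models run without sparse-kernel support.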

Tags: Large Language Models · Model Compression · Deep Learning · Deployment · AI Efficiency · Resource Optimization

About LLM-Shearing

princeton-nlp/LLM-Shearing

[ICLR 2024] Sheared LLaMA: Accelerating Language Model Pre-training via Structured Pruning

This project provides methods and pre-trained models for efficiently creating smaller, specialized large language models (LLMs). By 'shearing' or pruning an existing large model, you can significantly reduce its size and the computational resources needed for pre-training. It's ideal for AI/ML researchers and practitioners who want to develop cost-effective, high-performing small LLMs from larger base models.
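The economic argument behind shearing is that pruning a large checkpoint down to a smaller target architecture and then continuing pre-training is far cheaper than training the small model from scratch. A back-of-the-envelope parameter count makes the size gap concrete; the formula is the standard decoder-only estimate and the concrete dimensions are illustrative, not the paper's exact configurations:

```python
def transformer_params(n_layers, d_model, d_ffn, vocab=32000):
    """Rough decoder-only parameter count: attention (4 * d^2 per layer
    for the Q/K/V/O projections) + a LLaMA-style gated FFN (3 * d * d_ffn)
    + token embeddings. Biases and norm parameters are omitted."""
    attn = 4 * d_model * d_model
    ffn = 3 * d_model * d_ffn  # gated (SwiGLU-style) feed-forward
    embed = vocab * d_model
    return n_layers * (attn + ffn) + embed

# Illustrative source and pruned-target shapes
source = transformer_params(n_layers=32, d_model=4096, d_ffn=11008)
target = transformer_params(n_layers=24, d_model=2048, d_ffn=5504)
print(f"source ~ {source / 1e9:.1f}B, target ~ {target / 1e9:.1f}B")
```

With these shapes the source lands around 6.6B parameters and the pruned target around 1.3B, roughly a 5x reduction before any continued pre-training is spent recovering quality.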

Tags: large-language-models · model-optimization · deep-learning-research · computational-efficiency · natural-language-processing

Scores updated daily from GitHub, PyPI, and npm data.