LLM-Shearing and LLaMA-Pruning
These projects are alternatives: both implement structured pruning to reduce LLaMA model size and inference latency. LLM-Shearing is the more established academic solution (published at ICLR 2024, with roughly 10x more GitHub stars), while LLaMA-Pruning offers an alternative implementation of similar structural pruning techniques.
About LLM-Shearing
princeton-nlp/LLM-Shearing
[ICLR 2024] Sheared LLaMA: Accelerating Language Model Pre-training via Structured Pruning
This project provides methods and pre-trained models for efficiently creating smaller, specialized large language models (LLMs). By 'shearing' or pruning an existing large model, you can significantly reduce its size and the computational resources needed for pre-training. It's ideal for AI/ML researchers and practitioners who want to develop cost-effective, high-performing small LLMs from larger base models.
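To make the idea of structured pruning concrete, here is a minimal sketch of one common variant: magnitude-based removal of whole output channels (rows) from a weight matrix. This is an illustrative toy, not the actual method used by LLM-Shearing or LLaMA-Pruning (which prune coordinated structures such as attention heads, hidden dimensions, and layers, and in LLM-Shearing's case learn the pruning masks); the function name and the L2-norm criterion are assumptions for the example.

```python
import numpy as np

def prune_rows(weight: np.ndarray, keep_ratio: float) -> np.ndarray:
    """Structured pruning sketch: keep only the output channels (rows)
    with the largest L2 norms, shrinking the layer's actual shape
    rather than just zeroing entries (unlike unstructured pruning)."""
    n_keep = max(1, int(weight.shape[0] * keep_ratio))
    norms = np.linalg.norm(weight, axis=1)           # importance score per row
    keep = np.sort(np.argsort(norms)[-n_keep:])      # top-k rows, original order
    return weight[keep]

# Toy 8x4 "layer": pruning to 50% keeps the 4 highest-norm rows,
# so the dense matrix genuinely becomes 4x4-sized work at inference time.
w = np.arange(32, dtype=float).reshape(8, 4)
pruned = prune_rows(w, 0.5)
print(pruned.shape)  # (4, 4)
```

Because entire rows are removed, the pruned matrix is physically smaller, which is what yields real latency and memory savings; a full pipeline would also slice the matching input dimension of the next layer to keep shapes consistent.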
About LLaMA-Pruning
horseee/LLaMA-Pruning
Structural Pruning for LLaMA