LLM-Shearing and LLaMA-Pruning

These are competing projects: both implement structured pruning to reduce LLaMA model size and latency. LLM-Shearing is the more established academic solution (ICLR 2024 publication, roughly 10x more stars), while LLaMA-Pruning offers an alternative implementation of similar structural pruning techniques.

| Metric | LLM-Shearing | LLaMA-Pruning |
|---|---|---|
| Overall score | 42 (Emerging) | 33 (Emerging) |
| Maintenance | 0/25 | 0/25 |
| Adoption | 10/25 | 8/25 |
| Maturity | 16/25 | 16/25 |
| Community | 16/25 | 9/25 |
| Stars | 642 | 54 |
| Forks | 57 | 4 |
| Downloads | — | — |
| Commits (30d) | 0 | 0 |
| Language | Python | Python |
| License | MIT | GPL-3.0 |
| Flags | Stale 6m, No Package, No Dependents | Archived, Stale 6m, No Package, No Dependents |

About LLM-Shearing

princeton-nlp/LLM-Shearing

[ICLR 2024] Sheared LLaMA: Accelerating Language Model Pre-training via Structured Pruning

This project provides methods and pre-trained models for efficiently creating smaller, specialized large language models (LLMs). By 'shearing' or pruning an existing large model, you can significantly reduce its size and the computational resources needed for pre-training. It's ideal for AI/ML researchers and practitioners who want to develop cost-effective, high-performing small LLMs from larger base models.

large-language-models model-optimization deep-learning-research computational-efficiency natural-language-processing
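To make the idea of structured pruning concrete, here is a minimal toy sketch: instead of zeroing individual weights, entire hidden units are removed, which shrinks the actual weight matrices and therefore compute. This is an illustrative example only, not the LLM-Shearing algorithm (which learns pruning masks during continued pre-training); the importance heuristic here (combined L2 norm of a unit's incoming and outgoing weights) is an assumption chosen for simplicity.

```python
import numpy as np

def prune_hidden_units(w_in, w_out, keep_ratio=0.5):
    """Structured pruning sketch for one MLP layer.

    w_in:  (hidden, in_dim)  - rows are hidden units' incoming weights
    w_out: (out_dim, hidden) - columns are hidden units' outgoing weights

    Removes whole hidden units (rows of w_in, columns of w_out) with the
    lowest combined L2-norm importance, so the layer genuinely shrinks.
    """
    # Importance of each hidden unit: product of incoming and outgoing norms
    importance = np.linalg.norm(w_in, axis=1) * np.linalg.norm(w_out, axis=0)
    n_keep = max(1, int(len(importance) * keep_ratio))
    # Indices of the most important units, kept in original order
    keep = np.sort(np.argsort(importance)[-n_keep:])
    return w_in[keep, :], w_out[:, keep]

rng = np.random.default_rng(0)
w_in = rng.standard_normal((16, 10))   # 16 hidden units, input dim 10
w_out = rng.standard_normal((6, 16))   # output dim 6
p_in, p_out = prune_hidden_units(w_in, w_out, keep_ratio=0.5)
print(p_in.shape, p_out.shape)  # (8, 10) (6, 8)
```

The key point, shared by both projects, is that removing entire structures (hidden units, attention heads, layers) yields real latency and memory savings on standard hardware, unlike unstructured sparsity.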

About LLaMA-Pruning

horseee/LLaMA-Pruning

Structural Pruning for LLaMA

Scores updated daily from GitHub, PyPI, and npm data.