princeton-nlp/LLM-Shearing
[ICLR 2024] Sheared LLaMA: Accelerating Language Model Pre-training via Structured Pruning
This project provides methods and pre-trained models for efficiently creating smaller, specialized large language models (LLMs). By 'shearing' (structured pruning) an existing large model and then continuing pre-training, you can produce a compact model at a fraction of the compute needed to train one from scratch. It's aimed at AI/ML researchers and practitioners who want cost-effective, high-performing small LLMs derived from larger base models.
642 stars. No commits in the last 6 months.
Use this if you need to create smaller, more efficient language models without incurring the massive pre-training costs of building one from scratch.
Not ideal if you are looking for a ready-to-use application or a no-code solution for deploying existing large language models.
Stars
642
Forks
57
Language
Python
License
MIT
Category
Last pushed
Mar 04, 2024
Commits (30d)
0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/transformers/princeton-nlp/LLM-Shearing"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
Higher-rated alternatives
peremartra/optipfair
Structured pruning and bias visualization for Large Language Models. Tools for LLM optimization...
VainF/Torch-Pruning
[CVPR 2023] DepGraph: Towards Any Structural Pruning; LLMs, Vision Foundation Models, etc.
horseee/LLM-Pruner
[NeurIPS 2023] LLM-Pruner: On the Structural Pruning of Large Language Models. Support...
CASIA-LMC-Lab/FLAP
[AAAI 2024] Fluctuation-based Adaptive Structured Pruning for Large Language Models
VITA-Group/LiGO
[ICLR 2023] "Learning to Grow Pretrained Models for Efficient Transformer Training" by Peihao...