princeton-nlp/LLM-Shearing

[ICLR 2024] Sheared LLaMA: Accelerating Language Model Pre-training via Structured Pruning

Quality score: 42 / 100 (Emerging)

This project provides methods and pre-trained models for efficiently creating smaller, specialized large language models (LLMs). By 'shearing' (structured pruning) an existing large model and then continuing pre-training, you can produce a compact model at a fraction of the compute cost of training one from scratch. It's aimed at AI/ML researchers and practitioners who want cost-effective, high-performing small LLMs derived from larger base models (see the loading sketch below).

642 stars. No commits in the last 6 months.

Use this if you need to create smaller, more efficient language models without incurring the massive pre-training costs of building one from scratch.

Not ideal if you are looking for a ready-to-use application or a no-code solution for deploying existing large language models.
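If you mainly want the resulting models rather than the pruning pipeline itself, here is a minimal Python sketch of loading a released Sheared-LLaMA checkpoint with Hugging Face transformers. The model ID is an assumption (it does not appear on this page), so verify the exact checkpoint names in the project's README.

from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumed model ID: the project publishes pruned checkpoints on the
# Hugging Face Hub, but check the repo README for the exact names.
model_id = "princeton-nlp/Sheared-LLaMA-1.3B"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

inputs = tokenizer("Structured pruning can", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=30)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))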

Topics: large-language-models, model-optimization, deep-learning-research, computational-efficiency, natural-language-processing
Flags: Stale (6 months), No Package, No Dependents
Score breakdown (sums to the 42/100 above):
Maintenance: 0 / 25
Adoption: 10 / 25
Maturity: 16 / 25
Community: 16 / 25

Stars: 642
Forks: 57
Language: Python
License: MIT
Last pushed: Mar 04, 2024
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/transformers/princeton-nlp/LLM-Shearing"

Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
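For programmatic access, here is a small Python sketch of the same request shown in the curl example above. The response schema is not documented on this page, so the code simply prints whatever JSON comes back.

import requests

# Same endpoint as the curl example; up to 100 requests/day without a key.
URL = "https://pt-edge.onrender.com/api/v1/quality/transformers/princeton-nlp/LLM-Shearing"

resp = requests.get(URL, timeout=10)
resp.raise_for_status()  # surface HTTP errors (e.g. rate limiting) early
print(resp.json())       # field names are undocumented here; inspect the output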