visresearch/SDMPrune
The official implementation of "SDMPrune: Self-Distillation MLP Pruning for Efficient Large Language Models"
This project helps machine learning engineers and researchers optimize large language models (LLMs) for efficiency. It takes an existing LLM, like LLaMA3.2-1B, and significantly reduces its size by pruning less critical parts, specifically the MLP modules. The output is a more compact LLM that retains strong performance on various natural language understanding tasks, making it suitable for deployment in resource-constrained environments.
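To make "pruning the MLP modules" concrete, here is a minimal NumPy sketch of structured pruning on one MLP block, using a simple weight-magnitude criterion. This is illustrative only: SDMPrune's actual importance score comes from self-distillation, not weight norms, and the shapes here are toy-sized.

```python
import numpy as np

# Illustrative structured pruning of an MLP's hidden neurons.
# NOTE: the magnitude-based score below is a stand-in; SDMPrune's
# real criterion is derived from self-distillation, not weight norms.

rng = np.random.default_rng(0)
d_model, d_hidden = 8, 16

W_up = rng.normal(size=(d_hidden, d_model))    # input -> hidden projection
W_down = rng.normal(size=(d_model, d_hidden))  # hidden -> output projection

# Score each hidden neuron by the L2 norm of its up-projection row.
scores = np.linalg.norm(W_up, axis=1)

# Keep the top 50% of neurons; drop the matching rows and columns
# so both projections shrink consistently (structured pruning).
keep = np.sort(np.argsort(scores)[d_hidden // 2:])
W_up_pruned = W_up[keep]          # shape (8, 8)
W_down_pruned = W_down[:, keep]   # shape (8, 8)

print(W_up_pruned.shape, W_down_pruned.shape)
```

Because whole neurons are removed (rather than individual weights being zeroed), the pruned matrices are genuinely smaller, which is what makes this style of pruning useful on memory-constrained hardware.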
No commits in the last 6 months.
Use this if you need to deploy large language models more efficiently on devices with limited memory or computational power, while maintaining high performance on tasks like question answering or commonsense reasoning.
Not ideal if your primary goal is to improve the model's capabilities beyond the original checkpoint, rather than to reduce its size and inference cost while preserving performance.
- Stars: 21
- Forks: —
- Language: Python
- License: —
- Category: —
- Last pushed: Jun 11, 2025
- Commits (30d): 0
Get this data via API:

```shell
curl "https://pt-edge.onrender.com/api/v1/quality/transformers/visresearch/SDMPrune"
```

Open to everyone: 100 requests/day with no key needed. Get a free key for 1,000 requests/day.
Higher-rated alternatives
peremartra/optipfair
Structured pruning and bias visualization for Large Language Models. Tools for LLM optimization...
VainF/Torch-Pruning
[CVPR 2023] DepGraph: Towards Any Structural Pruning; LLMs, Vision Foundation Models, etc.
horseee/LLM-Pruner
[NeurIPS 2023] LLM-Pruner: On the Structural Pruning of Large Language Models. Support...
CASIA-LMC-Lab/FLAP
[AAAI 2024] Fluctuation-based Adaptive Structured Pruning for Large Language Models
princeton-nlp/LLM-Shearing
[ICLR 2024] Sheared LLaMA: Accelerating Language Model Pre-training via Structured Pruning