CASE-Lab-UMD/LLM-Drop
The official implementation of the TMLR paper "Uncovering the Redundancy in Transformers via a Unified Study of Layer Dropping".
This project helps machine learning engineers and researchers optimize large language models (LLMs) by reducing their size and computational demands. It takes an existing Transformer-based LLM, identifies redundant components, and removes them to create a smaller, faster model while maintaining performance. This is ideal for those who deploy or fine-tune LLMs and need to make them more efficient.
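The general idea behind layer dropping can be sketched in a few lines: score each Transformer block by how little it changes its input (for example, by the cosine similarity between the block's input and output hidden states) and drop the blocks with the highest similarity. This is a minimal illustrative sketch, not the repository's actual implementation; the toy vector "layers" and the top-k selection rule here are assumptions for demonstration only.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two plain-Python vectors (no dependencies)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def redundancy_scores(layer_ios):
    """layer_ios: one (input_vector, output_vector) pair per block.
    A block whose output is nearly identical to its input is a drop candidate."""
    return [cosine_similarity(x, y) for x, y in layer_ios]

def layers_to_drop(layer_ios, k):
    """Indices of the k most redundant blocks (highest input/output similarity)."""
    scores = redundancy_scores(layer_ios)
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]

# Toy example: block 1 barely changes its input, so it is the first drop candidate.
ios = [
    ([1.0, 0.0], [0.0, 1.0]),   # block 0: output orthogonal to input (big change)
    ([1.0, 0.0], [1.0, 0.01]),  # block 1: near-identity mapping (redundant)
]
print(layers_to_drop(ios, 1))  # → [1]
```

In a real model the vectors would be hidden states collected on a calibration set, and the dropped components could be whole blocks or just their attention sublayers.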
Use this if you are a machine learning engineer or researcher looking to make your large language models run faster and use less memory without significantly sacrificing accuracy.
Not ideal if you are an end-user without deep technical expertise in large language model architecture and optimization.
Stars: 189
Forks: 24
Language: Python
License: Apache-2.0
Category:
Last pushed: Mar 06, 2026
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/transformers/CASE-Lab-UMD/LLM-Drop"
Open to everyone: 100 requests/day with no key needed. Get a free key for 1,000 requests/day.
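The same endpoint can be called from Python with the standard library. Only the URL shape comes from the `curl` example above; the response schema is not documented here, so the JSON decoding step is an assumption.

```python
import json
import urllib.request

API_BASE = "https://pt-edge.onrender.com/api/v1/quality"

def quality_url(category: str, owner: str, repo: str) -> str:
    """Build the API URL for a repository, matching the curl example above."""
    return f"{API_BASE}/{category}/{owner}/{repo}"

def fetch_quality(category: str, owner: str, repo: str) -> dict:
    """Fetch and decode the JSON payload (response fields are not documented here)."""
    with urllib.request.urlopen(quality_url(category, owner, repo)) as resp:
        return json.load(resp)

print(quality_url("transformers", "CASE-Lab-UMD", "LLM-Drop"))
# → https://pt-edge.onrender.com/api/v1/quality/transformers/CASE-Lab-UMD/LLM-Drop
```

Unauthenticated calls count against the 100-requests/day limit noted above.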
Related models
AI-Hypercomputer/maxtext
A simple, performant and scalable Jax LLM!
rasbt/reasoning-from-scratch
Implement a reasoning LLM in PyTorch from scratch, step by step
mindspore-lab/mindnlp
MindSpore + 🤗Huggingface: Run any Transformers/Diffusers model on MindSpore with seamless...
mosaicml/llm-foundry
LLM training code for Databricks foundation models
rickiepark/llm-from-scratch
Code repository for the book *Build an LLM from Scratch* (Gilbut, 2025)