facebookresearch/LayerSkip
Code for "LayerSkip: Enabling Early Exit Inference and Self-Speculative Decoding", ACL 2024
LayerSkip speeds up text generation for developers deploying large language models (LLMs). Given a LayerSkip-trained LLM, it combines early-exit inference with self-speculative decoding to produce summaries, code, or answers to questions significantly faster. It is aimed at machine learning engineers and researchers who need to improve inference performance without sacrificing accuracy.
Use this if you are a machine learning engineer looking to accelerate the text generation speed of your deployed LLMs while maintaining output quality.
Not ideal if you work primarily on classification or True/False tasks: their outputs are a single token or label, so there is little autoregressive generation for speculative decoding to accelerate.
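The draft-and-verify idea behind self-speculative decoding can be illustrated with a toy sketch: a cheap "early-exit" draft model proposes several tokens, and the full model checks them in order, keeping the longest agreeing prefix. The `draft_next`/`full_next` functions below are stand-ins for illustration, not the repo's implementation.

```python
# Toy sketch of self-speculative decoding. A cheap draft model proposes
# n tokens; the full model verifies them, accepting the longest matching
# prefix plus one corrected token on the first mismatch.

def draft_tokens(prefix, n, draft_next):
    """Greedily propose n tokens with the cheap draft model."""
    out = list(prefix)
    for _ in range(n):
        out.append(draft_next(out))
    return out[len(prefix):]

def verify(prefix, proposed, full_next):
    """Check proposed tokens against the full model one by one."""
    accepted = []
    ctx = list(prefix)
    for tok in proposed:
        target = full_next(ctx)
        if target != tok:
            accepted.append(target)  # mismatch: keep the full model's token
            return accepted
        accepted.append(tok)
        ctx.append(tok)
    return accepted

# Deterministic toy "models": the draft counts up by 1; the full model
# counts up by 1 but never goes above 5, so they disagree at 6.
draft_next = lambda ctx: ctx[-1] + 1
full_next = lambda ctx: min(ctx[-1] + 1, 5)

proposal = draft_tokens([3], 4, draft_next)   # [4, 5, 6, 7]
print(verify([3], proposal, full_next))       # [4, 5, 5]
```

When draft and full model agree, several tokens are accepted per full-model pass, which is where the speedup comes from; long generations benefit most, which is why short classification-style outputs do not.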
Stars: 361
Forks: 36
Language: Python
License: —
Category: —
Last pushed: Feb 05, 2026
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/transformers/facebookresearch/LayerSkip"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
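For programmatic access, the same endpoint can be called from Python. The URL shape below is taken from the curl example above; the structure of the JSON response is not documented here, so this sketch only builds and fetches the URL.

```python
# Minimal client for the pt-edge quality endpoint. The URL pattern matches
# the curl example; response fields are undocumented here, so the caller
# is expected to inspect the returned JSON.
import json
from urllib.parse import quote
from urllib.request import urlopen

BASE = "https://pt-edge.onrender.com/api/v1/quality/transformers"

def quality_url(owner: str, repo: str) -> str:
    """Build the quality-API URL for a GitHub owner/repo pair."""
    return f"{BASE}/{quote(owner)}/{quote(repo)}"

def fetch_quality(owner: str, repo: str) -> dict:
    """Fetch and decode the JSON payload for one repository."""
    with urlopen(quality_url(owner, repo)) as resp:
        return json.load(resp)

print(quality_url("facebookresearch", "LayerSkip"))
# -> https://pt-edge.onrender.com/api/v1/quality/transformers/facebookresearch/LayerSkip
```

Under the stated limits, an unauthenticated caller gets 100 requests per day, so batch jobs over many repositories would need a free key.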
Related models
rasbt/LLMs-from-scratch
Implement a ChatGPT-like LLM in PyTorch from scratch, step by step
FareedKhan-dev/train-llm-from-scratch
A straightforward method for training your LLM, from downloading data to generating text.
kmeng01/rome
Locating and editing factual associations in GPT (NeurIPS 2022)
datawhalechina/llms-from-scratch-cn
With only basic Python, build a large language model from scratch; step-by-step builds of GLM4/Llama3/RWKV6 for a deep understanding of how large models work
geeks-of-data/knowledge-gpt
Extract knowledge from all information sources using gpt and other language models. Index and...