facebookresearch/LayerSkip

Code for "LayerSkip: Enabling Early Exit Inference and Self-Speculative Decoding", ACL 2024

Quality score: 52 / 100 (Established)

LayerSkip helps developers who deploy large language models (LLMs) generate text faster and more efficiently. Given an existing LayerSkip-trained LLM, it produces text such as summaries, code, or answers to questions at significantly higher speed. It is aimed at machine learning engineers and researchers working with LLMs who need to improve inference performance without sacrificing accuracy.


Use this if you are a machine learning engineer looking to accelerate text generation in your deployed LLMs while maintaining output quality.

Not ideal if you are primarily interested in classification tasks or True/False questions, as these will not benefit from the speedup.

Tags: LLM deployment, NLP inference, model optimization, text generation, machine learning engineering
No package · No dependents
Maintenance: 10 / 25
Adoption: 10 / 25
Maturity: 16 / 25
Community: 16 / 25


Stars: 361
Forks: 36
Language: Python
License:
Last pushed: Feb 05, 2026
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/transformers/facebookresearch/LayerSkip"

Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
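For scripted use, the same endpoint can be called from Python. This is a minimal sketch: the URL pattern comes from the curl example above, but the `quality_url` / `fetch_quality` helper names and the assumption that the response is JSON are illustrative, not part of the documented API.

```python
import json
from urllib.request import urlopen

# Base path taken from the curl example above.
BASE = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(ecosystem: str, owner: str, repo: str) -> str:
    """Build the quality-report URL for a repository (helper name is hypothetical)."""
    return f"{BASE}/{ecosystem}/{owner}/{repo}"


def fetch_quality(ecosystem: str, owner: str, repo: str) -> dict:
    """Fetch and decode the report; assumes a JSON body and counts against the
    100 requests/day anonymous limit."""
    with urlopen(quality_url(ecosystem, owner, repo)) as resp:
        return json.load(resp)


url = quality_url("transformers", "facebookresearch", "LayerSkip")
# Same endpoint as the curl command shown above.
```

Calling `fetch_quality("transformers", "facebookresearch", "LayerSkip")` would perform the actual request; the response schema is not documented here, so inspect the returned dict before relying on specific keys.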