facebookresearch/LayerSkip

Code for "LayerSkip: Enabling Early Exit Inference and Self-Speculative Decoding", ACL 2024

Quality score: 52 / 100 (Established)

LayerSkip helps developers who deploy large language models (LLMs) generate text faster and more efficiently. Given an existing LayerSkip-trained LLM, it produces text such as summaries, code, or answers to questions at significantly higher speed. It is aimed at machine learning engineers and researchers working with LLMs who need to improve inference performance without sacrificing accuracy.


Use this if you are a machine learning engineer looking to accelerate text generation in your deployed LLMs while maintaining output quality.

Not ideal if you are primarily interested in classification tasks or True/False questions, as these will not benefit from the speedup.

Tags: LLM deployment, NLP inference, model optimization, text generation, machine learning engineering
No package · No dependents
Maintenance: 10 / 25
Adoption: 10 / 25
Maturity: 16 / 25
Community: 16 / 25


Stars: 361
Forks: 36
Language: Python
License:
Last pushed: Feb 05, 2026
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/transformers/facebookresearch/LayerSkip"

Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
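For scripted use, the same endpoint can be called from Python. This is a minimal sketch: the URL pattern comes from the curl example above, but the `quality_url` / `fetch_quality` helper names and the assumption that the response is JSON are illustrative, not part of the documented API.

```python
import json
from urllib.request import urlopen

# Base path taken from the curl example above.
BASE = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(ecosystem: str, owner: str, repo: str) -> str:
    """Build the quality-report URL for a repository (helper name is hypothetical)."""
    return f"{BASE}/{ecosystem}/{owner}/{repo}"


def fetch_quality(ecosystem: str, owner: str, repo: str) -> dict:
    """Fetch and decode the report; assumes a JSON body and counts against the
    100 requests/day anonymous limit."""
    with urlopen(quality_url(ecosystem, owner, repo)) as resp:
        return json.load(resp)


url = quality_url("transformers", "facebookresearch", "LayerSkip")
# Same endpoint as the curl command shown above.
```

Calling `fetch_quality("transformers", "facebookresearch", "LayerSkip")` would perform the actual request; the response schema is not documented here, so inspect the returned dict before relying on specific keys.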