SafeAILab/EAGLE
Official Implementation of EAGLE-1 (ICML'24), EAGLE-2 (EMNLP'24), and EAGLE-3 (NeurIPS'25).
EAGLE speeds up text generation in large language models (LLMs), such as responses in a chatbot or content for marketing. It takes an existing LLM and produces a much faster version that generates exactly the same text. It is aimed at engineers and ML specialists deploying and managing LLMs.
Use this if you need to significantly speed up the text generation of your large language models while maintaining the quality and consistency of the output, especially on less powerful GPUs.
Not ideal if you are looking for a solution to improve the accuracy or factual correctness of your LLM's responses, as this tool focuses solely on inference speed.
Stars
2,213
Forks
260
Language
Python
License
—
Category
Last pushed
Feb 20, 2026
Commits (30d)
0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/transformers/SafeAILab/EAGLE"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
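The curl call above can also be scripted. Below is a minimal Python sketch that builds the same endpoint URL for any repository and parses a response; the JSON field names (`stars`, `forks`, `language`, `last_pushed`, `commits_30d`) are an assumption modeled on the stats shown on this page, not a documented schema.

```python
import json
from urllib.parse import quote

# Base of the pt-edge quality API, as given in the curl example above.
BASE = "https://pt-edge.onrender.com/api/v1/quality/transformers"

def quality_url(owner: str, repo: str) -> str:
    """Build the quality-data URL for a given GitHub owner/repo."""
    return f"{BASE}/{quote(owner)}/{quote(repo)}"

# Hypothetical response body, mirroring the fields listed on this page.
# The real API may use different field names.
sample = json.loads(
    '{"stars": 2213, "forks": 260, "language": "Python",'
    ' "last_pushed": "2026-02-20", "commits_30d": 0}'
)

print(quality_url("SafeAILab", "EAGLE"))
print(f"{sample['stars']} stars, {sample['forks']} forks")
```

To fetch live data, pass the URL to any HTTP client (e.g. `urllib.request.urlopen` or `requests.get`); with a free key, 1,000 requests/day are allowed instead of 100.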
Related models
sgl-project/SpecForge
Train speculative decoding models effortlessly and port them smoothly to SGLang serving.
structuredllm/syncode
Efficient and general syntactical decoding for Large Language Models
romsto/Speculative-Decoding
Implementation of the paper Fast Inference from Transformers via Speculative Decoding, Leviathan...
hao-ai-lab/JacobiForcing
Jacobi Forcing: Fast and Accurate Diffusion-style Decoding
kssteven418/BigLittleDecoder
[NeurIPS'23] Speculative Decoding with Big Little Decoder