sgl-project/SpecForge

Train speculative decoding models effortlessly and port them smoothly to SGLang serving.

Score: 79 / 100 (Verified)

For those working with large language models, SpecForge helps you train specialized speculative decoding draft models that can significantly speed up your main LLM's responses. You provide your target LLM, and it trains a lightweight draft model that predicts the target's outputs, ready to be deployed alongside it in the SGLang serving framework. This is for AI practitioners and researchers looking to optimize LLM inference performance.

729 stars. Actively maintained with 27 commits in the last 30 days. Available on PyPI.

Use this if you are a machine learning engineer or researcher looking to accelerate the inference speed of your large language models by training and deploying specialized speculative decoding models.

Not ideal if you're not already working with large language models or are not familiar with model training and deployment concepts.

LLM-optimization AI-inference model-training machine-learning-engineering large-language-models
No dependents.

Score breakdown:
- Maintenance: 20 / 25
- Adoption: 10 / 25
- Maturity: 24 / 25
- Community: 25 / 25


Stars: 729
Forks: 179
Language: Python
License: MIT
Last pushed: Mar 11, 2026
Commits (30d): 27

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/transformers/sgl-project/SpecForge"

Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
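The same endpoint can also be queried from Python. A minimal sketch using only the standard library; the structure of the returned JSON is an assumption based on the stats shown above, so inspect the real response before relying on specific field names:

```python
# Minimal sketch: query the quality API for a repository.
# No API key is needed for up to 100 requests/day.
import json
import urllib.request

BASE = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(ecosystem: str, owner: str, repo: str) -> str:
    """Build the quality-endpoint URL for a given repository."""
    return f"{BASE}/{ecosystem}/{owner}/{repo}"


def fetch_quality(ecosystem: str, owner: str, repo: str) -> dict:
    """Fetch and decode the quality JSON for a repository."""
    with urllib.request.urlopen(quality_url(ecosystem, owner, repo)) as resp:
        return json.load(resp)


if __name__ == "__main__":
    url = quality_url("transformers", "sgl-project", "SpecForge")
    print(url)
    # Uncomment to hit the live API (field names are assumptions):
    # data = fetch_quality("transformers", "sgl-project", "SpecForge")
    # print(data)
```

The URL-building helper is separated from the network call so the request target can be checked without spending one of the day's rate-limited requests.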