hao-ai-lab/Dynasor
[NeurIPS 2025] A simple extension to vLLM that speeds up reasoning models without training.
This tool helps developers and ML engineers make their large language models (LLMs) respond faster and more efficiently, especially on complex reasoning tasks. It takes an existing LLM deployment, such as one running on vLLM, and optimizes its inference speed without any retraining. The result is a significantly faster and more resource-efficient LLM, which is particularly useful for applications that need quick, well-reasoned responses.
224 stars. No commits in the last 6 months.
Use this if you are a machine learning engineer or developer looking to accelerate the performance of your LLM applications, especially those requiring complex reasoning, without investing time in model retraining or fine-tuning.
Not ideal if you need to fundamentally change an LLM's behavior or knowledge through training, rather than just optimizing its inference speed.
Stars: 224
Forks: 29
Language: Python
License: MIT
Category:
Last pushed: May 31, 2025
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/transformers/hao-ai-lab/Dynasor"
Open to everyone: 100 requests/day with no key needed, or get a free key for 1,000/day.
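The endpoint above appears to follow a simple path pattern of category, owner, and repository name. A minimal sketch of building such URLs programmatically, assuming that pattern holds for other repositories (the helper name is illustrative, not part of the API):

```python
BASE = "https://pt-edge.onrender.com/api/v1/quality"

def quality_url(category: str, owner: str, repo: str) -> str:
    # Assemble the quality-data URL for a given repository,
    # mirroring the curl example above.
    return f"{BASE}/{category}/{owner}/{repo}"

# Reproduces the documented endpoint for this repository:
print(quality_url("transformers", "hao-ai-lab", "Dynasor"))
```

From there, any HTTP client (curl, `requests`, etc.) can fetch the JSON within the stated rate limits.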
Higher-rated alternatives
cvs-health/uqlm
UQLM (Uncertainty Quantification for Language Models) is a Python package for UQ-based LLM...
PRIME-RL/TTRL
[NeurIPS 2025] TTRL: Test-Time Reinforcement Learning
sapientinc/HRM
Hierarchical Reasoning Model Official Release
tigerchen52/query_level_uncertainty
query-level uncertainty in LLMs
reasoning-survey/Awesome-Reasoning-Foundation-Models
✨✨Latest Papers and Benchmarks in Reasoning with Foundation Models