horus-ai-labs/DistillFlow
Library for model distillation
This toolkit helps machine learning engineers and researchers make large language models (LLMs) more efficient and cost-effective. Given a large, resource-intensive teacher LLM, a smaller student LLM, and a dataset, it distills the teacher's capabilities into the student, producing a compact, specialized model that performs nearly as well while using far fewer computing resources. This is ideal for deploying LLMs in production environments where speed and cost are critical.
165 stars. No commits in the last 6 months.
Use this if you need to deploy a smaller, faster, and more affordable version of a large language model for practical applications like chatbots, content generation, or specialized text analysis.
Not ideal if you primarily need to train brand-new LLMs from scratch or are only looking to fine-tune an existing model without reducing its size and complexity.
Stars: 165
Forks: 8
Language: Python
License: Apache-2.0
Category:
Last pushed: Sep 06, 2025
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/transformers/horus-ai-labs/DistillFlow"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000/day.
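The curl command above follows a simple URL pattern: category, then repository owner, then repository name. A minimal Python sketch of the same request, assuming only that the endpoint returns JSON (the response schema is not documented here, so the parsing step is commented out):

```python
import json
from urllib.request import urlopen

BASE = "https://pt-edge.onrender.com/api/v1/quality"

def quality_url(category: str, owner: str, repo: str) -> str:
    """Build the quality-endpoint URL for a repository."""
    return f"{BASE}/{category}/{owner}/{repo}"

url = quality_url("transformers", "horus-ai-labs", "DistillFlow")
print(url)
# Uncomment to fetch the data (schema not documented here):
# data = json.load(urlopen(url))
```

The same pattern works for any repository in the index by swapping the owner and name.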
Higher-rated alternatives
scaleapi/llm-engine
Scale LLM Engine public repository
AGI-Arena/MARS
The official implementation of MARS: Unleashing the Power of Variance Reduction for Training Large Models
modelscope/easydistill
a toolkit on knowledge distillation for large language models
AGI-Edgerunners/LLM-Adapters
Code for our EMNLP 2023 Paper: "LLM-Adapters: An Adapter Family for Parameter-Efficient...
Wang-ML-Lab/bayesian-peft
Bayesian Low-Rank Adaptation of LLMs: BLoB [NeurIPS 2024] and TFB [NeurIPS 2025]