declare-lab/della

DELLA-Merging: Reducing Interference in Model Merging through Magnitude-Based Sampling

Experimental

This tool helps AI engineers combine several specialized large language models (LLMs) into a single, more versatile model without needing extensive new training. You input your existing, fine-tuned LLMs that excel in specific areas (like math, coding, or general instructions), and it outputs a new, merged LLM capable of handling multiple tasks effectively. This is for machine learning practitioners or researchers who manage and deploy LLMs.
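The listing does not describe the algorithm beyond the title's phrase "magnitude-based sampling", so the following is only a rough sketch of what such a merge might look like, not the repository's actual code or API. It assumes each model is a flat list of floats, that the merge works on deltas (fine-tuned weights minus base weights), that smaller-magnitude deltas are dropped with higher probability, and that surviving deltas are rescaled and averaged where their signs agree. The names `magnitude_based_drop`, `merge_models`, `p_min`, and `p_max` are hypothetical.

```python
import random


def magnitude_based_drop(delta, p_min, p_max, rng):
    """Randomly zero out entries of `delta`; small magnitudes are dropped more often.

    Hypothetical helper: the drop probability falls linearly from p_max
    (smallest magnitude) to p_min (largest magnitude). Survivors are
    rescaled by 1 / (1 - p_drop) to keep their expected value unchanged.
    """
    n = len(delta)
    # Indices ordered from smallest to largest magnitude.
    order = sorted(range(n), key=lambda i: abs(delta[i]))
    out = [0.0] * n
    for rank, i in enumerate(order):
        frac = rank / (n - 1) if n > 1 else 0.0
        p_drop = p_max - (p_max - p_min) * frac
        if rng.random() >= p_drop:  # keep this entry
            out[i] = delta[i] / (1.0 - p_drop)
    return out


def merge_models(base, finetuned, p_min=0.1, p_max=0.5, seed=0):
    """Merge several fine-tuned weight vectors into one (illustrative only)."""
    rng = random.Random(seed)
    # Work on deltas relative to the shared base model (an assumption).
    deltas = [[f - b for f, b in zip(model, base)] for model in finetuned]
    pruned = [magnitude_based_drop(d, p_min, p_max, rng) for d in deltas]

    merged = []
    for j, b in enumerate(base):
        vals = [p[j] for p in pruned]
        # Pick the dominant sign, then average only the deltas that agree with it.
        sign = 1 if sum(vals) >= 0 else -1
        agree = [v for v in vals if v * sign > 0]
        merged.append(b + (sum(agree) / len(agree) if agree else 0.0))
    return merged


if __name__ == "__main__":
    base = [0.0, 0.0, 0.0]
    math_model = [0.9, -0.1, 0.05]
    code_model = [0.7, 0.4, -0.02]
    print(merge_models(base, [math_model, code_model]))
```

Real checkpoints would be tensors per layer rather than flat lists, and the repository may choose probabilities, signs, and scaling differently; consult its source before relying on any of these details.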

No commits in the last 6 months.

Use this if you need to consolidate multiple domain-specific large language models into one efficient model to reduce deployment costs or improve multi-task performance.

Not ideal if you're looking for a tool to train a large language model from scratch or to fine-tune a model on a completely new dataset.

large-language-models model-optimization multi-task-learning AI-model-deployment machine-learning-engineering

No License, Stale 6m, No Package, No Dependents

Maintenance 0 / 25

Adoption 7 / 25

Maturity 8 / 25

Community 8 / 25

Language

Python

License

—

Higher-rated alternatives

ZHZisZZ/dllm

dLLM: Simple Diffusion Language Modeling

pengzhangzhi/Open-dLLM

Open diffusion language model for code generation — releasing pretraining, evaluation,...

EnnengYang/Awesome-Model-Merging-Methods-Theories-Applications

Model Merging in LLMs, MLLMs, and Beyond: Methods, Theories, Applications and Opportunities. ACM...

THUDM/LongWriter

[ICLR 2025] LongWriter: Unleashing 10,000+ Word Generation from Long Context LLMs

AIoT-MLSys-Lab/SVD-LLM

[ICLR 2025🔥] SVD-LLM & [NAACL 2025🔥] SVD-LLM V2
