hao-ai-lab/DistCA
Efficient Long-context Language Model Training by Core Attention Disaggregation
This system helps AI researchers and deep learning engineers train large language models (LLMs) more efficiently on very long input sequences. By disaggregating the core attention computation from the rest of the training pipeline, it balances work across GPUs and produces a faster, more scalable training process, letting you build more capable long-context models without excessive hardware or time.
Use this if you are training large language models with extremely long input contexts and are struggling with slow training times, workload imbalances across GPUs, or high communication overhead.
Not ideal if you are working with shorter context lengths or do not require highly distributed training across many GPUs; in those cases the system's added complexity is unlikely to yield significant benefits.
Stars: 93
Forks: 7
Language: Python
License: —
Category: —
Last pushed: Mar 05, 2026
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/transformers/hao-ai-lab/DistCA"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
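For scripted access, the endpoint above can be called from Python as well as curl. A minimal sketch, assuming only the URL pattern shown on this page; the JSON field names returned by the API are not documented here, so the fetch helper simply decodes whatever the server sends.

```python
# Sketch of calling the quality API programmatically.
# Only the URL pattern is taken from this page; the response
# schema is an assumption and is returned as an untyped dict.
import json
import urllib.request

API_BASE = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(ecosystem: str, repo: str) -> str:
    """Build the API URL for a repo, e.g. ('transformers', 'hao-ai-lab/DistCA')."""
    return f"{API_BASE}/{ecosystem}/{repo}"


def fetch_quality(ecosystem: str, repo: str) -> dict:
    """Fetch and decode the JSON response (requires network access;
    rate-limited to 100 requests/day without an API key)."""
    with urllib.request.urlopen(quality_url(ecosystem, repo)) as resp:
        return json.load(resp)


if __name__ == "__main__":
    print(quality_url("transformers", "hao-ai-lab/DistCA"))
```

With a free API key, the same pattern applies at 1,000 requests/day; how the key is passed (header vs. query parameter) is not specified on this page.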
Higher-rated alternatives
ZHZisZZ/dllm
dLLM: Simple Diffusion Language Modeling
pengzhangzhi/Open-dLLM
Open diffusion language model for code generation — releasing pretraining, evaluation,...
EnnengYang/Awesome-Model-Merging-Methods-Theories-Applications
Model Merging in LLMs, MLLMs, and Beyond: Methods, Theories, Applications and Opportunities. ACM...
THUDM/LongWriter
[ICLR 2025] LongWriter: Unleashing 10,000+ Word Generation from Long Context LLMs
AIoT-MLSys-Lab/SVD-LLM
[ICLR 2025🔥] SVD-LLM & [NAACL 2025🔥] SVD-LLM V2