Guitaricet/relora

Official code for ReLoRA, from the paper "Stack More Layers Differently: High-Rank Training Through Low-Rank Updates"

Quality score: 41/100 (Emerging)

This tool helps machine learning engineers pre-train large language models efficiently. It takes preprocessed text data and a base model checkpoint, then trains the model with ReLoRA's restarted low-rank updates, which keep memory usage low while approximating high-rank training, and produces a pre-trained model ready for downstream fine-tuning. It's designed for ML engineers working on custom language models.
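
For orientation, here is a minimal PyTorch sketch of the restarted low-rank update idea behind ReLoRA. It is not the repo's actual implementation: the class name, rank, reset interval, and toy objective below are illustrative assumptions, not values from the paper or code.

    import torch
    import torch.nn as nn

    class ReLoRALinearSketch(nn.Module):
        # Wraps a frozen linear layer with a trainable low-rank update:
        # effective weight = W + B @ A, where A and B have rank r << dim.
        def __init__(self, base: nn.Linear, rank: int = 8):
            super().__init__()
            self.base = base
            self.base.requires_grad_(False)  # full-rank weights stay frozen
            out_f, in_f = base.weight.shape
            self.A = nn.Parameter(torch.randn(rank, in_f) * 0.01)
            self.B = nn.Parameter(torch.zeros(out_f, rank))  # zero init: no-op at start

        def forward(self, x):
            return self.base(x) + (x @ self.A.T) @ self.B.T

        @torch.no_grad()
        def merge_and_restart(self):
            # Fold the learned low-rank delta into the frozen weight, then
            # reinitialize A and B so the next cycle learns a new low-rank
            # direction. Summing many such deltas is how the overall update
            # becomes high-rank despite each step being low-rank.
            self.base.weight += self.B @ self.A
            nn.init.normal_(self.A, std=0.01)
            nn.init.zeros_(self.B)

    # Toy training loop with periodic merge-and-restart.
    layer = ReLoRALinearSketch(nn.Linear(512, 512), rank=8)
    opt = torch.optim.AdamW([layer.A, layer.B], lr=1e-3)
    reset_every = 200  # illustrative; the paper tunes this interval

    for step in range(1000):
        x = torch.randn(32, 512)
        loss = layer(x).pow(2).mean()  # dummy objective
        opt.zero_grad()
        loss.backward()
        opt.step()
        if (step + 1) % reset_every == 0:
            layer.merge_and_restart()
            # The paper partially resets optimizer state and restarts LR
            # warmup at each merge; recreating the optimizer is a cruder
            # stand-in for that here.
            opt = torch.optim.AdamW([layer.A, layer.B], lr=1e-3)

The paper also describes a full-rank warm-start phase before the first low-rank cycle; see the repo and paper for the actual training recipe.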

474 stars. No commits in the last 6 months.

Use this if you need to pre-train a large language model with limited GPU memory but still want to achieve high-rank training performance.

Not ideal if you want an out-of-the-box solution for inference or fine-tuning that avoids managing complex training configurations.

Topics: large-language-models, model-pretraining, deep-learning-optimization, natural-language-processing
Status: Stale (6 months), no package published, no known dependents

Score breakdown:
  Maintenance: 0/25
  Adoption: 10/25
  Maturity: 16/25
  Community: 15/25

Stars: 474
Forks: 42
Language: Jupyter Notebook
License: Apache-2.0
Last pushed: Apr 21, 2024
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/transformers/Guitaricet/relora"

Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000/day.
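
If you prefer Python over curl, the same endpoint can be queried as below. The response schema is not documented on this page, so this sketch just fetches and prints the raw JSON.

    import requests  # third-party: pip install requests

    # Same public endpoint as the curl example above; no key needed
    # for up to 100 requests/day.
    url = "https://pt-edge.onrender.com/api/v1/quality/transformers/Guitaricet/relora"
    resp = requests.get(url, timeout=10)
    resp.raise_for_status()
    print(resp.json())  # inspect the output to see the available fields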