Guitaricet/relora
Official code for ReLoRA from the paper "Stack More Layers Differently: High-Rank Training Through Low-Rank Updates"
This tool helps machine learning engineers working on custom language models pre-train them efficiently. It takes preprocessed text data and a base model checkpoint, then trains with low-rank (LoRA-style) updates that are periodically merged back into the full weights, keeping memory usage low while approaching full-rank training quality; a minimal sketch of the mechanic appears after the usage notes below. The output is a pre-trained model ready for fine-tuning on downstream tasks.
474 stars. No commits in the last 6 months.
Use this if you need to pre-train a large language model with limited GPU memory but still want performance comparable to full-rank training.
Not a good fit if you want an out-of-the-box solution for inference or fine-tuning and would rather not manage complex training configurations.
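The core mechanic, as described in the paper, is to train cheap low-rank factors, periodically merge them into the frozen full-rank weights, and reinitialize the factors so each cycle learns a new low-rank direction; the sum of many low-rank updates can be high-rank. The sketch below illustrates only this idea. The class name ReLoRALinear, the rank, the restart interval, the stand-in loss, and the plain optimizer rebuild are illustrative assumptions, not this repository's API (the real method also prunes most optimizer state and uses a jagged warm-restart learning-rate schedule).

```python
import torch
import torch.nn as nn

class ReLoRALinear(nn.Module):
    """Frozen full-rank weight plus a trainable low-rank update (delta_W = B @ A).
    Illustrative sketch only, not this repository's API."""

    def __init__(self, in_features: int, out_features: int, rank: int = 8):
        super().__init__()
        self.weight = nn.Parameter(torch.empty(out_features, in_features))
        nn.init.kaiming_uniform_(self.weight)
        self.weight.requires_grad_(False)        # base weights stay frozen
        self.lora_A = nn.Parameter(torch.empty(rank, in_features))
        self.lora_B = nn.Parameter(torch.zeros(out_features, rank))
        nn.init.normal_(self.lora_A, std=0.02)   # B is zero, so delta_W starts at zero

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return x @ (self.weight + self.lora_B @ self.lora_A).T

    @torch.no_grad()
    def merge_and_reinit(self):
        """Fold the learned low-rank update into the frozen weight, then
        reinitialize the factors so the next cycle explores a new subspace."""
        self.weight += self.lora_B @ self.lora_A
        nn.init.normal_(self.lora_A, std=0.02)
        self.lora_B.zero_()

# Training loop with periodic restarts; rebuilding the optimizer is a crude
# stand-in for the paper's partial optimizer-state reset.
layer = ReLoRALinear(64, 64, rank=8)
opt = torch.optim.AdamW([layer.lora_A, layer.lora_B], lr=1e-3)
for step in range(1, 301):
    x = torch.randn(16, 64)
    loss = layer(x).pow(2).mean()                # stand-in objective
    opt.zero_grad()
    loss.backward()
    opt.step()
    if step % 100 == 0:                          # restart interval is a hyperparameter
        layer.merge_and_reinit()
        opt = torch.optim.AdamW([layer.lora_A, layer.lora_B], lr=1e-3)
```

Because lora_B starts at zero, each restart begins as a no-op and training resumes smoothly from the merged weights.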
Stars: 474
Forks: 42
Language: Jupyter Notebook
License: Apache-2.0
Category: transformers
Last pushed: Apr 21, 2024
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/transformers/Guitaricet/relora"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
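For scripted access, the same endpoint can be queried from Python. A minimal sketch using the requests library, assuming only what the curl example above shows (the response schema and the header for keyed access aren't documented on this page, so the script just prints whatever JSON comes back):

```python
import requests

# Same endpoint as the curl example above; unauthenticated access is
# limited to 100 requests/day per the note above.
URL = "https://pt-edge.onrender.com/api/v1/quality/transformers/Guitaricet/relora"

resp = requests.get(URL, timeout=10)
resp.raise_for_status()   # surface rate-limit or server errors loudly
print(resp.json())        # schema undocumented here; inspect the fields
```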
Higher-rated alternatives
unslothai/unsloth
Fine-tuning & Reinforcement Learning for LLMs. 🦥 Train OpenAI gpt-oss, DeepSeek, Qwen, Llama,...
huggingface/peft
🤗 PEFT: State-of-the-art Parameter-Efficient Fine-Tuning.
modelscope/ms-swift
Use PEFT or Full-parameter to CPT/SFT/DPO/GRPO 600+ LLMs (Qwen3.5, DeepSeek-R1, GLM-5,...
oumi-ai/oumi
Easily fine-tune, evaluate and deploy gpt-oss, Qwen3, DeepSeek-R1, or any open source LLM / VLM!
linkedin/Liger-Kernel
Efficient Triton Kernels for LLM Training