ruimalheiro/training-custom-llama

Llama-style transformer in PyTorch with multi-node / multi-GPU training. Includes pretraining, fine-tuning, DPO, LoRA, and knowledge distillation. Scripts for dataset mixing and training from scratch.

Emerging

This project helps machine learning engineers and researchers train custom large language models (LLMs) from scratch or fine-tune existing ones. You supply text datasets and configuration settings, and the training scripts produce a specialized LLM ready for deployment. It is aimed at individuals and teams who work on large-scale NLP tasks and need high-performance computing.

Use this if you need to build or adapt a Llama-style language model using your own datasets and require multi-node or multi-GPU training capabilities.

Not ideal if you want a low-code way to apply pre-trained models and have no need to customize the architecture or training methods, or to manage distributed computing infrastructure.
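The description mentions scripts for dataset mixing. As a rough illustration of the idea only, the sketch below draws training examples from several sources in proportion to per-source weights. The function name `mix_datasets`, its parameters, and the wrap-around behaviour are hypothetical and are not taken from the repository, whose scripts may work quite differently.

```python
import random
from typing import Dict, Iterator, List, Tuple


def mix_datasets(
    sources: Dict[str, List[str]],
    weights: Dict[str, float],
    num_samples: int,
    seed: int = 0,
) -> Iterator[Tuple[str, str]]:
    """Yield (source_name, example) pairs, choosing a source per draw by weight.

    Hypothetical helper for illustration; not the repository's API.
    """
    if set(sources) != set(weights):
        raise ValueError("sources and weights must have the same keys")
    if any(len(data) == 0 for data in sources.values()):
        raise ValueError("every source must contain at least one example")

    rng = random.Random(seed)          # local RNG so the mix is reproducible
    names = sorted(sources)            # fixed order keeps results deterministic
    probs = [weights[name] for name in names]
    cursors = {name: 0 for name in names}

    for _ in range(num_samples):
        name = rng.choices(names, weights=probs, k=1)[0]
        data = sources[name]
        # Assumption: a small source is repeated (wrapped) rather than dropped.
        example = data[cursors[name] % len(data)]
        cursors[name] += 1
        yield name, example


if __name__ == "__main__":
    corpora = {"web": ["w0", "w1", "w2"], "code": ["c0"]}
    for source, text in mix_datasets(corpora, {"web": 0.75, "code": 0.25}, 5):
        print(source, text)
```

Sampling a source per draw keeps the expected mixture at the requested ratio regardless of how large each source is; the trade-off is that small sources get repeated more often.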

Large Language Models, NLP Model Training, Distributed Machine Learning, AI Research, Custom Model Development

No Package No Dependents

Maintenance 10 / 25

Adoption 6 / 25

Maturity 16 / 25

Community 11 / 25

Language: Python

License: Apache-2.0

Higher-rated alternatives

unslothai/unsloth

Fine-tuning & Reinforcement Learning for LLMs. 🦥 Train OpenAI gpt-oss, DeepSeek, Qwen, Llama,...

huggingface/peft

🤗 PEFT: State-of-the-art Parameter-Efficient Fine-Tuning.

modelscope/ms-swift

Use PEFT or Full-parameter to CPT/SFT/DPO/GRPO 600+ LLMs (Qwen3.5, DeepSeek-R1, GLM-5,...

oumi-ai/oumi

Easily fine-tune, evaluate and deploy gpt-oss, Qwen3, DeepSeek-R1, or any open source LLM / VLM!

linkedin/Liger-Kernel

Efficient Triton Kernels for LLM Training
