evanatyourservice/llm-jax
Train a SmolLM-style LLM on FineWeb-Edu in JAX/Flax with an assortment of optimizers.
This project helps machine learning engineers pretrain small language models from scratch on the FineWeb-Edu dataset. It takes raw text data plus configuration settings for the model architecture and optimizer, and outputs trained model checkpoints. The primary users are ML engineers and researchers developing custom LLMs with an emphasis on training performance and efficiency.
No commits in the last 6 months.
Use this if you are an ML engineer looking to pretrain a SmolLM-style language model on a large text dataset using JAX/Flax, with a focus on exploring advanced optimizers like PSGD Kron for improved efficiency.
Not ideal if you need to perform inference, fine-tune an existing model, or prefer to work in PyTorch or other frameworks, as it's currently set up only for pretraining in JAX/Flax.
Stars: 18
Forks: 1
Language: Python
License: MIT
Category:
Last pushed: Jul 24, 2025
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/llm-tools/evanatyourservice/llm-jax"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000/day.
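The curl command above can also be issued from Python. A minimal sketch using only the standard library is below; the response schema is an assumption (the API's actual JSON fields are not documented here), so the live fetch is left commented out.

```python
import json
from urllib.request import urlopen

BASE = "https://pt-edge.onrender.com/api/v1/quality/llm-tools"

def quality_url(owner: str, repo: str) -> str:
    """Build the quality-data endpoint URL for a given repository."""
    return f"{BASE}/{owner}/{repo}"

url = quality_url("evanatyourservice", "llm-jax")
print(url)

# Live fetch (commented out to avoid a network call; the response
# structure is an assumption, not documented on this page):
# data = json.load(urlopen(url))
# print(data)
```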
Higher-rated alternatives
- axolotl-ai-cloud/axolotl: Go ahead and axolotl questions
- google/paxml: Pax is a Jax-based machine learning framework for training large scale models. Pax allows for...
- JosefAlbers/PVM: Phi-3.5 for Mac: Locally-run Vision and Language Models for Apple Silicon
- iamarunbrahma/finetuned-qlora-falcon7b-medical: Finetuning of Falcon-7B LLM using QLoRA on Mental Health Conversational Dataset
- h2oai/h2o-wizardlm: Open-Source Implementation of WizardLM to turn documents into Q:A pairs for LLM fine-tuning