evanatyourservice/llm-jax

Train a SmolLM-style LLM on FineWeb-Edu in JAX/Flax with an assortment of optimizers.

Score: 29/100 (Experimental)

This project helps machine learning engineers pretrain small language models (LLMs) from scratch using the FineWeb-Edu dataset. It takes raw text data and configuration settings for model architecture and optimizers, outputting trained model checkpoints. The primary users are ML engineers or researchers focused on developing custom LLMs with high performance and efficiency.

No commits in the last 6 months.

Use this if you are an ML engineer looking to pretrain a SmolLM-style language model on a large text dataset using JAX/Flax, with a focus on exploring advanced optimizers like PSGD Kron for improved efficiency.

Not ideal if you need to perform inference, fine-tune an existing model, or prefer to work in PyTorch or other frameworks, as it's currently set up only for pretraining in JAX/Flax.
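To give a feel for the "advanced optimizers" angle, here is a minimal NumPy sketch of the Kronecker-factored preconditioning idea behind PSGD Kron. This is an illustration of the general technique only, not this repo's implementation; the function name and the choice of identity factors are assumptions for the example.

```python
import numpy as np

def kron_precondition(grad, Ql, Qr):
    """Apply a Kronecker-factored preconditioner to a matrix-shaped gradient.

    The full preconditioner is (Qr^T Qr) kron (Ql^T Ql); applied to a
    matrix gradient G this reduces to Ql^T Ql @ G @ Qr^T Qr, so the
    per-layer cost stays small compared to a dense preconditioner.
    """
    return Ql.T @ Ql @ grad @ Qr.T @ Qr

rng = np.random.default_rng(0)
G = rng.normal(size=(4, 3))  # gradient of a hypothetical 4x3 weight matrix
Ql = np.eye(4)               # left factor (identity -> plain gradient step)
Qr = np.eye(3)               # right factor

step = kron_precondition(G, Ql, Qr)
# with identity factors the preconditioned step equals the raw gradient
assert np.allclose(step, G)
```

In PSGD the Q factors are themselves adapted online from gradient statistics; with identity factors, as above, the update degenerates to plain SGD.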

language-model-pretraining machine-learning-engineering neural-network-training deep-learning-research natural-language-processing
Stale (6 months) · No package · No dependents
Maintenance: 2/25
Adoption: 6/25
Maturity: 16/25
Community: 5/25


Stars: 18
Forks: 1
Language: Python
License: MIT
Category: llm-fine-tuning
Last pushed: Jul 24, 2025
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/llm-tools/evanatyourservice/llm-jax"

Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
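For callers who prefer Python over curl, the same endpoint can be reached with the standard library. This is a sketch assuming only the URL shown above; the response's JSON shape is not documented here, so the actual fetch is left commented out (it needs network access and counts against the daily quota).

```python
import urllib.request
import json

BASE = "https://pt-edge.onrender.com/api/v1/quality/llm-tools"

def quality_url(owner, repo):
    # Build the per-repo quality endpoint shown in the curl example above.
    return f"{BASE}/{owner}/{repo}"

url = quality_url("evanatyourservice", "llm-jax")

# Uncomment to fetch (anonymous tier: 100 requests/day):
# with urllib.request.urlopen(url) as resp:
#     data = json.load(resp)
```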