evanatyourservice/llm-jax

Train a SmolLM-style LLM on FineWeb-Edu in JAX/Flax with an assortment of optimizers.

Score: 29/100 (Experimental)

This project helps machine learning engineers pretrain small language models (LLMs) from scratch using the FineWeb-Edu dataset. It takes raw text data and configuration settings for model architecture and optimizers, outputting trained model checkpoints. The primary users are ML engineers or researchers focused on developing custom LLMs with high performance and efficiency.

No commits in the last 6 months.

Use this if you are an ML engineer looking to pretrain a SmolLM-style language model on a large text dataset using JAX/Flax, with a focus on exploring advanced optimizers like PSGD Kron for improved efficiency.

Not ideal if you need to perform inference, fine-tune an existing model, or prefer to work in PyTorch or other frameworks, as it's currently set up only for pretraining in JAX/Flax.
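To give a feel for the "advanced optimizers" angle, here is a minimal NumPy sketch of the Kronecker-factored preconditioning idea behind PSGD Kron. This is an illustration of the general technique only, not this repo's implementation; the function name and the choice of identity factors are assumptions for the example.

```python
import numpy as np

def kron_precondition(grad, Ql, Qr):
    """Apply a Kronecker-factored preconditioner to a matrix-shaped gradient.

    The full preconditioner is (Qr^T Qr) kron (Ql^T Ql); applied to a
    matrix gradient G this reduces to Ql^T Ql @ G @ Qr^T Qr, so the
    per-layer cost stays small compared to a dense preconditioner.
    """
    return Ql.T @ Ql @ grad @ Qr.T @ Qr

rng = np.random.default_rng(0)
G = rng.normal(size=(4, 3))  # gradient of a hypothetical 4x3 weight matrix
Ql = np.eye(4)               # left factor (identity -> plain gradient step)
Qr = np.eye(3)               # right factor

step = kron_precondition(G, Ql, Qr)
# with identity factors the preconditioned step equals the raw gradient
assert np.allclose(step, G)
```

In PSGD the Q factors are themselves adapted online from gradient statistics; with identity factors, as above, the update degenerates to plain SGD.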

language-model-pretraining machine-learning-engineering neural-network-training deep-learning-research natural-language-processing
Stale (6 months) · No package · No dependents
Maintenance: 2/25
Adoption: 6/25
Maturity: 16/25
Community: 5/25


Stars: 18
Forks: 1
Language: Python
License: MIT
Category: llm-fine-tuning
Last pushed: Jul 24, 2025
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/llm-tools/evanatyourservice/llm-jax"

Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
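For callers who prefer Python over curl, the same endpoint can be reached with the standard library. This is a sketch assuming only the URL shown above; the response's JSON shape is not documented here, so the actual fetch is left commented out (it needs network access and counts against the daily quota).

```python
import urllib.request
import json

BASE = "https://pt-edge.onrender.com/api/v1/quality/llm-tools"

def quality_url(owner, repo):
    # Build the per-repo quality endpoint shown in the curl example above.
    return f"{BASE}/{owner}/{repo}"

url = quality_url("evanatyourservice", "llm-jax")

# Uncomment to fetch (anonymous tier: 100 requests/day):
# with urllib.request.urlopen(url) as resp:
#     data = json.load(resp)
```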