JonasGeiping/cramming

Cramming the training of a (BERT-type) language model into limited compute.

Score: 44/100 (Emerging)

This project helps machine learning researchers and practitioners train a BERT-type language model from scratch using minimal computational resources. You input raw text data and configuration settings for the training pipeline, and it outputs a fully trained language model ready for downstream tasks. It's designed for individuals and small teams who want to experiment with language model pretraining without needing access to large-scale GPU clusters.

1,363 stars. No commits in the last 6 months.

Use this if you need to train a custom BERT-style language model on a single consumer GPU within approximately one day.

Not ideal if you have access to extensive computational resources and are focused on achieving state-of-the-art performance by scaling up model size and training time.
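For a quick sense of the output, the authors publish a pretrained checkpoint on the Hugging Face Hub. The snippet below is a minimal sketch, assuming the hub id JonasGeiping/crammed-bert and that importing the cramming package registers the custom architecture with transformers; check the repo README for the exact loading instructions.

# Minimal sketch: load the published crammed checkpoint for downstream use.
# Assumes the Hub id "JonasGeiping/crammed-bert" and that `import cramming`
# registers the custom architecture with transformers; see the repo README.
import cramming  # noqa: F401
from transformers import AutoModelForMaskedLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("JonasGeiping/crammed-bert")
model = AutoModelForMaskedLM.from_pretrained("JonasGeiping/crammed-bert")

inputs = tokenizer("Cramming packs BERT pretraining into a single GPU-day.", return_tensors="pt")
outputs = model(**inputs)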

Tags: natural-language-processing · machine-learning-research · language-model-training · resource-constrained-ml · deep-learning-experimentation
Status: Stale (6 months) · No Package · No Dependents
Score breakdown:
Maintenance 0/25
Adoption 10/25
Maturity 16/25
Community 18/25


Stars: 1,363
Forks: 103
Language: Python
License: MIT
Last pushed: Jun 13, 2024
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/nlp/JonasGeiping/cramming"

Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
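To consume this from code rather than curl, a minimal Python sketch is below. It assumes only the endpoint shown above and the keyless tier; the response schema is not documented here, so it just prints the raw JSON.

# Minimal sketch: fetch this repo's quality report from the keyless tier.
# Uses the endpoint shown above; the response schema is not documented
# here, so the raw JSON is printed as-is.
import requests

url = "https://pt-edge.onrender.com/api/v1/quality/nlp/JonasGeiping/cramming"
response = requests.get(url, timeout=10)
response.raise_for_status()
print(response.json())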