JonasGeiping/cramming
Cramming the training of a (BERT-type) language model into limited compute.
This project helps machine learning researchers and practitioners train a BERT-type language model from scratch using minimal computational resources. You input raw text data and configuration settings for the training pipeline, and it outputs a fully trained language model ready for downstream tasks. It's designed for individuals and small teams who want to experiment with language model pretraining without needing access to large-scale GPU clusters.
1,363 stars. No commits in the last 6 months.
Use this if you need to train a custom BERT-style language model on a single consumer GPU within approximately one day.
Not ideal if you have access to extensive computational resources and are focused on achieving state-of-the-art performance by scaling up model size and training time.
Stars
1,363
Forks
103
Language
Python
License
MIT
Category
Last pushed
Jun 13, 2024
Commits (30d)
0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/nlp/JonasGeiping/cramming"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
Higher-rated alternatives
facebookresearch/stopes
A library for preparing data for machine translation research (monolingual preprocessing,...
Droidtown/ArticutAPI
API of Articut Chinese word segmentation (with semantic part-of-speech tagging): word segmentation, also called tokenization, is the foundation of Chinese text processing. Articut uses no machine learning and no data models; with only the grammar rules of modern vernacular Chinese, it achieves...
rkcosmos/deepcut
A Thai word tokenization library using Deep Neural Network
fukuball/jieba-php
"Jieba" (Chinese for "to stutter") Chinese word segmentation: aiming to be the best PHP Chinese word segmentation and tokenization component. / "Jieba" (Chinese for "to stutter") Chinese text segmentation:...
pytorch/text
Models, data loaders and abstractions for language processing, powered by PyTorch