JonasGeiping/cramming
Cramming the training of a (BERT-type) language model into limited compute.
This project helps machine learning researchers and practitioners train a BERT-type language model from scratch using minimal computational resources. You input raw text data and configuration settings for the training pipeline, and it outputs a fully trained language model ready for downstream tasks. It's designed for individuals and small teams who want to experiment with language model pretraining without needing access to large-scale GPU clusters.
1,363 stars. No commits in the last 6 months.
Use this if you need to train a custom BERT-style language model on a single consumer GPU within approximately one day.
Not ideal if you have access to extensive computational resources and are focused on achieving state-of-the-art performance by scaling up model size and training time.
Stars
1,363
Forks
103
Language
Python
License
MIT
Category
Last pushed
Jun 13, 2024
Commits (30d)
0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/nlp/JonasGeiping/cramming"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
Higher-rated alternatives
facebookresearch/stopes
A library for preparing data for machine translation research (monolingual preprocessing,...
Droidtown/ArticutAPI
API of Articut Chinese word segmentation (with semantic part-of-speech tagging): word segmentation, also called tokenization, is the foundation of Chinese text processing. Articut uses no machine learning and no data models; with only the grammar rules of modern vernacular Chinese, it achieves...
rkcosmos/deepcut
A Thai word tokenization library using Deep Neural Network
fukuball/jieba-php
"Jieba" (Chinese for "to stutter") Chinese word segmentation: aiming to be the best PHP Chinese word segmentation and tokenization component. / "Jieba" (Chinese for "to stutter") Chinese text segmentation:...
pytorch/text
Models, data loaders and abstractions for language processing, powered by PyTorch