guotong1988/BERT-pre-training

Multi-GPU pre-training of BERT on one machine, without Horovod (data parallelism)

Score: 54 / 100 (Established)

This project helps machine learning engineers pre-train large language models like BERT more efficiently on a single machine. By leveraging multiple GPUs, it allows significantly larger effective batch sizes, which can accelerate training. It is aimed at researchers and ML engineers who need to pre-train or adapt BERT models for specific language understanding tasks.
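
The repository's own implementation is not reproduced here. As a hedged sketch of the general pattern it describes (replicating the model across the GPUs of one machine so each step processes a larger global batch, with no Horovod involved), the example below uses tf.distribute.MirroredStrategy with a placeholder model and random data standing in for a real BERT setup.

```python
import numpy as np
import tensorflow as tf

# Illustrative only: single-machine multi-GPU data parallelism via
# tf.distribute.MirroredStrategy. This is NOT the repository's code; the
# model and data below are placeholders for a real BERT pre-training setup.
strategy = tf.distribute.MirroredStrategy()
print("Replicas in sync:", strategy.num_replicas_in_sync)

with strategy.scope():
    # Any model built inside the scope is mirrored onto every local GPU.
    model = tf.keras.Sequential([
        tf.keras.layers.Dense(128, activation="relu"),
        tf.keras.layers.Dense(2),
    ])
    model.compile(
        optimizer="adam",
        loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
    )

# The global batch is split evenly across replicas, so the per-step batch
# size (and throughput) can grow with the number of GPUs on the machine.
x = np.random.rand(1024, 16).astype("float32")
y = np.random.randint(0, 2, size=(1024,)).astype("int32")
model.fit(x, y, batch_size=256, epochs=1)
```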

Use this if you are an ML engineer with a single powerful server equipped with multiple GPUs and need to pre-train BERT-like models faster for natural language processing applications.

Not ideal if you are looking to distribute pre-training across multiple machines or do not have access to multiple GPUs on a single server.

natural-language-processing large-language-models deep-learning-training computational-linguistics machine-learning-engineering
No package · No dependents
Maintenance: 6 / 25
Adoption: 10 / 25
Maturity: 16 / 25
Community: 22 / 25

Stars: 171
Forks: 53
Language: Python
License: Apache-2.0
Last pushed: Dec 27, 2025
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/nlp/guotong1988/BERT-pre-training"

Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
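
The same endpoint can also be queried from a script. Below is a minimal Python sketch, assuming the endpoint returns JSON; the response schema is not documented on this page, so inspect the payload to see the available fields.

```python
import requests

# Same endpoint as the curl command above; no API key is needed for the
# free tier (100 requests/day).
URL = "https://pt-edge.onrender.com/api/v1/quality/nlp/guotong1988/BERT-pre-training"

resp = requests.get(URL, timeout=10)
resp.raise_for_status()
print(resp.json())  # assumed JSON payload; field names are not specified here
```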