helpmefindaname/transformer-smaller-training-vocab
Temporarily remove unused tokens during training to save RAM and speed up training.
This tool helps machine learning engineers and researchers who are fine-tuning large language models to save memory and speed up training. It takes your pre-trained transformer model and training dataset and temporarily reduces the model's vocabulary to include only the tokens present in your data. This results in faster training and lower GPU memory usage, while still letting you save the full model afterward.
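A minimal sketch of the intended workflow, assuming the package is imported as transformer_smaller_training_vocab and exposes a reduce_train_vocab context manager that takes the model, tokenizer, and training texts (the function name and signature are assumptions based on this description, not verified against the package's API; model and dataset names are placeholders):

from datasets import load_dataset
from transformers import AutoModelForSequenceClassification, AutoTokenizer
from transformer_smaller_training_vocab import reduce_train_vocab  # assumed API

model = AutoModelForSequenceClassification.from_pretrained("roberta-base")
tokenizer = AutoTokenizer.from_pretrained("roberta-base")
texts = load_dataset("imdb", split="train")["text"]  # placeholder dataset

# Temporarily shrink the embedding matrix and tokenizer to the tokens occurring in `texts`.
with reduce_train_vocab(model=model, tokenizer=tokenizer, texts=texts):
    ...  # run your usual fine-tuning loop (e.g. transformers.Trainer) on the reduced model

# Once the context manager exits, the full vocabulary is restored,
# so the saved checkpoint is a complete model again.
model.save_pretrained("finetuned-model")
tokenizer.save_pretrained("finetuned-model")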
Used by 1 other package. No commits in the last 6 months. Available on PyPI.
Use this if you are training a transformer model and notice that many tokens in the model's full vocabulary are not actually used in your specific training data, causing unnecessary memory consumption and slower training.
Not ideal if you rely on 'slow' tokenizers: support for them is limited to XLMRobertaTokenizer, RobertaTokenizer, and BertTokenizer.
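To check whether this situation applies to you, count how many distinct token ids your corpus actually produces and compare that to the full vocabulary size. A minimal sketch using the Hugging Face transformers and datasets APIs (model name and dataset are placeholders):

from datasets import load_dataset
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("xlm-roberta-base")  # placeholder model
dataset = load_dataset("imdb", split="train")                  # placeholder dataset

used_ids = set()
for text in dataset["text"]:
    used_ids.update(tokenizer(text)["input_ids"])

full_size = len(tokenizer)  # full vocabulary, including added special tokens
print(f"{len(used_ids)} of {full_size} tokens used ({100 * len(used_ids) / full_size:.1f}%)")

If coverage is low, the embedding rows for the unused tokens contribute nothing to your task but still occupy GPU memory throughout training.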
Stars: 23
Forks: 4
Language: Python
License: MIT
Category:
Last pushed: Jun 15, 2025
Commits (30d): 0
Dependencies: 2
Reverse dependents: 1
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/transformers/helpmefindaname/transformer-smaller-training-vocab"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
Higher-rated alternatives
lucidrains/x-transformers
A concise but complete full-attention transformer with a set of promising experimental features...
kanishkamisra/minicons
Utility for behavioral and representational analyses of Language Models
lucidrains/simple-hierarchical-transformer
Experiments around a simple idea for inducing multiple hierarchical predictive models within a GPT
lucidrains/dreamer4
Implementation of Danijar's latest iteration for his Dreamer line of work
Nicolepcx/Transformers-in-Action
This is the corresponding code for the book Transformers in Action