dashstander/block-recurrent-transformer

Pytorch implementation of "Block Recurrent Transformers" (Hutchins & Schlag et al., 2022)

/ 100

Emerging

This is a tool for machine learning researchers and practitioners who are experimenting with advanced neural network architectures. It helps you build and train models that can process sequences of information more efficiently than traditional methods, particularly when dealing with very long inputs. You provide your training data, and it helps you produce a trained recurrent transformer model.

No commits in the last 6 months.

Use this if you are developing or researching new deep learning models for sequence processing and want to explore the Block Recurrent Transformer architecture.

Not ideal if you need an out-of-the-box solution for a specific application without delving into model architecture or training details.

deep-learning-research neural-networks sequence-modeling pytorch-development model-architecture

Stale 6m No Package No Dependents

Maintenance 0 / 25

Adoption 9 / 25

Maturity 16 / 25

Community 16 / 25

How are scores calculated?

Stars

Forks

Language

Python

License

MIT

Higher-rated alternatives

huggingface/transformers

🤗 Transformers: the model-definition framework for state-of-the-art machine learning models in...

kyegomez/LongNet

Implementation of plug in and play Attention from "LongNet: Scaling Transformers to 1,000,000,000 Tokens"

pbloem/former

Simple transformer implementation from scratch in pytorch. (archival, latest version on codeberg)

NVIDIA/FasterTransformer

Transformer related optimization, including BERT, GPT

kyegomez/SimplifiedTransformers

SimplifiedTransformer simplifies transformer block without affecting training. Skip connections,...

Explore Transformer Models

All categories Trending Transformer directory Insights