HySonLab/HierAttention
Scalable Hierarchical Self-Attention with Learnable Hierarchy for Long-Range Interactions
This project is aimed at machine learning researchers building models that process very long sequences, such as extensive text documents or biological sequences. It provides a hierarchical self-attention mechanism with a learnable hierarchy that efficiently captures relationships between distant parts of a sequence, sidestepping the quadratic cost of standard attention. The input is long sequential data; the output is an attention model that is both more computationally efficient and more performant than standard attention.
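For intuition, here is a minimal, illustrative sketch of one common hierarchical-attention pattern: each token attends to a local window plus mean-pooled summary tokens, so no token has to attend to every other token directly. This is a generic illustration of the idea, not HierAttention's actual algorithm; the window size, pooling factor, and the single-head, projection-free formulation are all simplifying assumptions.

import torch

def hierarchical_attention(x, window=64, pool=16):
    """Toy hierarchical attention. x: (batch, n, d); assumes n divisible by window and pool."""
    b, n, d = x.shape
    # Coarse level: mean-pool every `pool` consecutive tokens into one summary token.
    summaries = x.view(b, n // pool, pool, d).mean(dim=2)   # (b, n/pool, d)
    out = torch.empty_like(x)
    for start in range(0, n, window):
        q = x[:, start:start + window]                      # queries for one local window
        # Keys/values: the same local window plus all coarse summaries,
        # giving every token a path to distant context at reduced cost.
        kv = torch.cat([q, summaries], dim=1)
        attn = torch.softmax(q @ kv.transpose(1, 2) / d ** 0.5, dim=-1)
        out[:, start:start + window] = attn @ kv
    return out

x = torch.randn(2, 1024, 32)                                # batch of long sequences
print(hierarchical_attention(x).shape)                      # torch.Size([2, 1024, 32])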
No commits in the last 6 months.
Use this if you are a machine learning researcher working on models for very long sequences and need a more efficient way to capture long-range dependencies.
Not ideal if you are not a machine learning researcher, or if your primary interest is applying existing, off-the-shelf models to short sequences.
Stars: 8
Forks: 1
Language: Python
License: —
Category: transformers
Last pushed: Apr 24, 2024
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/transformers/HySonLab/HierAttention"
Open to everyone: 100 requests/day with no key needed. Get a free key for 1,000 requests/day.
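For programmatic access, a minimal Python sketch using the requests library; the response schema is not documented here, so the example simply pretty-prints whatever JSON comes back, and no key is needed for the free tier:

import json
import requests

# Public endpoint from the curl example above; the free tier needs no API key.
url = "https://pt-edge.onrender.com/api/v1/quality/transformers/HySonLab/HierAttention"
resp = requests.get(url, timeout=10)
resp.raise_for_status()  # fail loudly on HTTP errors or rate limiting
print(json.dumps(resp.json(), indent=2))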
Higher-rated alternatives
lucidrains/x-transformers
A concise but complete full-attention transformer with a set of promising experimental features...
kanishkamisra/minicons
Utility for behavioral and representational analyses of Language Models
lucidrains/simple-hierarchical-transformer
Experiments around a simple idea for inducing multiple hierarchical predictive models within a GPT
lucidrains/dreamer4
Implementation of Danijar's latest iteration for his Dreamer line of work
Nicolepcx/Transformers-in-Action
This is the corresponding code for the book Transformers in Action