davidsvy/cosformer-pytorch

Unofficial PyTorch implementation of the paper "cosFormer: Rethinking Softmax In Attention".

/ 100

Emerging

This project lets machine learning researchers and practitioners experiment with cosFormer attention, a linear-attention replacement for the softmax attention used in standard transformers. It processes text or other sequential input at a cost that grows linearly, rather than quadratically, with sequence length. It is aimed at users building custom deep learning models who need to balance performance against computational resources.
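The linear-attention idea described above can be illustrated with a small sketch. It follows the formulation in the cosFormer paper (a ReLU feature map plus a cosine re-weighting of relative positions) and is not code from this repository. The function names, the `EPS` constant, and the choice to set the re-weighting length equal to the sequence length are assumptions made for illustration. It is single-head and non-causal, and it uses plain Python lists rather than PyTorch tensors so that it runs without dependencies.

```python
import math
import random

EPS = 1e-6  # assumed small constant guarding against an all-zero denominator


def relu(rows):
    """Element-wise ReLU, which keeps every similarity score non-negative."""
    return [[max(x, 0.0) for x in row] for row in rows]


def cosformer_quadratic(q, k, v):
    """Reference form: builds every pairwise weight, O(N^2) in sequence length."""
    n = len(q)
    q, k = relu(q), relu(k)
    out = []
    for i in range(n):
        # Dot-product similarity damped by a cosine of the relative distance.
        w = [
            sum(a * b for a, b in zip(q[i], k[j]))
            * math.cos(math.pi / 2 * (i - j) / n)
            for j in range(n)
        ]
        denom = sum(w) + EPS
        out.append([sum(w[j] * v[j][c] for j in range(n)) / denom
                    for c in range(len(v[0]))])
    return out


def cosformer_linear(q, k, v):
    """Same result in O(N): cos(x - y) = cos(x)cos(y) + sin(x)sin(y) lets the
    key/value sums be accumulated once and reused by every query."""
    n, d, e = len(q), len(q[0]), len(v[0])
    q, k = relu(q), relu(k)
    kv_cos = [[0.0] * e for _ in range(d)]  # sum_j cos_j * k_j^T v_j
    kv_sin = [[0.0] * e for _ in range(d)]  # sum_j sin_j * k_j^T v_j
    k_cos = [0.0] * d                       # sum_j cos_j * k_j
    k_sin = [0.0] * d                       # sum_j sin_j * k_j
    for j in range(n):
        angle = math.pi * j / (2 * n)
        c, s = math.cos(angle), math.sin(angle)
        for a in range(d):
            kc, ks = k[j][a] * c, k[j][a] * s
            k_cos[a] += kc
            k_sin[a] += ks
            for b in range(e):
                kv_cos[a][b] += kc * v[j][b]
                kv_sin[a][b] += ks * v[j][b]
    out = []
    for i in range(n):
        angle = math.pi * i / (2 * n)
        qc = [x * math.cos(angle) for x in q[i]]
        qs = [x * math.sin(angle) for x in q[i]]
        denom = sum(qc[a] * k_cos[a] + qs[a] * k_sin[a] for a in range(d)) + EPS
        out.append([
            sum(qc[a] * kv_cos[a][b] + qs[a] * kv_sin[a][b] for a in range(d)) / denom
            for b in range(e)
        ])
    return out


if __name__ == "__main__":
    random.seed(0)
    n, d = 8, 4
    q, k, v = ([[random.gauss(0, 1) for _ in range(d)] for _ in range(n)]
               for _ in range(3))
    slow, fast = cosformer_quadratic(q, k, v), cosformer_linear(q, k, v)
    print(max(abs(x - y) for rs, rf in zip(slow, fast) for x, y in zip(rs, rf)))
```

The linear version never forms the N x N weight matrix. Its running sums have a size that depends only on the feature dimensions, which is why the cost grows linearly with sequence length.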

No commits in the last 6 months.

Use this if you are a machine learning researcher or engineer building transformer-based models and need to reduce the computational cost of the attention mechanism, especially with longer sequences.

Not ideal if you are looking for a pre-trained model or a high-level API for general natural language processing tasks without needing to customize the attention architecture.

deep-learning-research natural-language-processing sequence-modeling model-optimization attention-mechanisms

Stale 6m, No Package, No Dependents

Maintenance 0 / 25

Adoption 8 / 25

Maturity 16 / 25

Community 16 / 25

Language: Jupyter Notebook

License: MIT

Higher-rated alternatives

philipperemy/keras-attention

Keras Attention Layer (Luong and Bahdanau scores).

tatp22/linformer-pytorch

My take on a practical implementation of Linformer for Pytorch.

ematvey/hierarchical-attention-networks

Document classification with Hierarchical Attention Networks in TensorFlow. WARNING: project is...

datalogue/keras-attention

Visualizing RNNs using the attention mechanism

thushv89/attention_keras

Keras Layer implementation of Attention for Sequential models
