davidsvy/cosformer-pytorch
Unofficial PyTorch implementation of the paper "cosFormer: Rethinking Softmax In Attention".
This project lets machine learning researchers and practitioners experiment with the cosFormer attention mechanism in transformer models. It processes text or other sequential input with a 'linear attention' method whose cost grows linearly with sequence length, rather than quadratically as in standard softmax attention. It is aimed at users building custom deep learning models who need to balance performance against computational resources.
No commits in the last 6 months.
Use this if you are a machine learning researcher or engineer building transformer-based models and need to reduce the computational cost of the attention mechanism, especially with longer sequences.
Not ideal if you are looking for a pre-trained model or a high-level API for general natural language processing tasks without needing to customize the attention architecture.
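To make the 'linear attention' idea concrete, here is a minimal NumPy sketch of non-causal cosFormer-style attention for a single head, based on the formulation in the paper rather than on this repository's PyTorch code. The function names, the `eps` stabilizer, and the use of the sequence length as the cos re-weighting scale are assumptions made for illustration. The second function computes the same result the quadratic way, so the two can be compared.

```python
import numpy as np


def cosformer_attention(q, k, v, eps=1e-6):
    """Linear-cost cosFormer-style attention (sketch, single head, non-causal).

    q, k: arrays of shape (n, d); v: array of shape (n, d_v).
    """
    n = q.shape[0]
    # ReLU feature map replaces softmax, keeping all scores non-negative.
    q, k = np.maximum(q, 0.0), np.maximum(k, 0.0)
    # Per-position angle pi*i/(2n); cos(a_i - a_j) splits into
    # cos(a_i)cos(a_j) + sin(a_i)sin(a_j), which makes the re-weighting separable.
    angle = (np.pi / 2) * np.arange(1, n + 1) / n
    cos, sin = np.cos(angle)[:, None], np.sin(angle)[:, None]
    q_cos, q_sin = q * cos, q * sin
    k_cos, k_sin = k * cos, k * sin
    # Multiply keys with values first: (d, d_v) summaries, no (n, n) matrix.
    numerator = q_cos @ (k_cos.T @ v) + q_sin @ (k_sin.T @ v)
    # Row-wise normalizer, equal to the sum of each row of the score matrix.
    denominator = q_cos @ k_cos.sum(axis=0) + q_sin @ k_sin.sum(axis=0)
    return numerator / (denominator[:, None] + eps)


def quadratic_reference(q, k, v, eps=1e-6):
    """Same computation done by building the full (n, n) score matrix."""
    n = q.shape[0]
    q, k = np.maximum(q, 0.0), np.maximum(k, 0.0)
    idx = np.arange(1, n + 1)
    weights = np.cos((np.pi / 2) * (idx[:, None] - idx[None, :]) / n)
    scores = (q @ k.T) * weights
    return (scores @ v) / (scores.sum(axis=1, keepdims=True) + eps)
```

Both functions should agree up to floating-point error; the first never materializes the n-by-n score matrix, which is where the savings on long sequences come from.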
Stars: 44
Forks: 8
Language: Jupyter Notebook
License: MIT
Category:
Last pushed: Oct 29, 2021
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/ml-frameworks/davidsvy/cosformer-pytorch"
Open to everyone: 100 requests/day, no key needed. A free key raises the limit to 1,000/day.
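The same request can be made from Python using only the standard library. This is a sketch: the helper names `quality_url` and `fetch_quality` are hypothetical, and it assumes the endpoint returns a JSON body, which the listing does not state.

```python
import json
import urllib.request

# Base path taken from the curl example above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(category: str, repo: str) -> str:
    """Build the endpoint URL for a repo given as 'owner/name'."""
    return f"{BASE_URL}/{category}/{repo}"


def fetch_quality(category: str, repo: str, timeout: float = 10.0):
    """Fetch the quality record; assumes the response body is JSON."""
    with urllib.request.urlopen(quality_url(category, repo), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


# Example call (performs a network request, counted against the daily limit):
# data = fetch_quality("ml-frameworks", "davidsvy/cosformer-pytorch")
```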
Higher-rated alternatives
philipperemy/keras-attention
Keras Attention Layer (Luong and Bahdanau scores).
tatp22/linformer-pytorch
My take on a practical implementation of Linformer for Pytorch.
ematvey/hierarchical-attention-networks
Document classification with Hierarchical Attention Networks in TensorFlow. WARNING: project is...
datalogue/keras-attention
Visualizing RNNs using the attention mechanism
thushv89/attention_keras
Keras Layer implementation of Attention for Sequential models