HKUNLP/efficient-attention
[EVA ICLR'23; LARA ICML'22] Efficient attention mechanisms via control variates, random features, and importance sampling
This project provides efficient attention mechanisms (LARA and EVA) that approximate standard softmax attention, making image classification and machine translation models faster and cheaper to run. The modules are evaluated on standard benchmarks such as ImageNet classification and machine translation corpora, and slot into existing transformer architectures. It is aimed at machine learning engineers and researchers building and training vision and language models.
No commits in the last 6 months.
Use this if you are a machine learning engineer working with vision transformers or language models and need to improve the efficiency and speed of your attention mechanisms during model training and inference.
Not ideal if you are an end-user without a strong background in deep learning model development or if you need a plug-and-play solution for basic image or text processing without delving into model architecture.
Stars: 87
Forks: 4
Language: Python
License: Apache-2.0
Category:
Last pushed: Mar 07, 2023
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/transformers/HKUNLP/efficient-attention"
Open to everyone: 100 requests/day with no key required. Get a free key for 1,000 requests/day.
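For scripted access, the curl call above can be wrapped in a few lines of Python. This is a minimal sketch using only the standard library; the endpoint URL is taken from this page, but the JSON schema of the response is not documented here, so the helper simply returns the decoded payload as a dict.

```python
import json
import urllib.request

# Base endpoint as shown on this page; the response schema is an assumption
# (the API returns JSON, but its fields are not documented here).
API_BASE = "https://pt-edge.onrender.com/api/v1/quality/transformers"


def quality_url(owner: str, repo: str) -> str:
    """Build the quality-data URL for a given GitHub owner/repo pair."""
    return f"{API_BASE}/{owner}/{repo}"


def fetch_quality(owner: str, repo: str) -> dict:
    """Fetch and decode the JSON payload (no key: 100 requests/day)."""
    with urllib.request.urlopen(quality_url(owner, repo)) as resp:
        return json.load(resp)


if __name__ == "__main__":
    print(quality_url("HKUNLP", "efficient-attention"))
```

With a free API key, the same request presumably carries the key as a header or query parameter; check the API's own documentation for the exact mechanism before relying on the higher 1,000/day limit.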
Higher-rated alternatives
microsoft/LoRA
Code for loralib, an implementation of "LoRA: Low-Rank Adaptation of Large Language Models"
jadore801120/attention-is-all-you-need-pytorch
A PyTorch implementation of the Transformer model in "Attention is All You Need".
bhavnicksm/vanilla-transformer-jax
JAX/Flax implementation of 'Attention Is All You Need' by Vaswani et al....
kyegomez/SparseAttention
Pytorch Implementation of the sparse attention from the paper: "Generating Long Sequences with...
AbdelStark/attnres
Rust implementation of Attention Residuals from MoonshotAI/Kimi