Ther-nullptr/Awesome-Transformer-Accleration

Paper list for acceleration of transformers

Quality score: 23 / 100 (Experimental)

This is a curated list of research papers and implementations focused on making large language models (LLMs) and other transformer-based AI models run faster and more efficiently. It compiles methods like quantization and pruning, which reduce the computational resources needed, along with system and hardware improvements. The ideal user is an AI/ML engineer or researcher working on deploying or optimizing these complex models for practical applications.
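To give a flavor of what these techniques involve, here is a minimal sketch of symmetric int8 weight quantization in NumPy. It is an illustration only, not taken from any paper in the list, and the function names are made up.

import numpy as np

def quantize_int8(w):
    # Map the largest weight magnitude to 127, the int8 extreme.
    scale = np.abs(w).max() / 127.0
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    # Approximate reconstruction of the original float weights.
    return q.astype(np.float32) * scale

w = np.random.randn(4, 4).astype(np.float32)
q, scale = quantize_int8(w)
print(np.abs(w - dequantize(q, scale)).max())  # worst-case quantization error

Published methods refine this basic idea with per-channel scales, calibration data, and outlier handling.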

No commits in the last 6 months.

Use this if you are a machine learning engineer or researcher looking for state-of-the-art techniques to optimize the speed, memory footprint, or overall efficiency of transformer models such as Vision Transformers, BERT, and GPT for real-world deployment.

Not ideal if you are looking for an off-the-shelf, ready-to-use software library to simply apply acceleration without diving into academic research or technical implementations.

Tags: AI model optimization, deep learning efficiency, natural language processing, computer vision, transformer architecture
Flags: No License, Stale (6 months), No Package, No Dependents
Maintenance: 0 / 25
Adoption: 5 / 25
Maturity: 8 / 25
Community: 10 / 25


Stars: 14
Forks: 2
Language: (not specified)
License: none
Last pushed: Jul 01, 2023
Commits (last 30 days): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/ml-frameworks/Ther-nullptr/Awesome-Transformer-Accleration"

Open to everyone: 100 requests/day with no key needed. Get a free key for 1,000 requests/day.
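A minimal sketch of the same request from Python, assuming the endpoint returns JSON; the response schema is not documented here, so the payload is printed raw for inspection.

import json
import urllib.request

# Same endpoint as the curl example above.
url = ("https://pt-edge.onrender.com/api/v1/quality/"
       "ml-frameworks/Ther-nullptr/Awesome-Transformer-Accleration")
with urllib.request.urlopen(url) as resp:
    data = json.load(resp)
print(json.dumps(data, indent=2))  # inspect the fields before relying on them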