NVIDIA/FasterTransformer

Transformer related optimization, including BERT, GPT

Score: 48 / 100 (Emerging)

This tool helps AI developers and researchers speed up inference for large language models such as BERT, GPT, and other transformer-based networks. It provides highly optimized implementations of transformer layers for NVIDIA GPUs, delivering significantly faster inference for applications such as natural language processing and computer vision. The primary users are engineers building and deploying AI models who need to maximize computational efficiency.

6,398 stars. No commits in the last 6 months.

Use this if you are a developer or researcher working with transformer-based AI models and need to accelerate their inference performance on NVIDIA GPUs.

Not ideal if you are not deploying on NVIDIA GPUs, or if you want a drop-in solution that does not require integrating directly with your AI framework.

AI model deployment · NLP inference · Deep learning optimization · Machine learning engineering · GPU acceleration
Stale (6m) · No Package · No Dependents
Maintenance 0 / 25
Adoption 10 / 25
Maturity 16 / 25
Community 22 / 25
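
As a quick sanity check, the overall score matches the sum of the four category subscores. A minimal sketch (note: that the total is a plain sum is an assumption inferred from the numbers shown, not documented on this page):

```python
# The four category subscores listed above (each out of 25).
subscores = {"Maintenance": 0, "Adoption": 10, "Maturity": 16, "Community": 22}

# Assumption: the overall 48/100 score is the plain sum of the subscores.
total = sum(subscores.values())
print(total)  # → 48, matching the overall score above
```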


Stars: 6,398
Forks: 930
Language: C++
License: Apache-2.0
Last pushed: Mar 27, 2024
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/transformers/NVIDIA/FasterTransformer"

Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
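
The same endpoint can be consumed programmatically. A minimal Python sketch, assuming the endpoint returns JSON (the response schema is not documented here, so inspect the parsed result to see the actual field names):

```python
import json
import urllib.request

# Endpoint from the curl command above; no API key needed
# for up to 100 requests/day.
URL = ("https://pt-edge.onrender.com/api/v1/quality/"
       "transformers/NVIDIA/FasterTransformer")


def fetch_quality(url: str = URL) -> dict:
    """Fetch and parse the quality report for the repository."""
    with urllib.request.urlopen(url, timeout=10) as resp:
        return json.load(resp)


# Usage (performs a live network request):
#   report = fetch_quality()
#   print(json.dumps(report, indent=2))
```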