NVIDIA/FasterTransformer

Transformer related optimization, including BERT, GPT

Score: 48 / 100 (Emerging)

This tool helps AI developers and researchers speed up inference for large language models such as BERT, GPT, and other transformer-based networks. It provides highly optimized implementations of transformer layers for NVIDIA GPUs, delivering significantly faster inference for applications such as natural language processing and computer vision. The primary users are engineers building and deploying AI models who need to maximize computational efficiency.

6,398 stars. No commits in the last 6 months.

Use this if you are a developer or researcher working with transformer-based AI models and need to accelerate their inference performance on NVIDIA GPUs.

Not ideal if you are not deploying on NVIDIA GPUs, or if you want a drop-in solution that does not require integrating directly with your AI framework.

AI model deployment · NLP inference · Deep learning optimization · Machine learning engineering · GPU acceleration
Stale (6m) · No Package · No Dependents
Maintenance 0 / 25
Adoption 10 / 25
Maturity 16 / 25
Community 22 / 25
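
As a quick sanity check, the overall score matches the sum of the four category subscores. A minimal sketch (note: that the total is a plain sum is an assumption inferred from the numbers shown, not documented on this page):

```python
# The four category subscores listed above (each out of 25).
subscores = {"Maintenance": 0, "Adoption": 10, "Maturity": 16, "Community": 22}

# Assumption: the overall 48/100 score is the plain sum of the subscores.
total = sum(subscores.values())
print(total)  # → 48, matching the overall score above
```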


Stars: 6,398
Forks: 930
Language: C++
License: Apache-2.0
Last pushed: Mar 27, 2024
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/transformers/NVIDIA/FasterTransformer"

Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
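
The same endpoint can be consumed programmatically. A minimal Python sketch, assuming the endpoint returns JSON (the response schema is not documented here, so inspect the parsed result to see the actual field names):

```python
import json
import urllib.request

# Endpoint from the curl command above; no API key needed
# for up to 100 requests/day.
URL = ("https://pt-edge.onrender.com/api/v1/quality/"
       "transformers/NVIDIA/FasterTransformer")


def fetch_quality(url: str = URL) -> dict:
    """Fetch and parse the quality report for the repository."""
    with urllib.request.urlopen(url, timeout=10) as resp:
        return json.load(resp)


# Usage (performs a live network request):
#   report = fetch_quality()
#   print(json.dumps(report, indent=2))
```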