NVIDIA/TransformerEngine

A library for accelerating Transformer models on NVIDIA GPUs, including support for 8-bit and 4-bit floating-point (FP8 and FP4) precision on Hopper, Ada, and Blackwell GPUs, providing better performance with lower memory utilization in both training and inference.

Score: 73 / 100 (Verified)

This library helps AI engineers and researchers build and train large language models (LLMs) more efficiently. Applied to existing Transformer model code running on NVIDIA GPUs, it speeds up both training and inference while reducing memory use. This is especially useful when working with massive datasets and complex AI models.

3,206 stars. Actively maintained with 57 commits in the last 30 days.

Use this if you are developing or training large Transformer-based AI models and want to significantly speed up your workflows and reduce memory usage on NVIDIA GPUs.

Not ideal if you are not working with Transformer models or do not have access to NVIDIA GPUs, especially newer generations like Hopper, Ada, or Blackwell.
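As a sketch of what this acceleration looks like in practice, the snippet below uses Transformer Engine's PyTorch integration: a `te.Linear` layer standing in for `torch.nn.Linear`, with the forward pass wrapped in `fp8_autocast` so matrix multiplies run in FP8. This assumes `transformer_engine` is installed and a supported (Hopper, Ada, or Blackwell) GPU is available; the layer sizes and recipe choice are illustrative, not recommendations.

```python
"""Minimal FP8 sketch, assuming transformer_engine and a Hopper+/Ada/Blackwell GPU."""
import torch
import transformer_engine.pytorch as te
from transformer_engine.common.recipe import DelayedScaling, Format

# Drop-in replacement for torch.nn.Linear; parameters stay in higher
# precision, while matmuls inside the autocast region execute in FP8.
model = te.Linear(768, 3072, bias=True).cuda()

# Hybrid E4M3/E5M2 format with delayed scaling, a common FP8 recipe.
recipe = DelayedScaling(fp8_format=Format.HYBRID)

inp = torch.randn(16, 768, device="cuda")
with te.fp8_autocast(enabled=True, fp8_recipe=recipe):
    out = model(inp)

# Backward pass works as usual; gradients flow through the FP8 matmuls.
loss = out.float().sum()
loss.backward()
```

Outside the `fp8_autocast` context the same module runs in the tensors' native precision, so the FP8 path can be enabled or disabled without changing model code.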

large-language-models ai-model-training deep-learning-optimization generative-ai
No package published; no known dependents.
Maintenance: 22 / 25
Adoption: 10 / 25
Maturity: 16 / 25
Community: 25 / 25


Stars: 3,206
Forks: 659
Language: Python
License: Apache-2.0
Last pushed: Mar 12, 2026
Commits (30d): 57

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/ml-frameworks/NVIDIA/TransformerEngine"

Open to everyone: 100 requests/day with no key needed. Get a free key for 1,000 requests/day.
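The same endpoint can be queried from Python with the standard library. Note that the field names below (`score`, `stars`) are assumptions about the JSON schema for illustration; check an actual response for the real structure.

```python
"""Sketch of querying the quality API from Python instead of curl.

Assumption: the endpoint returns JSON; the field names used here
(score, stars) are illustrative, not a documented schema.
"""
import json
from urllib.request import urlopen

URL = ("https://pt-edge.onrender.com/api/v1/quality/"
       "ml-frameworks/NVIDIA/TransformerEngine")


def fetch() -> dict:
    # One request against the free tier (100/day without a key).
    with urlopen(URL) as resp:
        return json.load(resp)


def summarize(payload: dict) -> str:
    # Tolerate missing fields, since the schema here is assumed.
    score = payload.get("score", "n/a")
    stars = payload.get("stars", "n/a")
    return f"score={score} stars={stars}"
```

For example, `summarize(fetch())` would print a one-line summary such as `score=73 stars=3206` if the response carries those fields.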