suvojit-0x55aa/mixed-precision-pytorch
Training with FP16 weights in PyTorch
This project helps machine learning engineers train deep learning models faster and more efficiently on NVIDIA GPUs. Storing weights in half precision (FP16) roughly halves a model's memory footprint and can significantly reduce training time, which benefits anyone building and iterating on large-scale models who needs a faster training workflow.
No commits in the last 6 months.
Use this if you are a machine learning engineer or researcher looking to speed up the training of your PyTorch models on NVIDIA GPUs while reducing memory consumption.
Not ideal if your deep learning models are small, you are not training on an NVIDIA GPU, or you need full single-precision (FP32) numerical accuracy throughout.
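The usual FP16 training recipe keeps an FP32 "master" copy of the weights and scales the loss so small gradients do not underflow in half precision. Below is a minimal sketch of one training step in that style, assuming a CUDA device and a toy model; it illustrates the general technique, not this repo's exact API.

import torch
import torch.nn.functional as F

# Hypothetical toy model, cast to FP16 for forward/backward passes.
model = torch.nn.Linear(512, 10).cuda().half()

# FP32 master copies of the weights; the optimizer updates these.
master_params = [p.detach().clone().float() for p in model.parameters()]
for mp in master_params:
    mp.requires_grad = True
optimizer = torch.optim.SGD(master_params, lr=1e-2)

loss_scale = 128.0  # static loss scale to avoid FP16 gradient underflow

x = torch.randn(32, 512, device="cuda", dtype=torch.float16)
y = torch.randint(0, 10, (32,), device="cuda")

loss = F.cross_entropy(model(x).float(), y)
(loss * loss_scale).backward()                 # gradients computed in FP16

for p, mp in zip(model.parameters(), master_params):
    mp.grad = p.grad.float() / loss_scale      # unscale into FP32 grads
optimizer.step()                               # weight update in FP32
optimizer.zero_grad()
model.zero_grad()

for p, mp in zip(model.parameters(), master_params):
    p.data.copy_(mp.data)                      # copy FP32 weights back to FP16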
Stars: 81
Forks: 19
Language: Python
License: WTFPL
Category: ml-frameworks
Last pushed: Aug 07, 2019
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/ml-frameworks/suvojit-0x55aa/mixed-precision-pytorch"
Open to everyone: 100 requests/day with no key needed. Get a free key for 1,000 requests/day.
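The same data can be fetched programmatically. A minimal Python sketch using only the standard library, assuming the endpoint returns JSON as the curl example implies (the response schema is not documented here):

import json
import urllib.request

# Keyless tier: up to 100 requests/day.
url = ("https://pt-edge.onrender.com/api/v1/quality/"
       "ml-frameworks/suvojit-0x55aa/mixed-precision-pytorch")
with urllib.request.urlopen(url) as resp:
    data = json.load(resp)
print(json.dumps(data, indent=2))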
Higher-rated alternatives
triton-inference-server/server
The Triton Inference Server provides an optimized cloud and edge inferencing solution.
gpu-mode/Triton-Puzzles
Puzzles for learning Triton
hailo-ai/hailo_model_zoo
The Hailo Model Zoo includes pre-trained models and a full building and evaluation environment
open-mmlab/mmdeploy
OpenMMLab Model Deployment Framework
hyperai/tvm-cn
TVM documentation in Simplified Chinese