PaddleJitLab/CUDATutorial

A self-learning tutorail for CUDA High Performance Programing.

/ 100

Established

This tutorial helps GPU programmers enhance the performance of their applications by teaching advanced CUDA programming techniques. It takes you from setting up a development environment to optimizing complex algorithms like matrix multiplication and convolution. GPU programmers, especially those working with high-performance computing or large language models, would find this project useful to make their code run faster and more efficiently.

911 stars.

Use this if you are a programmer looking to improve the speed and efficiency of your GPU-accelerated applications using CUDA, Triton, or optimizing large language model (LLM) inference.

Not ideal if you are looking for a general introduction to programming or are not working with GPU hardware and accelerated computing.

GPU programming High-performance computing CUDA optimization Parallel computing LLM inference

No Package No Dependents

Maintenance 10 / 25

Adoption 10 / 25

Maturity 16 / 25

Community 19 / 25

How are scores calculated?

Stars

911

Forks

Language

JavaScript

License

Apache-2.0

Related frameworks

iree-org/iree

A retargetable MLIR-based machine learning compiler and runtime toolkit.

brucefan1983/GPUMD

Graphics Processing Units Molecular Dynamics

uxlfoundation/oneDAL

oneAPI Data Analytics Library (oneDAL)

rapidsai/cuml

cuML - RAPIDS Machine Learning Library

NVIDIA/cutlass

CUDA Templates and Python DSLs for High-Performance Linear Algebra

Explore ML Frameworks

All categories Trending ML Framework directory Insights