PaddleJitLab/CUDATutorial
A self-learning tutorial for CUDA high-performance programming.
This tutorial teaches advanced CUDA programming techniques that help GPU programmers improve the performance of their applications. It runs from setting up a development environment to optimizing algorithms such as matrix multiplication and convolution. It is most useful to GPU programmers working in high-performance computing or with large language models who want their code to run faster and more efficiently.
Use this if you are a programmer looking to improve the speed and efficiency of GPU-accelerated applications with CUDA or Triton, or to optimize large language model (LLM) inference.
Not ideal if you are looking for a general introduction to programming or are not working with GPU hardware and accelerated computing.
Stars: 911
Forks: 91
Language: JavaScript
License: Apache-2.0
Category:
Last pushed: Jan 14, 2026
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/ml-frameworks/PaddleJitLab/CUDATutorial"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000/day.
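The curl request above can also be issued from a script. The following is a minimal Python sketch using only the standard library. The helper names `build_quality_url` and `fetch_quality` are hypothetical, and the path layout (category, then owner, then repository) is an assumption inferred from the single URL shown above. Because the response format is not documented here, the sketch returns the raw response body as text rather than assuming a schema.

```python
from urllib.parse import quote
from urllib.request import urlopen

# Base path taken from the curl example; the segment layout that follows it
# (category/owner/repo) is an assumption inferred from that one URL.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def build_quality_url(category: str, owner: str, repo: str) -> str:
    """Build the endpoint URL for one repository (hypothetical helper)."""
    parts = (category, owner, repo)
    # Percent-encode each segment so unusual characters cannot break the path.
    return BASE_URL + "/" + "/".join(quote(p, safe="") for p in parts)


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0) -> str:
    """Fetch the endpoint and return the raw body as text.

    The response schema is not specified in this document, so nothing is parsed.
    """
    url = build_quality_url(category, owner, repo)
    with urlopen(url, timeout=timeout) as resp:
        return resp.read().decode("utf-8")


if __name__ == "__main__":
    # Prints the URL only; call fetch_quality(...) to perform the request.
    print(build_quality_url("ml-frameworks", "PaddleJitLab", "CUDATutorial"))
```

With the keyless limit of 100 requests/day, it is worth caching the returned text locally rather than re-fetching on every run.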
Related frameworks
iree-org/iree
A retargetable MLIR-based machine learning compiler and runtime toolkit.
brucefan1983/GPUMD
Graphics Processing Units Molecular Dynamics
uxlfoundation/oneDAL
oneAPI Data Analytics Library (oneDAL)
rapidsai/cuml
cuML - RAPIDS Machine Learning Library
NVIDIA/cutlass
CUDA Templates and Python DSLs for High-Performance Linear Algebra