openmlsys/openmlsys-cuda

Tutorials for writing high-performance GPU operators in AI frameworks.

Score: 32 / 100 (Emerging)

This project provides practical tutorials and examples for engineers aiming to optimize the performance of AI model operations on NVIDIA GPUs. It demonstrates how to write high-performance GPU code, taking basic operator implementations and applying advanced optimization techniques like shared memory usage and pipeline rearrangement. The target audience is AI/ML engineers and researchers who develop and deploy machine learning models and need to accelerate their computational graphs.

134 stars. No commits in the last 6 months.

Use this if you are an AI/ML engineer or researcher working with NVIDIA GPUs and need to understand or implement highly optimized custom operators for your models.

Not ideal if you are a data scientist or user who primarily uses existing AI frameworks and libraries without needing to dive into low-level GPU programming.

GPU-acceleration AI-model-optimization deep-learning-inference machine-learning-engineering CUDA-programming
No License, Stale 6m, No Package, No Dependents
Maintenance 0 / 25
Adoption 10 / 25
Maturity 8 / 25
Community 14 / 25
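
The four category scores listed above add up to the overall score shown at the top of the page (0 + 10 + 8 + 14 = 32). The page does not state how the total is derived, so treating it as a plain sum of the four categories is an assumption; the short sketch below only checks that the listed numbers are consistent with that reading.

```python
# Category scores as listed on this page, each out of 25.
category_scores = {
    "Maintenance": 0,
    "Adoption": 10,
    "Maturity": 8,
    "Community": 14,
}

# Assumption: the overall score is the plain sum of the four categories.
# The page does not document its formula.
MAX_PER_CATEGORY = 25
total = sum(category_scores.values())
max_total = MAX_PER_CATEGORY * len(category_scores)

print(f"{total} / {max_total}")
```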

Stars: 134
Forks: 15
Language: Cuda
License: none
Last pushed: Aug 12, 2023
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/ml-frameworks/openmlsys/openmlsys-cuda"

Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
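
For scripted use, the same request can be made from Python's standard library. This is a minimal sketch, not a documented client: the helper names `build_quality_url` and `fetch_quality` are hypothetical, the URL pattern (category, owner, repository) is inferred from the single curl example above, and the assumption that the endpoint returns JSON is not confirmed by this page.

```python
import json
import urllib.request
from urllib.parse import quote

# Base path taken from the curl example on this page.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def build_quality_url(category: str, owner: str, repo: str) -> str:
    """Build the request URL (hypothetical helper).

    Assumption: the path is <category>/<owner>/<repo>, inferred from the
    one example URL shown on this page.
    """
    parts = (quote(p, safe="") for p in (category, owner, repo))
    return f"{BASE_URL}/" + "/".join(parts)


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0):
    """Fetch the quality record (hypothetical helper).

    Assumption: the response body is JSON; the schema is not documented here,
    so the parsed object is returned as-is.
    """
    url = build_quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


# Example call (performs a network request, counts against the daily limit):
# data = fetch_quality("ml-frameworks", "openmlsys", "openmlsys-cuda")
```

The network call is left commented out so that importing the snippet has no side effects; with the keyless limit of 100 requests/day, caching the result locally is worth considering for repeated runs.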