Cre4T3Tiv3/jetson-orin-matmul-analysis

Scientific CUDA benchmarking framework: 4 implementations × 3 power modes × 5 matrix sizes on the Jetson Orin Nano. 1,282 GFLOPS peak, 90% of peak performance at 88% power (25W mode), 99.5% accuracy validation, and an edge AI deployment guide.

Score: 22/100 (Experimental)

This framework helps embedded systems engineers and AI developers understand the real-world performance of matrix multiplication on the NVIDIA Jetson Orin Nano. It runs four CUDA matrix multiplication implementations across several power modes and matrix sizes, producing a detailed benchmark report and visualizations of performance, power efficiency, and numerical accuracy to help users optimize their edge AI applications.
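As a rough illustration of the arithmetic such a benchmark reports, here is a minimal pure-Python sketch of how GFLOPS is derived from a timed matrix multiply. This is a CPU stand-in, not one of the repo's CUDA kernels, and the function names are illustrative only:

```python
import time

def matmul(a, b, n):
    # Naive triple-loop n x n matrix multiply (reference implementation).
    c = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for k in range(n):
            aik = a[i][k]
            for j in range(n):
                c[i][j] += aik * b[k][j]
    return c

def benchmark_gflops(n):
    # Time one multiply and convert to GFLOPS: an n x n matmul performs
    # 2 * n^3 floating-point operations (one multiply + one add per step).
    a = [[1.0] * n for _ in range(n)]
    b = [[1.0] * n for _ in range(n)]
    start = time.perf_counter()
    matmul(a, b, n)
    elapsed = time.perf_counter() - start
    return (2.0 * n ** 3) / elapsed / 1e9

print(f"{benchmark_gflops(64):.4f} GFLOPS (pure-Python baseline)")
```

The repo's CUDA kernels apply the same `2n³ / time` formula; only the implementation under the timer differs.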

No commits in the last 6 months.

Use this if you need to choose the best matrix multiplication implementation and power configuration for your neural network or linear algebra workloads on a Jetson Orin Nano.

Not ideal if you are looking to benchmark general-purpose computing tasks or optimize for GPUs other than the Jetson Orin Nano.

edge-ai embedded-systems deep-learning-optimization performance-engineering device-benchmarking
Stale (6m) · No package · No dependents
Maintenance: 2/25
Adoption: 5/25
Maturity: 15/25
Community: 0/25


Stars: 14
Forks:
Language: Python
License: MIT
Last pushed: Oct 14, 2025
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/ml-frameworks/Cre4T3Tiv3/jetson-orin-matmul-analysis"

Open to everyone: 100 requests/day with no key required; a free key raises the limit to 1,000/day.
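The endpoint above follows a `collection/owner/repo` pattern, so the same call works for other repos by swapping path segments. A small Python helper sketching this, assuming only the URL shape shown in the curl example (the response schema is not documented here, so the actual fetch is left commented out):

```python
import json
from urllib.request import urlopen

API_BASE = "https://pt-edge.onrender.com/api/v1/quality"

def quality_url(collection, owner, repo):
    # Build the quality-API endpoint for a repo in a given collection.
    return f"{API_BASE}/{collection}/{owner}/{repo}"

url = quality_url("ml-frameworks", "Cre4T3Tiv3", "jetson-orin-matmul-analysis")
print(url)

# To fetch the JSON payload (network call; field names depend on the
# API's actual, undocumented schema):
# with urlopen(url) as resp:
#     print(json.dumps(json.load(resp), indent=2))
```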