ROCm/hipBLASLt
[DEPRECATED] Moved to ROCm/rocm-libraries repo
hipBLASLt provides high-performance general matrix-matrix multiplication (GEMM) for AMD GPUs. It takes matrices, scalars, and optional activation functions as input and returns the result of the matrix multiplication, an operation central to AI and scientific computing. It is aimed at systems engineers and deep learning practitioners who optimize low-level mathematical computations on ROCm-enabled GPUs.
Use this if you need to perform highly optimized, flexible matrix-matrix multiplications on AMD ROCm-enabled GPUs (gfx90a, gfx94x, gfx110x) and require fine-grained control over data layouts, input/compute types, and activation functions.
Not ideal if you are a high-level developer working with frameworks like PyTorch or TensorFlow, as this library operates at a much lower, hardware-specific level.
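To make the description above concrete, here is a minimal reference sketch of the kind of operation involved: a matrix product scaled by one scalar, combined with a second scaled matrix, then passed through an optional activation. It is plain Python that only illustrates the math. It does not call hipBLASLt, and the names gemm_with_epilogue and relu are hypothetical helpers chosen for this example, not part of the library's API.

```python
# Reference sketch only: plain-Python illustration of the math,
# NOT the hipBLASLt API. Function names are hypothetical.

def relu(x):
    """Rectified linear unit: negative values are clamped to zero."""
    return x if x > 0.0 else 0.0


def gemm_with_epilogue(alpha, a, b, beta, c, activation=None):
    """Return D = activation(alpha * (A @ B) + beta * C).

    Matrices are row-major lists of lists: A is (m, k), B is (k, n),
    C is (m, n). If activation is None, no activation is applied.
    """
    m, k, n = len(a), len(b), len(b[0])
    if any(len(row) != k for row in a):
        raise ValueError("inner dimensions of A and B do not match")
    if len(c) != m or any(len(row) != n for row in c):
        raise ValueError("C must have shape (m, n)")

    d = []
    for i in range(m):
        out_row = []
        for j in range(n):
            # Dot product of row i of A with column j of B.
            acc = 0.0
            for p in range(k):
                acc += a[i][p] * b[p][j]
            value = alpha * acc + beta * c[i][j]
            out_row.append(activation(value) if activation else value)
        d.append(out_row)
    return d


A = [[1.0, 2.0], [3.0, 4.0]]
B = [[1.0, 0.0], [0.0, -1.0]]
C = [[0.5, 0.5], [0.5, 0.5]]

D = gemm_with_epilogue(2.0, A, B, 1.0, C, activation=relu)
print(D)  # [[2.5, 0.0], [6.5, 0.0]]
```

A GPU library performs the same computation with hardware-specific kernels and additional choices such as data layout and input/compute types, which this sketch deliberately leaves out.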
Stars: 113
Forks: 147
Language: Assembly
License: MIT
Category:
Last pushed: Mar 13, 2026
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/ml-frameworks/ROCm/hipBLASLt"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
Related frameworks
brucefan1983/GPUMD
Graphics Processing Units Molecular Dynamics
iree-org/iree
A retargetable MLIR-based machine learning compiler and runtime toolkit.
uxlfoundation/oneDAL
oneAPI Data Analytics Library (oneDAL)
rapidsai/cuml
cuML - RAPIDS Machine Learning Library
NVIDIA/cutlass
CUDA Templates and Python DSLs for High-Performance Linear Algebra