ROCm/hipBLASLt

[DEPRECATED] Moved to ROCm/rocm-libraries repo

Established

This library provides high-performance general matrix-matrix multiplication (GEMM) operations for AMD GPUs. It takes matrices, scalars, and an optional activation function as input and returns the result of the matrix multiplication, an operation central to AI and scientific computing. It is designed for system engineers and deep learning practitioners who optimize low-level mathematical computations on ROCm-enabled GPUs.
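To make the inputs and outputs described above concrete, here is a minimal CPU reference sketch of a matrix multiplication with scalars and an optional activation. It assumes the common GEMM-with-epilogue form D = activation(alpha * A * B + beta * C). The names `gemm_with_activation` and `relu` are hypothetical and chosen for illustration only; this is not the hipBLASLt API and says nothing about how the library implements the operation on the GPU.

```python
def relu(x):
    """Example activation: clamp negative values to zero."""
    return max(0.0, x)


def gemm_with_activation(a, b, c, alpha=1.0, beta=0.0, activation=None):
    """Naive reference (hypothetical helper, not a hipBLASLt call).

    Computes D = activation(alpha * (A @ B) + beta * C) on lists of lists,
    where A is m x k, B is k x n, and C is m x n.
    """
    m, k = len(a), len(a[0])
    if len(b) != k:
        raise ValueError("inner dimensions of A and B must match")
    n = len(b[0])
    d = [[0.0] * n for _ in range(m)]
    for i in range(m):
        for j in range(n):
            # Dot product of row i of A with column j of B.
            acc = sum(a[i][p] * b[p][j] for p in range(k))
            value = alpha * acc + beta * c[i][j]
            d[i][j] = activation(value) if activation else value
    return d


if __name__ == "__main__":
    A = [[1.0, 2.0], [3.0, 4.0]]
    B = [[1.0, 0.0], [0.0, -1.0]]
    C = [[1.0, 1.0], [1.0, 1.0]]
    print(gemm_with_activation(A, B, C, alpha=2.0, beta=1.0, activation=relu))
```

A GPU library performs the same arithmetic with tuned kernels and configurable layouts and data types; the triple loop here only pins down what the result means.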

113 stars.

Use this if you need to perform highly optimized, flexible matrix-matrix multiplications on AMD ROCm-enabled GPUs (gfx90a, gfx94x, gfx110x) and require fine-grained control over data layouts, input/compute types, and activation functions.

Not ideal if you are a high-level developer working with frameworks like PyTorch or TensorFlow, as this library operates at a much lower, hardware-specific level.

GPU-programming deep-learning-inference high-performance-computing AI-acceleration numerical-optimization

No Package, No Dependents

Maintenance 10 / 25

Adoption 9 / 25

Maturity 16 / 25

Community 25 / 25

Stars 113

Forks 147

Language Assembly

License MIT

Related frameworks

brucefan1983/GPUMD

Graphics Processing Units Molecular Dynamics

iree-org/iree

A retargetable MLIR-based machine learning compiler and runtime toolkit.

uxlfoundation/oneDAL

oneAPI Data Analytics Library (oneDAL)

rapidsai/cuml

cuML - RAPIDS Machine Learning Library

NVIDIA/cutlass

CUDA Templates and Python DSLs for High-Performance Linear Algebra
